Page 1 of 2
Sharing violation on local copy
Posted: Thu 02 Jul 2020 11:40 am
by freddie
My computer that runs MX came to a grinding halt this morning. Load was something ridiculous like 20, and it was using a massive amount of memory. There did seem to be a large number of CumulusMX threads running too. I shut down MX cleanly (it took an age to reattach to the screen session and almost as long for the CTRL-C to register) and then rebooted. It came up cleanly and I restarted MX without issue.
Looking in the MXDiags I notice that I started to see "Sharing violation" messages about half an hour before I shut MX down. See attached.
I don't do any FTP/SFTP from my setup - everything is local copies from the extra files list. Only 4 files are realtime. The realtime interval is 5 seconds.
Running MX version 3.6.10 build 3086. I upgraded to this build on Saturday.
Re: Sharing violation on local copy
Posted: Thu 02 Jul 2020 1:57 pm
by mcrossley
It looks like it started going wrong before that log starts as the real time cycles are already overlapping - 225 and 226 are both running together. And then is just gets worse. Each real time cycle takes longer than the interval, so they overlap more and more. Eventually two cycles try and access the same file at the same time and you get a sharing violation.
At 2020-06-30 06:56:54 for instance, eight real time cycles are running concurrently.
I coded is so that concurrent real time cycles would not FTP at the same time as I assumed that would be the element that took a long time. But it appears your file copies are taking a long time. Are you copying to a network destination? I wouldn't expect local copies to take any time at all.
Re: Sharing violation on local copy
Posted: Thu 02 Jul 2020 3:07 pm
by freddie
Ah okay, I didn't realise the problem had started before that point - I just found the first occurrence of the word "violation" in the log and posted the log from a few minutes before that point. Let me know if including an earlier part of the log would be beneficial. The original was massive (200 MB) so needed to truncate it a bit.
The file copy is completely local - no network copy, all on the same hard disk even.
Re: Sharing violation on local copy
Posted: Thu 02 Jul 2020 3:56 pm
by freddie
It looks as if the cycle can complete in milliseconds, as you would anticipate. Here's the first one:
Code: Select all
2020-06-26 11:08:43.388 Realtime[0]: Start cycle
2020-06-26 11:08:43.400 Realtime[0]: Creating realtime.txt
2020-06-26 11:08:43.403 Realtime[0]: Processing extra file[0] - /home/install/CumulusMX/web/cumulusT.xml
2020-06-26 11:08:43.528 Data 3: 4C-4F-4F-EC-00-33-08-FD-73-16-03-38-FF-02-00-00-E5-00-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-49-FF-FF-FF-FF-FF-FF-FF-00-00-FF-FF-7F-03-00-14-6D-03-00-04-00-04-00-00-00-00-00-00-00-FF-FF-FF-FF-FF-FF-FF-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-20-03-03-C0-5E-01-F6-07-0A-0D-15-AC
2020-06-26 11:08:43.611 Realtime[0]: Copying extra file[0] - /home/install/CumulusMX/web/cumulusT.xmltmp
2020-06-26 11:08:43.611 Realtime[0]: Processing extra file[1] - /home/install/CumulusMX/web/realtimeT.xml
2020-06-26 11:08:43.933 Realtime[0]: Copying extra file[1] - /home/install/CumulusMX/web/realtimeT.xmltmp
2020-06-26 11:08:43.933 Realtime[0]: Processing extra file[2] - /home/install/CumulusMX/web/realtimegaugesT.txt
2020-06-26 11:08:43.936 Realtime[0]: Copying extra file[2] - /home/install/CumulusMX/web/realtimegaugesT.txttmp
2020-06-26 11:08:43.937 Realtime[0]: Processing extra file[3] - /home/install/CumulusMX/realtime.txt
2020-06-26 11:08:43.937 Realtime[0]: Copying extra file[3] - /home/install/CumulusMX/realtime.txttmp
2020-06-26 11:08:43.937 Realtime[0]: End cycle
The time to process extra file index 1 (realtimeT.xml) seems to be much higher than the other files - 300 milliseconds compared to 1 millisecond. This difference seems to be fairly consistent over time. That is, until cycle 183:
Code: Select all
2020-06-30 06:46:55.870 Realtime[183]: Start cycle
2020-06-30 06:46:55.874 Realtime[183]: Creating realtime.txt
2020-06-30 06:46:55.945 Realtime[183]: Processing extra file[0] - /home/install/CumulusMX/web/cumulusT.xml
2020-06-30 06:46:56.082 Realtime[183]: Copying extra file[0] - /home/install/CumulusMX/web/cumulusT.xmltmp
2020-06-30 06:46:56.087 Realtime[183]: Processing extra file[1] - /home/install/CumulusMX/web/realtimeT.xml
2020-06-30 06:46:56.983 Data 3: 4C-4F-4F-00-00-AD-09-EA-73-70-02-3E-19-02-00-00-C6-00-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-57-FF-FF-FF-FF-FF-FF-FF-00-00-FF-FF-7F-00-00-FF-FF-00-00-5B-00-5B-00-00-00-00-00-00-00-FF-FF-FF-FF-FF-FF-FF-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-22-03-03-C0-60-01-F5-07-0A-0D-D8-F5
2020-06-30 06:46:59.022 Data 4: 4C-4F-4F-00-00-AD-09-EA-73-70-02-3E-19-02-00-00-C6-00-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-57-FF-FF-FF-FF-FF-FF-FF-00-00-FF-FF-7F-00-00-FF-FF-00-00-5B-00-5B-00-00-00-00-00-00-00-FF-FF-FF-FF-FF-FF-FF-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-22-03-03-C0-60-01-F5-07-0A-0D-D8-F5
2020-06-30 06:47:00.940 Realtime[184]: Start cycle
2020-06-30 06:47:00.991 Realtime[184]: Creating realtime.txt
2020-06-30 06:47:01.023 Data 5: 4C-4F-4F-00-00-AE-09-EA-73-70-02-3E-19-02-00-00-D9-00-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-57-FF-FF-FF-FF-FF-FF-FF-00-00-FF-FF-7F-00-00-FF-FF-00-00-5B-00-5B-00-00-00-00-00-00-00-FF-FF-FF-FF-FF-FF-FF-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-22-03-03-C0-60-01-F5-07-0A-0D-BD-16
2020-06-30 06:47:01.077 Writing log entry for 30/06/2020 06:47:00
2020-06-30 06:47:01.077 max gust: 0.0
2020-06-30 06:47:01.159 Written log entry for 30/06/2020 06:47:00
2020-06-30 06:47:01.159 Realtime[184]: Processing extra file[0] - /home/install/CumulusMX/web/cumulusT.xml
2020-06-30 06:47:01.224 Writing today.ini, LastUpdateTime = 30/06/2020 06:47:00 raindaystart = 18.2 rain counter = 18.2
2020-06-30 06:47:01.312 Realtime[184]: Copying extra file[0] - /home/install/CumulusMX/web/cumulusT.xmltmp
2020-06-30 06:47:01.358 Realtime[184]: Processing extra file[1] - /home/install/CumulusMX/web/realtimeT.xml
2020-06-30 06:47:01.461 Realtime[183]: Copying extra file[1] - /home/install/CumulusMX/web/realtimeT.xmltmp
2020-06-30 06:47:01.487 Realtime[183]: Processing extra file[2] - /home/install/CumulusMX/web/realtimegaugesT.txt
2020-06-30 06:47:01.502 Realtime[184]: Copying extra file[1] - /home/install/CumulusMX/web/realtimeT.xmltmp
2020-06-30 06:47:01.507 Realtime[184]: Processing extra file[2] - /home/install/CumulusMX/web/realtimegaugesT.txt
2020-06-30 06:47:01.516 Realtime[183]: Copying extra file[2] - /home/install/CumulusMX/web/realtimegaugesT.txttmp
2020-06-30 06:47:01.517 Realtime[183]: Processing extra file[3] - /home/install/CumulusMX/realtime.txt
2020-06-30 06:47:01.524 Realtime[184]: Copying extra file[2] - /home/install/CumulusMX/web/realtimegaugesT.txttmp
2020-06-30 06:47:01.527 Realtime[183]: Copying extra file[3] - /home/install/CumulusMX/realtime.txttmp
2020-06-30 06:47:01.527 Realtime[184]: Processing extra file[3] - /home/install/CumulusMX/realtime.txt
2020-06-30 06:47:01.536 Realtime[184]: Copying extra file[3] - /home/install/CumulusMX/realtime.txttmp
2020-06-30 06:47:01.536 Realtime[184]: End cycle
2020-06-30 06:47:01.528 Realtime[183]: End cycle
which is the first instance of overlap. It looks as if processing realtimeT.xml took just under 5.5 seconds on that cycle! On the next "standalone" cycle (185) it took 800 milliseconds - still quite slow compared to the other files, but much quicker than 5.5 seconds. By cycle 187 it is down to 500 milliseconds.
Then something odd happens to the cycles. 189 started at 06:47:25, so you would anticipate 190 starting at 06:47:30 - it was a bit later than this at 06:47:38. Then cycle 191 started 06:47:38 as well - some 7 milliseconds later. 192 started as expected at :42, 193 at :47. 194 was late at :57. 195 much later at 06:51:17 - at which time most of the other overlapping cycles had ended (just 193 outstanding which finished later at 06:51:43).
The choke point does appear to be the processing of realtimeT.xml - which takes much longer than the other files in general, but at times takes a substantially long time (4-5 seconds).
Re: Sharing violation on local copy
Posted: Thu 02 Jul 2020 4:51 pm
by mcrossley
Interesting, could you post the template for that file, I try adding to my Dev env and see if I can spot anything going wrong.
Re: Sharing violation on local copy
Posted: Thu 02 Jul 2020 8:03 pm
by freddie
Sure. Here it is.
Re: Sharing violation on local copy
Posted: Thu 02 Jul 2020 9:13 pm
by mcrossley
Thanks, 604 web tokens in that file - quite a few for a real time file!
I split my data, a smaller file with the common faster changing stuff (wind, rain, temp, hum, feels like etc), then a much larger file with all the rest in it (records, highs, lows, etc), which only gets updated at my 5 minute update interval.
Anyway, the web token parser was taking 100ms to process the file on my laptop, I've changed the code and it's now just less than 19ms, so that's a useful optimisation around 5x faster, and lower CPU utilisation.
Not yet found a reason why it would take multiple seconds though. The first process after starting always seems to take about 2 seconds, then every one after that was around 100ms.
Re: Sharing violation on local copy
Posted: Thu 02 Jul 2020 9:23 pm
by freddie
Yes, the file definitely needs some of the tags taken out - something I've been meaning to do for a long time. However, I have used it for years without a problem - something like 7 years - so I thought I should mention it as it seems to give the latest build a problem.
Thanks for taking a look and for optimising the code - I look forward to the next build. In the meantime I will keep a close eye to try and catch the current release should it run into problems again.
Re: Sharing violation on local copy
Posted: Fri 03 Jul 2020 5:46 am
by HansR
Listening in and hearing the problem, it started to make me wonder: maybe it has something to do with not releasing memory (keeping some list assigned to variable) which inhibits the garbage collector to kick in. Then after a while I would expect it to start swapping and that in turn would definitely at some point definitely create timing difficulties as described.
At least monitoring the system would be interesting if you can not find a direct cause in the code.
Re: Sharing violation on local copy
Posted: Fri 03 Jul 2020 8:14 am
by sfws
mcrossley wrote: ↑Thu 02 Jul 2020 9:13 pm
604 web tokens in that file - quite a few for a real time file!
(By the way on terminology; why do the web tags of Cumulus 1 become web tokens in MX?).
freddie wrote: ↑Thu 02 Jul 2020 9:23 pm
Yes, the file definitely needs some of the tags taken out - something I've been meaning to do for a long time. However, I have used it for years without a problem - something like 7 years
It is a reflection of
how much Cumulus MX does as it currently produces nine and a half million tokens that are mostly derived from between half a dozen and a dozen basic measurements put out by a weather station! (717 discrete web tokens and there are more than 605 thousand different combinations that could be specified for each recent history token, adding over 9 million tokens, similarly each monthly all time token takes 13 input parameters adding another 702 combinations).
From a quick scan, around 350 of those are likely to be updated so frequently that they should be uploaded at a real-time interval.
The Cumulus web tags to PHP script that I use (and post on Wiki) intended for
real time uploading frequency has 377 variables being uploaded (it uses some parameters that imply some basic web tags are appearing more than once).
Oh and Freddie if you do revise your XML, Mark had added several new web tokens that were not available 7 years ago! So Alcaston Weather could start reporting feels like for example!
Re: Sharing violation on local copy
Posted: Fri 03 Jul 2020 2:07 pm
by mcrossley
freddie wrote: ↑Thu 02 Jul 2020 9:23 pm
Yes, the file definitely needs some of the tags taken out - something I've been meaning to do for a long time. However, I have used it for years without a problem - something like 7 years - so I thought I should mention it as it seems to give the latest build a problem.
I have made a change recently. Originally MX just tracked if a real time update was in progress, if it was it didn't run again until it completed. So you would not have seen the problem.
The change I made was that it tracked the real time FTP. If that was still in progress it skipped it, BUT it did now allow the realtime.txt file to be updated and the local file copies. As your copy is sometimes taking a long time, things are going wrong. I'll add another check for file copy in progress too!
sfws wrote: ↑Fri 03 Jul 2020 8:14 am
(By the way on terminology; why do the web tags of Cumulus 1 become web tokens in MX?).
No, they are still web tags, I called them tokens because I had been looking at the token parser code which finds the web tags in the files.
Re: Sharing violation on local copy
Posted: Mon 06 Jul 2020 6:14 pm
by freddie
It happened again this morning. Diags file attached. Oddly it was the same time of day as the previous occurrence. I can't think that anything special is happening on the host at that time. Log rotation occurs at 06:25, around 10 minutes before the problem starts. I don't think it is linked as log rotation is a daily event, and this problem isn't daily (thank goodness).
I will definitely look at the template file and do some pruning!
Re: Sharing violation on local copy
Posted: Mon 06 Jul 2020 8:06 pm
by mcrossley
Same as before isn't it. It is taking longer than the real time interval to process and copy the files so multiple cycles start running over each other until they try and write to the same file at the same time. It is non-fatal, but the next release will prevent it happening by skipping the file updates/copies for a cycle if they are still in progress (roughly how it used to work, but now with more granularity).
Re: Sharing violation on local copy
Posted: Mon 06 Jul 2020 8:27 pm
by freddie
Yes, it's exactly as before. Looking forward to the release

Re: Sharing violation on local copy
Posted: Sat 11 Jul 2020 8:46 am
by freddie
Revisiting this: the new build has stopped the interweaving of real-time updates, which is great thank you.
The root cause of the very slow processing of the realtime template actually turned out to be totally unrelated to Cumulus. I have another container running on the same host - it runs Pihole ad--blocking software. It was this that was getting it's knickers in a knot and causing resource contention on the host. I'm trying to work out exactly what triggers this - but MX certainly isn't guilty
