
CMX crashes and web uploading


billy
Posts: 260
Joined: Mon 30 Nov 2015 10:54 am
Weather Station: WLL / Davis VP2+
Operating System: RPi-4 bookworm
Location: Gooseberry Hill, Western Australia

CMX crashes and web uploading

Post by billy »

Just reporting that my CMX has crashed three times in the last 3 weeks. All seem related to uploads to the web server, so maybe this is the same issue as in viewtopic.php?p=172284#p172284 but I'm not sure of this, hence this separate thread.

On the first two occasions … 13 and 29 June, see the first two diags files … the crashes were associated with uploads in the early hours of the morning. It may be of interest that typically around 00:30, on many/most mornings, a MySQL upload fails and an alarm is triggered, but usually the data do end up being uploaded. I have assumed/guessed this “routine” problem was triggered by a web server issue that was, maybe, not handled smoothly by CMX. Note, though, that the 29 June crash seems to have started with a PHP upload failure.

The third “crash”, again in the early hours of the morning ... this morning 04 July, see third diag file ... was a little different but was again associated with web uploading. The timeline seems to be this:

00:31:32 MySQL uploading error, i.e. the “regular” occurrence that “seems” benign. There are other uploading errors in the early hours of the morning, none seemingly fatal.

02:03:14 This error seems more significant. From this point on, uploads are unsuccessful - all of them.

04:09:07 Something more sinister happens here and it is the last entry in the diag file for about 2 hours until:

06:11:08 Exiting system due to external SIGTERM signal ... that's me attempting a restart. This didn't "seem" to work so I rebooted the rpi. I was in a hurry, so maybe didn't give it enough time.

I noticed that during this two hour gap, the monthly file continues to be updated. But the “RecentData” table in cumulusmx.db goes from an entry every minute to one every 5 minutes. (I guess the RecentData table is where CMX gets its one minute data for the recent graphs?)
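
A silent stretch like the one described above can be located mechanically rather than by eye. The following is a minimal sketch, assuming each diag entry starts with a `YYYY-MM-DD HH:MM:SS` timestamp as in the excerpts quoted in this thread. The function name `find_gaps`, the 30-minute threshold and the sample message text are illustrative assumptions, not part of CMX.

```python
from datetime import datetime, timedelta

def find_gaps(lines, threshold=timedelta(minutes=30)):
    """Return (previous, current, gap) for consecutive timestamped
    lines that are further apart than threshold."""
    gaps = []
    previous = None
    for line in lines:
        try:
            # First 19 characters: "YYYY-MM-DD HH:MM:SS"; milliseconds are ignored
            stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # continuation lines (exception details) carry no timestamp
        if previous is not None and stamp - previous > threshold:
            gaps.append((previous, stamp, stamp - previous))
        previous = stamp
    return gaps

# Illustrative lines only: the times come from the timeline above,
# the message text is a placeholder, not real diag output
sample = [
    "2023-07-04 04:09:07.000 (placeholder: last entry before the silence)",
    "Message: continuation line without a timestamp",
    "2023-07-04 06:11:08.000 (placeholder: SIGTERM exit)",
]
gaps = find_gaps(sample)
for start, end, gap in gaps:
    print(f"No entries between {start} and {end} ({gap})")
```

To run it against a real file, pass the open file object instead of `sample`; the path of the diag file depends on the installation.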

I've only picked the eyes out of the diag files, so maybe those with a better understanding will see significance in things I haven't paid attention to. Anyhow I would be grateful for insights. From my experience crashes are rare so I don't have much experience in reading these tea leaves.

Edit: and I forgot to add that WLL stopped reporting during some of the time this morning
Last edited by billy on Tue 04 Jul 2023 10:53 am, edited 1 time in total.
HansR
Posts: 6926
Joined: Sat 20 Oct 2012 6:53 am
Weather Station: GW1100 (WS80/WH40)
Operating System: Raspberry OS/Bookworm
Location: Wagenborgen (NL)

Re: CMX crashes and web uploading

Post by HansR »

A short remark: it seems to be a network error with some duration and it seems to be once a week. You might ask your provider if that can be confirmed.

I had an outage of 2.5 hrs during the night at some point (only once since I started using PHP upload), which was definitely related to a modem and network software update on the provider side. Apparently the PHP upload is more sensitive to that than the FTP subsystem. CMX and my CUtils failed every time they tried to connect, but CMX continued and kept storing data (I don't use MySQL, so no true crash occurred). Then when the network came back everything was restored in turn and the whole system recovered within 5 minutes.

1) My guess from what I read in your first logfile is that the CMX error handling and recovery for the MySQL part needs some additions/improvements.
2) I don't see an error in the 4/7/23 logfile => wrong logfile (that of the restart after the error?)
Hans

https://meteo-wagenborgen.nl
CMX build 4070+ ● RPi 4B ● Linux 6.6.62+rpt-rpi-v8 aarch64 (bookworm) ● dotnet 8.0.1
BlueSky: https://bsky.app/profile/wagenborgenwx.bsky.social

Re: CMX crashes and web uploading

Post by billy »

Hans,
Thanks for pointing out the incorrect upload :groan: The original post now has the correct one. I'd be interested to know your view of what that might tell us.
The problem is not regular enough to be classified as "weekly".
There were some more-than-usual network issues last night - as evidenced by the WLL issues I alluded to, although that may have been a WL issue.
I will go back to the web service provider, but will wait for Mark's view of this before taking that step.

Re: CMX crashes and web uploading

Post by HansR »

billy wrote: Tue 04 Jul 2023 11:09 am Thanks for pointing out the incorrect upload :groan: The original post now has the correct one. I'd be interested to know your view of what that might tell us.
Well, I am far from an expert on WLL but my first reaction is that the errors start at:

Code: Select all

2023-07-03 10:09:26 No broadcast data received from the WLL for 30 seconds
where it has its first real error after multiple "WLL: Missed a WLL broadcast message" messages. When it misses again 30 seconds later it sends an alarm mail to the user.

This indicates a network error: a failure of either the external or internal network, or of the WLL itself. I have no idea which.
Then it recovers and continues until:

Code: Select all

2023-07-03 14:24:54.067 PHP[225]: Error uploading to realtimegauges.txt - Exception Type: System.Net.Http.HttpRequestException
Message: An error occurred while sending the request.
Inner Exception... 
Exception Type: System.IO.IOException
Message: Unable to read data from the transport connection: Connection reset by peer.
Inner Exception... 
Exception Type: System.Net.Sockets.SocketException
Message: Connection reset by peer
where it fails on the sockets to the webserver.

It recovers again with failing WLL broadcasts.

The WLL broadcasts fail again at: 2023-07-03 19:14:57 & 2023-07-03 20:05:27

Then at :

Code: Select all

2023-07-04 00:29:41.027 PHP[84]: Upload to realtimegauges.txt: Response text follows:
Error: TimeStamp is out of date
Data TS   = 1688401764
Server TS = 1688401780
meaning either network or the processing on server side is too slow. It recovers.
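
To put numbers on that response: the two values are Unix timestamps, so the lag between them can be computed directly. This is a small sketch, assuming the station's local time is UTC+8 (Western Australia). How much lag the upload script tolerates is not stated in this thread.

```python
from datetime import datetime, timedelta, timezone

data_ts = 1688401764    # "Data TS" from the response above
server_ts = 1688401780  # "Server TS" from the response above

# Assumption: station local time is UTC+8 (Western Australia)
local = timezone(timedelta(hours=8))

lag_seconds = server_ts - data_ts
data_time = datetime.fromtimestamp(data_ts, tz=local)
server_time = datetime.fromtimestamp(server_ts, tz=local)

print(f"Data stamped  {data_time:%Y-%m-%d %H:%M:%S}")
print(f"Server clock  {server_time:%Y-%m-%d %H:%M:%S}")
print(f"Lag           {lag_seconds} s")
```

The data were stamped at 00:29:24 local time and reached the server at 00:29:40, a lag of 16 seconds, which fits the log entry written at 00:29:41.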

Then at:

Code: Select all

2023-07-04 00:31:32.727 CustomSqlMins[0]: Error encountered during MySQL operation = One or more errors occurred. (Unable to read data from the transport connection: Operation on non-blocking socket would block.)
from which it seems to recover until :

Code: Select all

2023-07-04 02:03:14.859 PHP[202]: Error uploading to realtimegauges.txt - Exception Type: System.Threading.Tasks.TaskCanceledException
Message: The operation was canceled.
Inner Exception... 
Exception Type: System.ObjectDisposedException
Message: Cannot access a disposed object.
Object name: 'MobileAuthenticatedStream'.
after which the network seems to fail continuously, with No route to Host as the main reason, which suggests it cannot reach the DNS server. This eventually turns into network unreachable. In this error state it appears to recover, but it does not really, and it keeps hanging in

Code: Select all

2023-07-04 02:16:54.902 Realtime[2]: Warning, a previous cycle is still processing local files. Skipping this interval.
etc...
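
The "previous cycle is still processing" warning is what a non-overlapping timer guard typically produces. The sketch below shows the general pattern only and is an assumption about the mechanism, not CMX's actual code. All names (`cycle_lock`, `realtime_cycle`, `stalled_upload`) are hypothetical.

```python
import threading
import time

cycle_lock = threading.Lock()
skipped = 0

def realtime_cycle(work):
    """Run one cycle unless the previous one still holds the lock."""
    global skipped
    # Non-blocking acquire: returns False at once if another cycle is running
    if not cycle_lock.acquire(blocking=False):
        skipped += 1
        print("Previous cycle still running, skipping this interval")
        return False
    try:
        work()
        return True
    finally:
        cycle_lock.release()

def stalled_upload():
    time.sleep(0.5)  # stands in for an upload that hangs on a dead network

# The first cycle stalls in the background; the next timer tick arrives
# while it is still busy and is skipped
first = threading.Thread(target=realtime_cycle, args=(stalled_upload,))
first.start()
time.sleep(0.1)
second_ran = realtime_cycle(lambda: None)
first.join()
third_ran = realtime_cycle(lambda: None)  # runs normally once the lock is free
```

With a guard like this, a cycle that never returns causes every later interval to be skipped, which would match the repeated warnings. A timeout on the work inside the cycle is what frees the guard again.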

So to summarize: the network starts failing, then degrades, and CMX loses track. In the end no recovery is possible. So when using PHP upload a stricter recovery scheme would seem useful, but we have to remember that the network may be down for longer periods, outside CMX's influence. I assume the local network may keep working but, as I experienced personally, that may not be the case if the provider starts updating the modem remotely.

That is somewhat my view of what it might tell us ;) .
Hans


Re: CMX crashes and web uploading

Post by billy »

Code: Select all

2023-07-03 10:09:26 No broadcast data received from the WLL for 30 seconds
I suspect this particular instance is not of great significance as (1) it occurs often in my system - usually a couple of times a day; and (2) this particular case is about 14 hours before things went haywire. Mind you, I have assumed it is a problem with my local network and thought it might be time to replace my modem/router.

Re: CMX crashes and web uploading

Post by HansR »

billy wrote: Tue 04 Jul 2023 12:55 pm

Code: Select all

2023-07-03 10:09:26 No broadcast data received from the WLL for 30 seconds
I suspect this particular instance is not of great significance as (1) it occurs often in my system - usually a couple of times a day; and (2) this particular case is about 14 hours before things went haywire. Mind you, I have assumed it is a problem with my local network and thought it might be time to replace my modem/router.
Agreed, it is not of great significance (but it is what turns up when searching for "error" 8-) )

It seems to me that at a certain point the socket used for the PHP upload fails and should be restarted. Is there a time limit on open HTTP connections? I would not be surprised. I restart my CMX automatically every 24 hrs (because of a backup), so that might be why it is not happening on my system.
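
One way to picture "restarting" a failed socket is to discard the connection after any network error and build a new one before the next attempt. The following is a minimal sketch of that idea and not how CMX handles it. The `Connection` class, the `connect` factory and `upload_with_reconnect` are hypothetical stand-ins.

```python
import time

class Connection:
    """Hypothetical stand-in for a long-lived HTTP or MySQL connection."""
    def __init__(self, fail_times=0):
        self.fail_times = fail_times
        self.closed = False

    def send(self, payload):
        if self.closed:
            raise RuntimeError("connection already closed")
        if self.fail_times > 0:
            self.fail_times -= 1
            raise ConnectionResetError("connection reset by peer")
        return f"sent {payload}"

    def close(self):
        self.closed = True

def upload_with_reconnect(payload, connect, attempts=4, base_delay=0.01):
    """Try to send; on a network error drop the connection and build a new one."""
    conn = connect()
    for attempt in range(attempts):
        try:
            return conn.send(payload)
        except OSError:
            conn.close()  # never reuse a connection that has just failed
            if attempt == attempts - 1:
                raise  # give up; the caller can skip this interval
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff
            conn = connect()

created = []

def connect():
    # Simulation: the first two connections fail once each, the third works
    conn = Connection(fail_times=1 if len(created) < 2 else 0)
    created.append(conn)
    return conn

result = upload_with_reconnect("realtimegauges.txt", connect)
print(result, "after", len(created), "connections")
```

The bounded number of attempts matters as much as the reconnect: during a long outage the upload gives up and the next interval starts with a clean connection instead of queuing behind a stuck one.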

Anyway, enough babbling...
Hans

mcrossley
Posts: 14388
Joined: Thu 07 Jan 2010 9:44 pm
Weather Station: Davis VP2/WLL
Operating System: Bullseye Lite rPi
Location: Wilmslow, Cheshire, UK

Re: CMX crashes and web uploading

Post by mcrossley »

Odd, the CustomMySqlMins function fully encloses the MySQL call in a try, and you can see that it catches the error and reports it quite a few times, then the unhandled exception seems to occur in Mono....

Code: Select all

2023-06-13 00:22:00.496 CustomSqlMins[0]: MySQL executing - INSERT IGNORE INTO realtime (LogDateTime,temp,hum,dew,wspeed,rrate,rfall,press,intemp,inhum,wchill,wgust,heatindex,UV,SolarRad,avgbearing,apptemp,CurrentSolarMax,IsSunny,IsSunUp,FeelsLike,pm2p5,pm10,pm2p5_1hr) Values('23-06-13 00:22:00',11.7,92,10.5,0,0.2,1.0,1023.6,17.2,60,11.7,3,11.7,0.0,0,'172',11.8,0,'0','0','11.8','7.1','7.9','7.6'); DELETE FROM realtime WHERE LogDateTime < DATE_SUB(CONVERT_TZ(UTC_TIMESTAMP(),'+00:00','+08:00'), INTERVAL 7 DAY)

> a little while later that command fails, the error is caught and reported

2023-06-13 00:22:38.305 CustomSqlMins[0]: Error encountered during MySQL operation = One or more errors occurred. (Unable to read data from the transport connection: Operation on non-blocking socket would block.)

> the same command is immediately sent again - I don't understand this yet, there is no retry in the code

2023-06-13 00:22:38.305 CustomSqlMins[0]: SQL = INSERT IGNORE INTO realtime (LogDateTime,temp,hum,dew,wspeed,rrate,rfall,press,intemp,inhum,wchill,wgust,heatindex,UV,SolarRad,avgbearing,apptemp,CurrentSolarMax,IsSunny,IsSunUp,FeelsLike,pm2p5,pm10,pm2p5_1hr) Values('23-06-13 00:22:00',11.7,92,10.5,0,0.2,1.0,1023.6,17.2,60,11.7,3,11.7,0.0,0,'172',11.8,0,'0','0','11.8','7.1','7.9','7.6'); DELETE FROM realtime WHERE LogDateTime < DATE_SUB(CONVERT_TZ(UTC_TIMESTAMP(),'+00:00','+08:00'), INTERVAL 7 DAY)

> Then there are two unhandled exception errors

2023-06-13 00:22:38.306 !!! Unhandled Exception !!!
2023-06-13 00:22:38.307 !!! Unhandled Exception !!!

> And CMX prints its handling of the error from the second try - the MySQL object no longer exists

2023-06-13 00:22:38.308 CustomSqlMins[0]: Error - Object reference not set to an instance of an object.

> then the unhandled exception details at the end of the log

System.AggregateException: One or more errors occurred. (Unable to read data from the transport connection: Operation on non-blocking socket would block.) ---> System.IO.IOException: Unable to read data from the transport connection: Operation on non-blocking socket would block. ---> System.Net.Sockets.SocketException: Operation on non-blocking socket would block
...
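
For readers unfamiliar with the second half of that SQL statement: the DELETE keeps a rolling seven-day window by shifting the current UTC time to +08:00 and subtracting 7 days. The sketch below reproduces that arithmetic in Python for illustration. The function name `retention_cutoff` is hypothetical, and the "current time" is taken to be the LogDateTime of the row in the log above.

```python
from datetime import datetime, timedelta, timezone

def retention_cutoff(now_utc, offset_hours=8, keep_days=7):
    """Mirror DATE_SUB(CONVERT_TZ(UTC_TIMESTAMP(),'+00:00','+08:00'), INTERVAL 7 DAY)."""
    local_now = now_utc.astimezone(timezone(timedelta(hours=offset_hours)))
    # MySQL compares naive DATETIME values, so drop the tzinfo after converting
    return local_now.replace(tzinfo=None) - timedelta(days=keep_days)

# 2023-06-13 00:22:00 at UTC+8 is 2023-06-12 16:22:00 UTC
now_utc = datetime(2023, 6, 12, 16, 22, 0, tzinfo=timezone.utc)
cutoff = retention_cutoff(now_utc)
print(cutoff)  # rows with LogDateTime earlier than this are deleted
```

Because the INSERT and the DELETE are sent as one command, a dropped connection partway through fails both together.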

Re: CMX crashes and web uploading

Post by billy »

Thanks Hans & Mark,

Guess it's time to do a complete refresh of the rpi .... and I'll go back to the web server provider to see if they have detected anything untoward.

I now have some independent evidence that my internet connection was having difficulties during the hours prior to the third "crash", so I'm going to pretend that one didn't happen :roll: