Welcome to the Cumulus Support forum.

Latest Cumulus MX V4 release 4.4.2 (build 4085) - 12 March 2025

Latest Cumulus MX V3 release 3.28.6 (build 3283) - 21 March 2024

Legacy Cumulus 1 release 1.9.4 (build 1099) - 28 November 2014
(a patch is available for 1.9.4 build 1099 that extends the date range of drop-down menus to 2030)

Download the Software (Cumulus MX / Cumulus 1 and other related items) from the Wiki

If you are posting a new Topic about an error, or if you need help, PLEASE read this first: viewtopic.php?p=164080#p164080

Crash in 3101

From build 3044 the development baton passed to Mark Crossley, who has been responsible for all builds since. He has made the code available on GitHub. It is Mark's hope that others will join in this development, but at the very least he welcomes your ideas for future developments (see Cumulus MX Development suggestions).

Moderator: mcrossley

Post Reply
sfws
Posts: 1183
Joined: Fri 27 Jul 2012 11:29 am
Weather Station: Chas O, Maplin N96FY, N25FR
Operating System: rPi 3B+ with Buster (full)

Crash in 3101

Post by sfws »

I had been running 3094 quite happily on my unattended RPi since that build was released last September, and it proved reliable month after month. I will have restarted MX a few times, but not on any regular schedule.

On 22 January, I looked at the forum, saw that Mark had been very busy developing MX, and decided to try release build 3101, which on that day was the latest available; there was no sign of people having problems with it. I did not try to learn about the new functionality of running as a service; I used sudo mono CumulusMX.exe -debug as I had with previous installations here.

It seemed to be working okay when I checked it the next day, so I left it unattended after that. As it happened, I was beside my RPi yesterday morning and checked it was still running without any issues.

I discovered this morning that my MX installation had crashed, although I don't know when. The last entry in the MXDiags file is from yesterday afternoon, but the stack report does not include a timestamp, and might have been generated then or earlier this morning for all I know.

I attach, in the zip, the stack trace shown on the console, the then-current MXDiags file (with debug on) that simply stopped being written without recording any reason, and a new file created when that release started running. None of them tell me why unattended MX decided to crash. The only thoughts I have are:
1) that releases happen so often that perhaps I am the only one who expects MX to keep running unattended for longer than two weeks?
2) or maybe too much functionality has been rushed into MX (I see Mark is saying he wants to take a break from adding functionality)

Anyway, I have reverted to 3094, which did work reliably, and I rewound to yesterday's rollover in case any data files were corrupted by whatever happened, whenever it happened.
freddie
Posts: 2870
Joined: Wed 08 Jun 2011 11:19 am
Weather Station: Davis Vantage Pro 2 + Ecowitt
Operating System: GNU/Linux Ubuntu 24.04 LXC
Location: Alcaston, Shropshire, UK
Contact:

Re: Crash in 3101

Post by freddie »

Looks like a crash in Mono to me, rather than MX. What version of Mono are you running?

Maybe even a hardware problem?

Code:

Bus error
pi@tiny-computer:/media/pi/portable/CumulusMX $ Crash Reporter has timed out, sending SIGSEGV
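If it helps, the installed version is reported by the first line of mono --version; a sketch that pulls the version number out of a sample of that output (the version string below is invented for the demo):

```shell
# First line of `mono --version` output, with a made-up version number
line='Mono JIT compiler version 6.12.0 (tarball)'

# Extract just the dotted version number from the line
echo "$line" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+'
```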
Freddie
mcrossley
Posts: 14388
Joined: Thu 07 Jan 2010 9:44 pm
Weather Station: Davis VP2/WLL
Operating System: Bullseye Lite rPi
Location: Wilmslow, Cheshire, UK
Contact:

Re: Crash in 3101

Post by mcrossley »

It's not an obvious crash in Cumulus, it could have been in Mono. It may be worth a check in the system logs.
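A sketch of that kind of check (the log lines below are invented for illustration; on a real Pi you would search /var/log/syslog, or use journalctl on systemd journals):

```shell
# Fabricated sample syslog lines, standing in for /var/log/syslog
printf '%s\n' \
  'Feb  5 14:02:01 pi CRON[1234]: session opened for user pi' \
  'Feb  5 16:17:43 pi kernel: mono[5678]: segfault at 0 ip 00000000 sp 00000000' \
  > /tmp/syslog_sample

# Pull out anything mentioning mono or a segfault
grep -Ei 'mono|segfault' /tmp/syslog_sample
```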
HansR
Posts: 6926
Joined: Sat 20 Oct 2012 6:53 am
Weather Station: GW1100 (WS80/WH40)
Operating System: Raspberry OS/Bookworm
Location: Wagenborgen (NL)
Contact:

Re: Crash in 3101

Post by HansR »

If it is a Mono crash you will find a file containing that info in the Cumulus directory.
I don't remember the naming, but it is unmistakably a Mono crash dump file.
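For what it's worth, recent Mono versions name these dumps mono_crash.*.json and write them in the process's working directory; a sketch of searching for them, using a made-up demo directory in place of the real CumulusMX folder:

```shell
# Demo directory standing in for the CumulusMX folder (path is made up)
mkdir -p /tmp/cmx_demo
touch /tmp/cmx_demo/mono_crash.1612612345.0.json   # fake dump for the demo

# A wildcard search finds any Mono crash dumps in that directory
find /tmp/cmx_demo -name 'mono_crash.*.json'
```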
Hans

https://meteo-wagenborgen.nl
CMX build 4070+ ● RPi 4B ● Linux 6.6.62+rpt-rpi-v8 aarch64 (bookworm) ● dotnet 8.0.1
BlueSky: https://bsky.app/profile/wagenborgenwx.bsky.social
sfws
Posts: 1183
Joined: Fri 27 Jul 2012 11:29 am
Weather Station: Chas O, Maplin N96FY, N25FR
Operating System: rPi 3B+ with Buster (full)

Re: Crash in 3101

Post by sfws »

I was out in today's mist still weeding the garden, so I did not read your replies until this evening.
In brief, following your advice I upgraded Mono; I also upgraded to build 3107, and the new combination is so far running problem free.
mcrossley wrote: Sat 06 Feb 2021 11:41 am It's not an obvious crash in Cumulus, it could have been in Mono. It may be worth a check in the system logs.
freddie wrote: Sat 06 Feb 2021 11:32 am Looks like a crash in Mono to me, rather than MX. What version of Mono are you running?
HansR wrote: Sat 06 Feb 2021 11:58 am If it is a mono crash you will find a file containing that info in the cumulus directory.
I don't remember the naming but it is unmistakably a mono crash dump info file.
@HansR: I looked in the CumulusMX directory, but did not see any file not derived from the release. Maybe I should search elsewhere for a Mono crash dump file?

@Freddie: I just stopped 3094 MX, then tried an upgrade to the latest Mono; it updated many components, so my RPi had indeed been running an older version. In the 12 hours since I reverted to the old 3094 release this morning, it and the old Mono have worked happily together, just as they did from September to January. But now that I have got Mono definitely up to date, I will install up-to-date MX too before I restart.
The older MX release has never suggested any hardware issues, and manually accessing my hardware does not report any problems. I am too tired now for any further investigation.

@mcrossley: The various logs in /var/log did have extra entries from yesterday afternoon, and these continued last night. The system log worried me in the little time I spent looking at it. Many of the lines after that sudden change were about new users joining (each assigned an incrementing number), without explaining why. There were also a few references to devices by numbers that seemed to be incrementing. Could Mono have been creating a new user each time MX did some access, or could an intermittent hardware issue make each access get treated as fresh, or was someone breaking into my LAN while I slept last night? Again, I can try investigating further, including looking at older logs, when I am not yawning.
freddie
Posts: 2870
Joined: Wed 08 Jun 2011 11:19 am
Weather Station: Davis Vantage Pro 2 + Ecowitt
Operating System: GNU/Linux Ubuntu 24.04 LXC
Location: Alcaston, Shropshire, UK
Contact:

Re: Crash in 3101

Post by freddie »

@sfws if you like, you could PM me the appropriate part of your syslog and I will take a look (it's my day job).
Freddie
HansR
Posts: 6926
Joined: Sat 20 Oct 2012 6:53 am
Weather Station: GW1100 (WS80/WH40)
Operating System: Raspberry OS/Bookworm
Location: Wagenborgen (NL)
Contact:

Re: Crash in 3101

Post by HansR »

@sfws: no, I don't think so, as CMX is running in that directory. I never found them elsewhere.
Hans

https://meteo-wagenborgen.nl
CMX build 4070+ ● RPi 4B ● Linux 6.6.62+rpt-rpi-v8 aarch64 (bookworm) ● dotnet 8.0.1
BlueSky: https://bsky.app/profile/wagenborgenwx.bsky.social
mcrossley
Posts: 14388
Joined: Thu 07 Jan 2010 9:44 pm
Weather Station: Davis VP2/WLL
Operating System: Bullseye Lite rPi
Location: Wilmslow, Cheshire, UK
Contact:

Re: Crash in 3101

Post by mcrossley »

Are the "new" users just related to you viewing the admin pages? Each page load is logged as a new client connection, as the static content does not maintain persistent connections.
sfws
Posts: 1183
Joined: Fri 27 Jul 2012 11:29 am
Weather Station: Chas O, Maplin N96FY, N25FR
Operating System: rPi 3B+ with Buster (full)

Re: Crash in 3101

Post by sfws »

Just spotted that the MXDiags file reports a rainfall-rate field with a format error in the dayfile.txt line for 23rd March 2017. I was running Cumulus 1 (two homes ago) when that line was created. It is a pity the admin interface data-log editing page for dayfile does not let you pick which lines to show, as the line needing correction was approximately half-way between start and end (on p216 of 368 pages), and getting to that page is therefore over a hundred clicks.
(Incidentally, the format would have been acceptable in Cumulus 1, but I don't believe it was stored like that; I suspect subsequent corruption has made it unacceptable to MX.)
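As an aside, a dayfile.txt line with a known date can be located from the command line without paging through the editor; a sketch using an invented three-line sample (real dayfile lines have many more fields):

```shell
# Tiny made-up dayfile.txt sample; real lines have many more fields
printf '%s\n' \
  '22/03/17,9.8,14:05,2.1,06:55,0.0' \
  '23/03/17,10.2,13:40,###,07:10,0.0' \
  '24/03/17,8.9,15:22,1.7,06:48,0.0' \
  > /tmp/dayfile_sample.txt

# Print the line (with its line number) for the date in question
grep -n '^23/03/17' /tmp/dayfile_sample.txt
```

The leading line number tells you exactly where to look if you then open the file in a text editor.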

I now realise that reading the whole dayfile.txt is something that has changed between the old release I was running and newer releases. I take it MX now stores a duplicate of the whole log file somewhere in memory; that is a lot of extra i/o operations when, like me, you have over a decade and a quarter stored in that file. I also see there are a lot more .json files in /web, and although I don't think I'm uploading them, just generating them represents a huge increase in i/o actions. Perhaps I will go back to a simpler MX release, abandoning this bloatware. One of the strengths of the original Cumulus software was its small uploads: https://cumuluswiki.org/a/FAQ#What_is_t ... _update.3F

freddie wrote: Sat 06 Feb 2021 11:32 am Maybe even a hardware problem?
mcrossley wrote: Sun 07 Feb 2021 3:10 pm Are the "new" just related to you viewing the admin pages
Yes Mark, I wanted to check the overnight low temperature before restarting my gardening, so I did use the admin interface on my mobile phone that morning, and I did navigate between pages. It was seeing admin interface pages with blanks instead of figures that made me discover the crash. Thinking about it, although the MXDiags file ended the previous afternoon, MX was still partly working on the RPi, serving the admin pages to my mobile that morning, but the api was not populating the pages with figures, suggesting an i/o failure (ta Niall) potentially caused the crash. Anyway, I'm glad it is not an outsider hacking in!
water01
Posts: 3670
Joined: Sat 13 Aug 2011 9:33 am
Weather Station: Ecowitt HP2551
Operating System: Windows 10/11 64bit Synology NAS
Location: Burnham-on-Sea
Contact:

Re: Crash in 3101

Post by water01 »

You can now choose which .json files to upload for the graphs.
David
sfws
Posts: 1183
Joined: Fri 27 Jul 2012 11:29 am
Weather Station: Chas O, Maplin N96FY, N25FR
Operating System: rPi 3B+ with Buster (full)

Re: Crash in 3101

Post by sfws »

sfws wrote: Sun 07 Feb 2021 10:41 pm there are a lot more .json files in /web and although I don't think I'm uploading them,
water01 wrote: Sun 07 Feb 2021 11:11 pm You can now choose which .json files to upload for the graphs.
Read the last paragraph of the release announcement again, and then read the quote from my post. Put those together, and your response is not relevant.

The fact that more .json files are always created locally is the cause of the considerable increase in i/o operations and related hardware wear I described.
mcrossley
Posts: 14388
Joined: Thu 07 Jan 2010 9:44 pm
Weather Station: Davis VP2/WLL
Operating System: Bullseye Lite rPi
Location: Wilmslow, Cheshire, UK
Contact:

Re: Crash in 3101

Post by mcrossley »

sfws wrote: Mon 08 Feb 2021 7:30 am
sfws wrote: Sun 07 Feb 2021 10:41 pm there are a lot more .json files in /web and although I don't think I'm uploading them,
water01 wrote: Sun 07 Feb 2021 11:11 pm You can now choose which .json files to upload for the graphs.
Read the last paragraph of the release announcement again, and then read the quote from my post. Put those together, and your response is not relevant.

The fact that more .JSON files are always created locally, is the cause of the considerable increase in i/o operations and related h/w wear I described.
Yeah, that is a temporary situation to get around the issue of people wanting the files but not wanting to FTP them. Long-needed, much finer-grained control of all file output and transfer is on the way...
Post Reply