Weird issue after long run of CMX

From build 3044 the development baton passed to Mark Crossley. Mark has been responsible for all the Builds since. He has made the code available on GitHub. It is Mark's hope that others will join in this development, but at the very least he welcomes your ideas for future developments (see Cumulus MX Development suggestions).

Moderator: mcrossley

HansR
Posts: 5957
Joined: Sat 20 Oct 2012 6:53 am
Weather Station: GW1100 (WS80/WH40)
Operating System: Raspberry OS/Bookworm
Location: Wagenborgen (NL)

Post by HansR »

Something I have noticed before, and which Steinar is now seeing as well, is that CMX seems to become unstable after running for a long time on the RPi. Specifically, when the system has been up for more than 3 weeks, I get an uptime of -49. It happens when I process the file through CMX, and I get the same value using the API.

When using the API, the logfile shows that the code (my code in cutils) goes really haywire, meaning it ends up where it should not go.

It may be Mono related, but it happens with both Mono version 5 and 6.
A reboot has always solved the issue for me, but now it is a bit different, as Steinar can't reboot because he is far away from the RPi.

It may be in any of several subsystems. I point to Mono (and the C# libraries) first, but it could be anywhere. The fact that it happens when CMX translates webtags, and that the same values appear when asking through the API, worries me a lot. I am focusing initially on the CMX uptime, which becomes -49 when the problem occurs.

Any suggestions?
How to progress?
Hans

https://meteo-wagenborgen.nl
CMX build 4017+ ● RPi 3B+ ● Raspbian Linux 6.1.21-v7+ armv7l ● dotnet 8.0.3
mcrossley
Posts: 12756
Joined: Thu 07 Jan 2010 9:44 pm
Weather Station: Davis VP2/WLL
Operating System: Bullseye Lite rPi
Location: Wilmslow, Cheshire, UK

Post by mcrossley »

Looks like a bug in Mono, probably a counter wrapping: using a signed int instead of an unsigned long as the millisecond counter? A signed 32-bit int will wrap to negative values after about 24 days of counting milliseconds.

One of my test systems: {"ProgramUpTime":"-132 days -23 hours"}
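
To make the 24-day figure concrete, here is a small Python sketch of the arithmetic. It is not Mono's code: the signed 32-bit millisecond counter is only the hypothesis from this post, and `to_int32` is a helper name made up for the illustration.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def to_int32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value >= 0x8000_0000 else value


# Largest value a signed 32-bit counter can hold, expressed in days
# (a little under 25 days).
days_until_wrap = (2**31 - 1) / MS_PER_DAY

# One millisecond later the stored value flips to the most negative int.
wrapped_ms = to_int32(2**31)
wrapped_days = wrapped_ms / MS_PER_DAY

print(f"wraps after {days_until_wrap:.2f} days, then reads {wrapped_days:.2f} days")
```

Under this assumption the reported uptime turns negative just after the 24-day mark and then counts back up towards zero.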

Post by mcrossley »

I guess as a workaround I could add a new tag <#ProgramUpTimeMs> that would return the uptime in milliseconds. You could then do some two's complement arithmetic on that to get the real value?

Post by mcrossley »

Or I could do it for you in the tag code

If mono *is* using an int value, that of course would still only get you up to 49 days before it wrapped back to zero again. :(
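
As an illustration of the two's complement arithmetic mentioned above, a minimal Python sketch follows. It assumes the raw value really is a signed 32-bit millisecond count (unconfirmed in this thread); `unwrap_uptime_ms` is a hypothetical helper, and the raw value would come from something like the proposed <#ProgramUpTimeMs> tag.

```python
from datetime import timedelta

UINT32_RANGE = 2**32  # number of distinct values in a 32-bit counter


def unwrap_uptime_ms(raw_ms: int) -> int:
    """Reinterpret a signed 32-bit millisecond count as unsigned.

    Negative inputs are values that wrapped past 2**31 - 1; adding 2**32
    (which is what the modulo does) recovers the real count. This is only
    valid while the real uptime is below 2**32 ms, about 49.7 days.
    """
    return raw_ms % UINT32_RANGE


# 25 days of uptime no longer fits in a signed 32-bit int and reads negative.
real_ms = 25 * 24 * 60 * 60 * 1000
raw_ms = real_ms - UINT32_RANGE  # what a wrapped signed counter would report

recovered = timedelta(milliseconds=unwrap_uptime_ms(raw_ms))
print(raw_ms, "->", recovered)
```

After about 49.7 days the 32-bit value itself starts again from zero, so this trick cannot distinguish 1 day of uptime from 50.7 days.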

Post by HansR »

Yes, no doubt we could find a trick to get around the uptime problem. But there is more to it, because it seems none of my API calls work anymore. The logs are inconsistent.
I'll try something else with Steinar and will then come back to this.
Hans

Post by HansR »

OK, I got confirmation that it really looks like a problem local to the uptime of Cumulus only (as for the system uptime, the webtag method does not work at all anymore, so I determine that myself).

However, I also looked into it a bit myself and I have another solution in CMX which works for everybody without having to rework the result of the webtag (meaning the uptime webtag can be used directly in a webpage). I'll send the suggestion by PM.
Hans

sutne
Posts: 377
Joined: Sun 14 Oct 2012 4:23 pm
Weather Station: HP2553 (WS80) and HP2564 (WS90)
Operating System: Raspbian Bullseye and Bookworm
Location: Rjoanddalen and Kronstad, Norway

Post by sutne »

I have upgraded Raspbian and then rebooted, so the CumulusMX program uptime is back to 0.

What I do not understand is why the program uptime counter is not reset when I stop and start CumulusMX.

Post by HansR »

Hi Steinar, I don't know how much you know about coding, but that does not happen because of the way Cumulus asks the system for the uptime. It does not save the start time itself; instead it asks for the process start time with a system call:

TimeSpan ts = DateTime.Now - Process.GetCurrentProcess().StartTime;

And that is where control is lost and where the error is. It is probably Mono, maybe Linux, possibly some counter with a signed/unsigned problem. We don't know for sure. Anyway, restarting CMX apparently does not solve this. Looking at it, I don't understand it either, because I would expect the timer to restart with a new process. Apparently something in that call does not work. We have no influence on this and can only calculate the time difference in CMX ourselves, as a workaround for the current method.
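
The idea of calculating the time difference inside the program can be sketched as below. This is only an illustration in Python, not CMX code and not necessarily the workaround that ends up in a release; `UptimeTracker` and its `clock` parameter are names invented for this sketch.

```python
import time
from datetime import timedelta


class UptimeTracker:
    """Remember our own start moment instead of asking the OS for it."""

    def __init__(self, clock=time.monotonic):
        # time.monotonic() cannot go backwards and is unaffected by
        # changes to the system clock; clock is injectable for testing.
        self._clock = clock
        self._started = clock()

    def uptime(self) -> timedelta:
        # Difference between "now" and the moment this object was created.
        return timedelta(seconds=self._clock() - self._started)


# Create once at program start, then query whenever the uptime is needed.
tracker = UptimeTracker()
print(tracker.uptime())
```

Because the start moment is taken inside the program, a stop and start of the program always begins again from zero, regardless of what the process start time call reports.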

In discussion with Mark we found a workaround, so it will probably be implemented in the next release.
But he's away now, so it will take some time.
Hans