Welcome to the Cumulus Support forum.
Latest Cumulus MX V3 release 3.28.6 (build 3283) - 21 March 2024
Cumulus MX V4 beta test release 4.0.0 (build 4019) - 03 April 2024
Legacy Cumulus 1 release 1.9.4 (build 1099) - 28 November 2014
(a patch is available for 1.9.4 build 1099 that extends the date range of drop-down menus to 2030)
Download the Software (Cumulus MX / Cumulus 1 and other related items) from the Wiki
Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
Moderator: mcrossley
-
- Posts: 24
- Joined: Sun 22 Nov 2015 10:43 am
- Weather Station: Davis VP2 Plus
- Operating System: CentOS
- Location: Wolumla
- Contact:
Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
Hi there
So I was running version 3122 since it was released and I recently started experiencing problems with CumulusMX losing contact with the Davis VantagePro2 weather station I have which is connected via a USB data logger. In the console log, everything appears to work fine for a random period of time, and then suddenly the log starts listing "Data input appears to have stopped". There are no other error messages or clues in the log.
My instance of 3122 had been super stable for many many months - indeed sometimes CumulusMX would run for months! I typically update the OS and associated packages every 2 or so weeks, including making sure that Mono is updated to the latest version. If I restart the cumulusmx service, it immediately starts working again, sometimes for just a day, sometimes for many days.
I upgraded to the latest version of CumulusMX and the problem continues. Given that it was super stable, my guess is that perhaps something within the Mono package has changed which might be causing the problem?
I've also tried doing a complete reboot of the machine but the same random behaviour of losing contact continues.
Any insight on what might be causing the problem would be appreciated. I'm happy to provide additional information as needed.
- mcrossley
- Posts: 12767
- Joined: Thu 07 Jan 2010 9:44 pm
- Weather Station: Davis VP2/WLL
- Operating System: Bullseye Lite rPi
- Location: Wilmslow, Cheshire, UK
- Contact:
Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
Your profile says you are running CentOS, so you could check that USB power saving is disabled.
-
- Posts: 2477
- Joined: Wed 08 Jun 2011 11:19 am
- Weather Station: Davis Vantage Pro 2 + Ecowitt
- Operating System: GNU/Linux Ubuntu 22.04 LXC
- Location: Alcaston, Shropshire, UK
- Contact:
Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
Have you tried a different USB cable? It could be your existing one has become less reliable.
-
- Posts: 24
- Joined: Sun 22 Nov 2015 10:43 am
- Weather Station: Davis VP2 Plus
- Operating System: CentOS
- Location: Wolumla
- Contact:
Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
Thanks for the replies
Yes, initially I re-seated the USB cable and later I did try a different cable to be sure.
I am indeed running CentOS and after checking the control and autosuspend_delay_ms files for the USB port that the weather station is connected to, both are set to values which disable any power saving by the system or any application.
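For anyone wanting to repeat that check across every port at once, here is a minimal sketch. It assumes the usual Linux sysfs layout (`/sys/bus/usb/devices/<device>/power/`), where a `control` value of `on` keeps the device powered and a negative `autosuspend_delay_ms` prevents autosuspend. The function names are made up for illustration and are not part of CumulusMX.

```python
from pathlib import Path


def usb_power_settings(sysfs_root="/sys/bus/usb/devices"):
    """Return {device_name: (control, autosuspend_delay_ms)} for USB devices.

    Entries that lack either file, or whose files cannot be read, are skipped.
    """
    root = Path(sysfs_root)
    settings = {}
    if not root.is_dir():
        return settings  # not Linux, or sysfs is not mounted here
    for dev in sorted(root.iterdir()):
        control = dev / "power" / "control"
        delay = dev / "power" / "autosuspend_delay_ms"
        try:
            settings[dev.name] = (
                control.read_text().strip(),
                int(delay.read_text().strip()),
            )
        except (OSError, ValueError):
            continue  # file missing, unreadable, or not a number
    return settings


def autosuspend_disabled(control, delay_ms):
    """True if either setting stops the kernel from autosuspending the device."""
    return control == "on" or delay_ms < 0
```

Calling `usb_power_settings()` with no arguments and passing each pair to `autosuspend_disabled` shows at a glance whether any port is still allowed to suspend.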
-
- Posts: 24
- Joined: Sun 22 Nov 2015 10:43 am
- Weather Station: Davis VP2 Plus
- Operating System: CentOS
- Location: Wolumla
- Contact:
Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
The other thing that might be worth mentioning is that when CumulusMX is operating normally, the systemctl stop cumulusmx command executes very quickly (within a second). When contact with the station has been lost, the attempt to stop the service can take between 30 and 60 seconds. As mentioned previously, once stopped, the cumulusmx service starts again without any issues and without needing to reboot or kill any processes.
-
- Posts: 24
- Joined: Sun 22 Nov 2015 10:43 am
- Weather Station: Davis VP2 Plus
- Operating System: CentOS
- Location: Wolumla
- Contact:
Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
So a further update...
I happened to be working on the server when the data input appeared to stop. Looking around, one of the things I immediately noticed was that the Mono process was consuming all of the available CPU. When I stopped the CumulusMX process, the mono process also stopped and the CPU load returned to normal. When I restarted CumulusMX, the mono process also started (as expected) and was sitting at a more normal level of just a couple of percent.
There's definitely something weird going on...
-
- Posts: 19
- Joined: Sun 27 Aug 2017 2:46 pm
- Weather Station: Davis Vantage Vue
- Operating System: Windows 10
Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
Just a bump on this: I'm seeing the same behavior on my Davis Vantage Vue now with build 3160, something that started happening with one of the most recent 2-3 builds, around the same time you started seeing this. There's nothing else of note in the logs even with debugging turned on, and nothing sent to the console about it. Restarting Cumulus resolves the issue immediately. The USB device is configured never to go to sleep to save power and has never given me any trouble before a month or two ago.
Once it gets into the "data input appears to have stopped" state, it never recovers without restarting CumulusMX.
Here's all I got in the logs:
2022-01-08 16:16:26.213 SendLoopCommand: Starting - LOOP 50
2022-01-08 16:16:26.213 WakeVP: Not required
2022-01-08 16:16:26.213 SendLoopCommand: Sending command LOOP 50, attempt 1
2022-01-08 16:16:26.214 SendLoopCommand: Wait for ACK
2022-01-08 16:16:26.214 WaitForACK: Wait for ACK
2022-01-08 16:16:26.275 WaitForACK: ACK received
2022-01-08 16:16:26.276 LOOP: 1 - Data packet is good
[ then more packets all the way up to...]
2022-01-08 16:17:42.133 LOOP: 39 - Data packet is good
2022-01-08 16:19:00.221 *** Data input appears to have stopped
2022-01-08 16:20:00.270 *** Data input appears to have stopped
2022-01-08 16:21:00.241 *** Data input appears to have stopped
(and every minute after that until I restart it)
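To find the last thing logged before the first "stopped" message in a long log file, something like the sketch below may help. It assumes every log entry begins with a 23-character timestamp in the format shown above; the function name is hypothetical.

```python
from datetime import datetime

STOPPED = "Data input appears to have stopped"


def last_activity_before_stop(lines):
    """Return (last_line_before_stop, gap_in_seconds), or None if not found."""
    previous = None
    for line in lines:
        line = line.rstrip("\n")
        try:
            stamp = datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")
        except ValueError:
            continue  # continuation lines carry no timestamp of their own
        if STOPPED in line:
            if previous is None:
                return None  # the log starts with the message; nothing precedes it
            prev_stamp, prev_line = previous
            return prev_line, (stamp - prev_stamp).total_seconds()
        previous = (stamp, line)
    return None
```

Fed the excerpt above, it reports the `LOOP: 39` line as the last activity, roughly 78 seconds before the first warning.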
One thing of note is that it looks like outbound updates were getting sent at the same time that a previous query was being sent, but by the time this query ran, it doesn't look like it was stepping on anything:
2022-01-08 16:15:02.133 LOOP: 8 - Data packet is good
2022-01-08 16:15:03.704 Sending: N6OL>APRS,TCPIP*[stuff]
2022-01-08 16:15:04.136 LOOP: 9 - Data packet is good
2022-01-08 16:15:06.127 LOOP: 10 - Data packet is good
2022-01-08 16:15:06.716 End of CWOP update
2022-01-08 16:15:08.136 LOOP: 11 - Data packet is good
2022-01-08 16:15:10.126 LOOP: 12 - Data packet is good
[etc]
I wonder, though, if something isn't releasing a lock or a semaphore somewhere and then causing that update thread to block forever waiting on a lock...
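To illustrate the kind of failure I mean (this is not CumulusMX code, just a Python sketch with made-up names): if one thread takes a lock and never releases it, any other thread that waits on it without a timeout blocks forever, whereas a bounded wait at least lets the caller notice and report the problem.

```python
import threading

# Hypothetical lock guarding access to the station's serial port.
port_lock = threading.Lock()


def uploader_that_forgets_to_release():
    """Simulates a code path that takes the lock and exits without releasing."""
    port_lock.acquire()
    # ... an exception or early return here skips the release ...


def read_station(timeout_s):
    """Try to take the lock for a read, giving up after timeout_s seconds."""
    if not port_lock.acquire(timeout=timeout_s):
        return "timed out waiting for lock"
    try:
        return "read ok"
    finally:
        port_lock.release()
```

With an unbounded `acquire()` in `read_station`, the second case would hang silently, which would look a lot like what the log shows.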
- mcrossley
- Posts: 12767
- Joined: Thu 07 Jan 2010 9:44 pm
- Weather Station: Davis VP2/WLL
- Operating System: Bullseye Lite rPi
- Location: Wilmslow, Cheshire, UK
- Contact:
Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
Could you post a complete log file (with debug) please?
Odd, because MX is not seeing an error on the port - which would trigger a disconnect/reconnect.
I'll take a look at the log and see if I can figure out what is going on...
-
- Posts: 19
- Joined: Sun 27 Aug 2017 2:46 pm
- Weather Station: Davis Vantage Vue
- Operating System: Windows 10
Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
The log file I have is here: https://www.dropbox.com/s/iip4t60upn647 ... 8.txt?dl=0
I see I didn't have debug turned on for data access to the station, so I've enabled that now for the next time it happens. Once it does I'll be able to post that link as well.
- mcrossley
- Posts: 12767
- Joined: Thu 07 Jan 2010 9:44 pm
- Weather Station: Davis VP2/WLL
- Operating System: Bullseye Lite rPi
- Location: Wilmslow, Cheshire, UK
- Contact:
Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
OK, thanks, I have had a look at the code and have come up with a situation where this could happen. I've added a fix for that into the next release, plus some more logging when it does happen - at the moment it will fail silently which is what you are seeing.
Of course you will still have the underlying problem of the connection being lost periodically; that still needs looking at.
-
- Posts: 19
- Joined: Sun 27 Aug 2017 2:46 pm
- Weather Station: Davis Vantage Vue
- Operating System: Windows 10
Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
Got a little more detail now with data debugging turned on as well. The problem appears to be that the code is waiting for a response to a LOOP2 command (LPS 2 1), and this wait never returns or times out. The first log below shows the failure; the second shows what *usually* happens:
Code: Select all
2022-01-14 21:34:17.292 LOOP: 50 - Data packet is good
2022-01-14 21:34:17.293 SendLoopCommand: Starting - LPS 2 1
2022-01-14 21:34:17.293 WakeVP: Not required
2022-01-14 21:34:17.294 SendLoopCommand: Sending command LPS 2 1, attempt 1
2022-01-14 21:34:17.304 SendLoopCommand: Wait for ACK
2022-01-14 21:34:17.304 WaitForACK: Wait for ACK
2022-01-14 21:34:17.342 WaitForACK: ACK received
2022-01-14 21:34:17.342 LOOP2: Waiting for LOOP2 data
2022-01-14 21:35:00.209 DoLogFile: Writing log entry for 1/14/2022 9:35:00 PM
2022-01-14 21:35:00.209 DoLogFile: max gust: 1
2022-01-14 21:35:00.214 DoLogFile: log entry for 1/14/2022 9:35:00 PM written
2022-01-14 21:35:00.215 Writing today.ini, LastUpdateTime = 1/14/2022 9:35:00 PM raindaystart = 9.42 rain counter = 9.43
2022-01-14 21:35:00.220 Windy: URL = https://stations.windy.com/pws/update/<<API_KEY>>?station=0&dateutc=2022-01-15+05:35:00&winddir=214&wind=0.0&gust=0.4&temp=10.0&precip=0.00&pressure=10.2208&dewpoint=7.4&humidity=84
2022-01-14 21:35:00.221 http://www.pwsweather.com/pwsupdate/pwsupdate.php?ID=KCASANMA28&PASSWORD=********&dateutc=2022-01-15+05%3A35%3A00&winddir=214&windspeedmph=0.1&windgustmph=1.0&humidity=84&tempf=50.0&rainin=0.00&dailyrainin=0.01&baromin=30.185&dewptf=45.4&softwaretype=Cumulus%20v3.14.1&action=updateraw
2022-01-14 21:35:00.221 Updating CWOP
2022-01-14 21:35:00.362 Windy: ERROR - An error occurred while sending the request.
2022-01-14 21:35:00.439 PWS Response: OK: <html lang="en">
<head>
<title>PWS Weather Station Update</title>
</head>
<body>
Data Logged and posted in METAR mirror.
</body>
</html>
2022-01-14 21:35:00.458 Sending user and pass to CWOP
2022-01-14 21:35:03.461 Sending: N6OL>APRS,TCPIP*:@150535z3734.83N/12218.93W_214/000g001t050r000p001P001h84b10218eCumulusDsVP
2022-01-14 21:35:06.469 End of CWOP update
2022-01-14 21:36:00.171 *** Data input appears to have stopped
2022-01-14 21:37:00.147 *** Data input appears to have stopped
2022-01-14 21:38:00.188 *** Data input appears to have stopped
Code: Select all
2022-01-14 21:32:39.288 SendLoopCommand: Starting - LPS 2 1
2022-01-14 21:32:39.288 WakeVP: Not required
2022-01-14 21:32:39.289 SendLoopCommand: Sending command LPS 2 1, attempt 1
2022-01-14 21:32:39.298 SendLoopCommand: Wait for ACK
2022-01-14 21:32:39.298 WaitForACK: Wait for ACK
2022-01-14 21:32:39.339 WaitForACK: ACK received
2022-01-14 21:32:39.339 LOOP2: Waiting for LOOP2 data
2022-01-14 21:32:39.374 LOOP2: Data packet is good
2022-01-14 21:32:39.374 LOOP2: Data - 4C-4F-4F-14-01-FF-7F-E6-75-C3-02-36-F5-01-01-FF-D6-00-00-00-00-00-01-00-D6-00-FF-7F-FF-7F-2E-00-FF-54-FF-32-00-32-00-FF-7F-00-00-FF-FF-7F-00-00-FF-FF-01-00-00-00-00-00-00-00-01-00-02-08-00-C8-FF-D3-75-DB-75-E6-75-FF-09-0F-17-08-0A-0E-0B-09-18-0B-0B-FF-7F-FF-7F-FF-7F-FF-7F-FF-7F-FF-7F-0A-0D-84-B1
2022-01-14 21:32:39.374 LOOP2: 10-min gust: 1
There was also an error in there communicating with Windy, but I somewhat suspect that's unrelated. I'd say the root cause is that "Waiting for LOOP2 data" doesn't behave as though it has a timeout.
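By "behave as though it has a timeout" I mean something along these lines. This is only a Python sketch of a deadline-bounded read, not how MX actually reads the port; `read_chunk` is a hypothetical non-blocking reader, and the packet size is left as a parameter.

```python
import time


def read_exact(read_chunk, size, timeout_s, poll_s=0.01):
    """Collect exactly `size` bytes, or raise TimeoutError at the deadline.

    read_chunk(n) is assumed to return up to n bytes without blocking,
    or b"" when nothing has arrived yet.
    """
    deadline = time.monotonic() + timeout_s
    buf = bytearray()
    while len(buf) < size:
        chunk = read_chunk(size - len(buf))
        if chunk:
            buf.extend(chunk)
            continue
        if time.monotonic() >= deadline:
            raise TimeoutError(f"got {len(buf)} of {size} bytes before timeout")
        time.sleep(poll_s)  # avoid spinning the CPU while waiting
    return bytes(buf)
```

A caller that catches the `TimeoutError` can log it and reconnect or resend the command, rather than waiting indefinitely for a packet that never arrives.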
-
- Posts: 19
- Joined: Sun 27 Aug 2017 2:46 pm
- Weather Station: Davis Vantage Vue
- Operating System: Windows 10
Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
Yeah, sadly I think these Silicon Labs UART/USB chips are not the most reliable things on the planet, but some time this weekend I will locate a new cable and route it directly to the PC, without any hub in the chain, and see if that helps at all. It may not help, but it's good to eliminate the obvious things.
- mcrossley
- Posts: 12767
- Joined: Thu 07 Jan 2010 9:44 pm
- Weather Station: Davis VP2/WLL
- Operating System: Bullseye Lite rPi
- Location: Wilmslow, Cheshire, UK
- Contact:
Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
Thanks, I'll take a look at the LOOP2 handling.
-
- Posts: 19
- Joined: Sun 27 Aug 2017 2:46 pm
- Weather Station: Davis Vantage Vue
- Operating System: Windows 10
Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
Of course, after I posted an update to the OOME thread, I got an alarm about data input stopping again.
It looks like the alarm was actually spurious though. As far as I can tell from the log below, data input did not actually stop; it got packet 2, which was fine, immediately reported that data input had stopped, then kept reading data after that. The console also continues to report that the data is up-to-date. It made it all the way through packet 50 after this, then started a new batch of 50 right after that; everything appears to be continuing normally. Data in the dashboard is continuing to update as you'd expect.
Here's the relevant bit from the logs:
Code: Select all
2022-01-29 13:16:25.104 SendLoopCommand: Starting - LOOP 50
2022-01-29 13:16:25.104 WakeVP: Not required
2022-01-29 13:16:25.105 SendLoopCommand: Sending command LOOP 50, attempt 1
2022-01-29 13:16:25.105 SendLoopCommand: Wait for ACK
2022-01-29 13:16:25.105 WaitForACK: Wait for ACK
2022-01-29 13:16:25.165 WaitForACK: ACK received
2022-01-29 13:16:25.166 LOOP: Data - 1: 4C-4F-4F-EC-00-5F-09-FD-75-CB-02-35-36-02-04-03-24-00-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-47-FF-FF-FF-FF-FF-FF-FF-00-00-FF-FF-7F-00-00-FF-FF-00-00-21-00-B2-03-00-00-00-00-00-00-FF-FF-FF-FF-FF-FF-FF-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-16-03-06-0C-CB-02-C2-06-0A-0D-7C-19
2022-01-29 13:16:25.166 LOOP: 1 - Data packet is good
2022-01-29 13:16:25.232 LOOP: Data - 2: 4C-4F-4F-EC-00-5F-09-FD-75-CB-02-35-36-02-04-03-24-00-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-47-FF-FF-FF-FF-FF-FF-FF-00-00-FF-FF-7F-00-00-FF-FF-00-00-21-00-B2-03-00-00-00-00-00-00-FF-FF-FF-FF-FF-FF-FF-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-16-03-06-0C-CB-02-C2-06-0A-0D-7C-19
2022-01-29 13:16:25.232 LOOP: 2 - Data packet is good
2022-01-29 13:16:25.337 PWS Response: OK: <html lang="en">
<head>
<title>PWS Weather Station Update</title>
</head>
<body>
Data Logged and posted in METAR mirror.
</body>
</html>
2022-01-29 13:16:25.356 SendEmail: Waiting for lock...
2022-01-29 13:16:25.356 SendEmail: Has the lock
2022-01-29 13:16:25.386 SendEmail: Sending email, to [redacted], subject [Cumulus MX Alarm], body ["A Cumulus MX alarm has been triggered.\r\nCumulus has stopped receiving data from y" +
"our weather station."]...
2022-01-29 13:16:25.497 *** Data input appears to have stopped
2022-01-29 13:16:25.765 SendEmail: Releasing lock...
2022-01-29 13:16:27.064 LOOP: Data - 3: 4C-4F-4F-EC-00-5F-09-FD-75-CB-02-35-36-02-04-03-24-00-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-47-FF-FF-FF-FF-FF-FF-FF-00-00-FF-FF-7F-00-00-FF-FF-00-00-21-00-B2-03-00-00-00-00-00-00-FF-FF-FF-FF-FF-FF-FF-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-16-03-06-0C-CB-02-C2-06-0A-0D-7C-19
2022-01-29 13:16:27.064 LOOP: 3 - Data packet is good
2022-01-29 13:16:29.060 LOOP: Data - 4: 4C-4F-4F-EC-00-5F-09-FD-75-CB-02-35-36-02-04-03-24-00-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-47-FF-FF-FF-FF-FF-FF-FF-00-00-FF-FF-7F-00-00-FF-FF-00-00-21-00-B2-03-00-00-00-00-00-00-FF-FF-FF-FF-FF-FF-FF-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-16-03-06-0C-CB-02-C2-06-0A-0D-7C-19
Could this be a side effect of the LOOP2 changes in build 3162 to catch port errors; i.e., it recovered silently from a read error, but sent an alarm anyway?
-
- Posts: 19
- Joined: Sun 27 Aug 2017 2:46 pm
- Weather Station: Davis Vantage Vue
- Operating System: Windows 10
Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection
After building a new PC and then discovering incidentally that the Western Digital Security software causes massive problems for all USB devices on the system and subsequently getting rid of it forever, I'm pleased to report this problem has gone away entirely.
If you have similar problems, check to see if you have Western Digital security tools installed on your system (comes along with their auto-unlocker), and if you do, stop the service when you're not in dire need of using it, or uninstall it forever. It's really harmful to system stability.