Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

senapsys
Posts: 24
Joined: Sun 22 Nov 2015 10:43 am
Weather Station: Davis VP2 Plus
Operating System: CentOS
Location: Wolumla
Contact:

Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

Post by senapsys »

Hi there

I had been running build 3122 since it was released, and recently CumulusMX started losing contact with my Davis VantagePro2 weather station, which is connected via a USB data logger. In the console log, everything appears to work fine for a random period of time, and then the log suddenly starts listing "Data input appears to have stopped". There are no other error messages or clues in the log.

My instance of 3122 had been super stable for many months; indeed, CumulusMX would sometimes run for months without a restart. I typically update the OS and associated packages every two weeks or so, including updating Mono to the latest version. If I restart the cumulusmx service, it immediately starts working again, sometimes for just a day, sometimes for many days.

I upgraded to the latest version of CumulusMX and the problem continues. Given that it had been so stable, my guess is that something in the Mono package has changed and might be causing the problem.

I've also tried doing a complete reboot of the machine but the same random behaviour of losing contact continues.

Any insight on what might be causing the problem would be appreciated. I'm happy to provide additional information as needed.
mcrossley
Posts: 12756
Joined: Thu 07 Jan 2010 9:44 pm
Weather Station: Davis VP2/WLL
Operating System: Bullseye Lite rPi
Location: Wilmslow, Cheshire, UK
Contact:

Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

Post by mcrossley »

Your profile says you are running CentOS; you could check that USB power saving is disabled.
freddie
Posts: 2471
Joined: Wed 08 Jun 2011 11:19 am
Weather Station: Davis Vantage Pro 2 + Ecowitt
Operating System: GNU/Linux Ubuntu 22.04 LXC
Location: Alcaston, Shropshire, UK
Contact:

Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

Post by freddie »

Have you tried a different USB cable? It could be your existing one has become less reliable.
Freddie
senapsys
Posts: 24
Joined: Sun 22 Nov 2015 10:43 am
Weather Station: Davis VP2 Plus
Operating System: CentOS
Location: Wolumla
Contact:

Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

Post by senapsys »

Thanks for the replies :-)

Yes, initially I re-seated the USB cable and later I did try a different cable to be sure.

I am indeed running CentOS and after checking the control and autosuspend_delay_ms files for the USB port that the weather station is connected to, both are set to values which disable any power saving by the system or any application.
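For anyone wanting to repeat that check, here is a minimal sketch, assuming the usual Linux sysfs layout under /sys/bus/usb/devices. The function names are made up for illustration. A `control` value of `on`, or a negative `autosuspend_delay_ms`, keeps the device from being autosuspended.

```python
from pathlib import Path


def usb_power_settings(root="/sys/bus/usb/devices"):
    """Return {device: (control, autosuspend_delay_ms)} for each USB device found."""
    settings = {}
    root = Path(root)
    if not root.is_dir():  # not Linux, or sysfs not mounted
        return settings
    for device in sorted(root.iterdir()):
        power_dir = device / "power"
        if not power_dir.is_dir():
            continue
        values = []
        for name in ("control", "autosuspend_delay_ms"):
            try:
                values.append((power_dir / name).read_text().strip())
            except OSError:
                values.append(None)  # attribute missing or unreadable
        settings[device.name] = tuple(values)
    return settings


def autosuspend_disabled(control, delay_ms):
    """True when either setting keeps the device permanently powered."""
    if control == "on":
        return True
    try:
        return int(delay_ms) < 0
    except (TypeError, ValueError):
        return False


if __name__ == "__main__":
    for device, (control, delay) in usb_power_settings().items():
        state = "disabled" if autosuspend_disabled(control, delay) else "ENABLED"
        print(f"{device}: control={control} autosuspend_delay_ms={delay} -> autosuspend {state}")
```

This only reports the settings; it does not change them, and it will not tell you which entry is the weather station's data logger.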
senapsys
Posts: 24
Joined: Sun 22 Nov 2015 10:43 am
Weather Station: Davis VP2 Plus
Operating System: CentOS
Location: Wolumla
Contact:

Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

Post by senapsys »

The other thing that might be worth mentioning is that when CumulusMX is operating normally, the systemctl stop cumulusmx command executes very quickly (within a second). When contact with the station has been lost, the attempt to stop the service can take between 30 and 60 seconds. As mentioned previously, once stopped, the cumulusmx service starts again without any issues and without needing to reboot or kill any processes.
senapsys
Posts: 24
Joined: Sun 22 Nov 2015 10:43 am
Weather Station: Davis VP2 Plus
Operating System: CentOS
Location: Wolumla
Contact:

Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

Post by senapsys »

So a further update...

I happened to be working on the server when the data input appeared to stop. Looking around, one of the things I immediately noticed was that the Mono process was consuming all of the available CPU. When I stopped the CumulusMX process, the mono process also stopped and the CPU load returned to normal. When I restarted CumulusMX, the mono process also started (as expected) and was sitting at a more normal level of just a couple of percent.

There's definitely something weird going on...
spatula
Posts: 19
Joined: Sun 27 Aug 2017 2:46 pm
Weather Station: Davis Vantage Vue
Operating System: Windows 10

Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

Post by spatula »

Just a bump on this: I'm seeing the same behavior on my Davis Vantage Vue now with build 3160, something that started happening with one of the most recent 2-3 builds, around the same time you started seeing this. There's nothing else of note in the logs even with debugging turned on, and nothing sent to the console about it. Restarting Cumulus resolves the issue immediately. The USB device is configured never to go to sleep to save power and has never given me any trouble before a month or two ago.

Once it gets into the "data input appears to have stopped" state, it never recovers without restarting CumulusMX.

Here's all I got in the logs:

2022-01-08 16:16:26.213 SendLoopCommand: Starting - LOOP 50
2022-01-08 16:16:26.213 WakeVP: Not required
2022-01-08 16:16:26.213 SendLoopCommand: Sending command LOOP 50, attempt 1
2022-01-08 16:16:26.214 SendLoopCommand: Wait for ACK
2022-01-08 16:16:26.214 WaitForACK: Wait for ACK
2022-01-08 16:16:26.275 WaitForACK: ACK received
2022-01-08 16:16:26.276 LOOP: 1 - Data packet is good
[ then more packets all the way up to...]
2022-01-08 16:17:42.133 LOOP: 39 - Data packet is good
2022-01-08 16:19:00.221 *** Data input appears to have stopped
2022-01-08 16:20:00.270 *** Data input appears to have stopped
2022-01-08 16:21:00.241 *** Data input appears to have stopped
(and every minute after that until I restart it)
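To pin down when a stall begins in a long log, a small script can look for the last good packet before the first alarm. This is only a sketch: it assumes the timestamp format and message text seen in the excerpt above, and `find_stall` is a made-up helper name, not part of Cumulus MX.

```python
import re
from datetime import datetime

# Matches lines such as "2022-01-08 16:17:42.133 LOOP: 39 - Data packet is good"
LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) (.*)$")


def find_stall(lines):
    """Return (last_good_time, first_alarm_time), or None if no alarm is logged."""
    last_good = None
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # continuation lines (e.g. HTML responses) have no timestamp
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        message = match.group(2)
        if "Data packet is good" in message:
            last_good = stamp
        elif "Data input appears to have stopped" in message:
            return last_good, stamp
    return None


sample = [
    "2022-01-08 16:17:42.133 LOOP: 39 - Data packet is good",
    "2022-01-08 16:19:00.221 *** Data input appears to have stopped",
    "2022-01-08 16:20:00.270 *** Data input appears to have stopped",
]

if __name__ == "__main__":
    good, alarm = find_stall(sample)
    print(f"last good packet: {good}, first alarm: {alarm}")
    print(f"gap: {(alarm - good).total_seconds():.3f} s")
```

On the three sample lines the gap comes out at a little over 78 seconds between packet 39 and the first alarm.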

One thing of note is that it looks like outbound updates were getting sent at the same time that a previous query was being sent, but by the time this query ran, it doesn't look like it was stepping on anything:

2022-01-08 16:15:02.133 LOOP: 8 - Data packet is good
2022-01-08 16:15:03.704 Sending: N6OL>APRS,TCPIP*[stuff]
2022-01-08 16:15:04.136 LOOP: 9 - Data packet is good
2022-01-08 16:15:06.127 LOOP: 10 - Data packet is good
2022-01-08 16:15:06.716 End of CWOP update
2022-01-08 16:15:08.136 LOOP: 11 - Data packet is good
2022-01-08 16:15:10.126 LOOP: 12 - Data packet is good
[etc]

I wonder, though, if something isn't releasing a lock or a semaphore somewhere and then causing that update thread to block forever waiting on a lock...
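To illustrate that guess (and it is only a guess; this is not how Cumulus MX is written), here is a sketch in Python of taking a lock with a timeout, so a lock that is never released produces an error instead of blocking silently forever. `port_lock` and `read_with_lock` are hypothetical names.

```python
import threading

# Hypothetical lock guarding access to the station's serial port.
port_lock = threading.Lock()


def read_with_lock(read_fn, timeout_s=5.0):
    """Run read_fn() while holding port_lock, giving up if the lock stays busy."""
    if not port_lock.acquire(timeout=timeout_s):
        # Another thread never released the lock: fail loudly rather than hang.
        raise TimeoutError(f"port lock not released within {timeout_s} s")
    try:
        return read_fn()
    finally:
        port_lock.release()  # always release, even if read_fn() raises
```

With a plain blocking acquire, the symptom would look much like the log above: the reader thread simply goes quiet and nothing is logged.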
mcrossley
Posts: 12756
Joined: Thu 07 Jan 2010 9:44 pm
Weather Station: Davis VP2/WLL
Operating System: Bullseye Lite rPi
Location: Wilmslow, Cheshire, UK
Contact:

Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

Post by mcrossley »

Could you post a complete log file (with debug) please?

Odd, because MX is not seeing an error on the port - which would trigger a disconnect/reconnect.

I'll take a look at the log and see if I can figure out what is going on...
spatula
Posts: 19
Joined: Sun 27 Aug 2017 2:46 pm
Weather Station: Davis Vantage Vue
Operating System: Windows 10

Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

Post by spatula »

The log file I have is here: https://www.dropbox.com/s/iip4t60upn647 ... 8.txt?dl=0

I see I didn't have debug turned on for data access to the station, so I've enabled that now for the next time it happens. Once it does I'll be able to post that link as well.
mcrossley
Posts: 12756
Joined: Thu 07 Jan 2010 9:44 pm
Weather Station: Davis VP2/WLL
Operating System: Bullseye Lite rPi
Location: Wilmslow, Cheshire, UK
Contact:

Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

Post by mcrossley »

OK, thanks. I have had a look at the code and have come up with a situation where this could happen. I've added a fix for that into the next release, plus some more logging when it does happen. At the moment it will fail silently, which is what you are seeing.

Of course you will still have the underlying problem of the connection being lost periodically; that still needs looking at.
spatula
Posts: 19
Joined: Sun 27 Aug 2017 2:46 pm
Weather Station: Davis Vantage Vue
Operating System: Windows 10

Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

Post by spatula »

Got a little more detail now with data debugging turned on as well. The problem appears to be that the code is waiting for the LOOP2 data that should follow an LPS 2 1 command (a request for a single LOOP2 packet), and this wait never returns or times out:

2022-01-14 21:34:17.292 LOOP: 50 - Data packet is good
2022-01-14 21:34:17.293 SendLoopCommand: Starting - LPS 2 1
2022-01-14 21:34:17.293 WakeVP: Not required
2022-01-14 21:34:17.294 SendLoopCommand: Sending command LPS 2 1,  attempt 1
2022-01-14 21:34:17.304 SendLoopCommand: Wait for ACK
2022-01-14 21:34:17.304 WaitForACK: Wait for ACK
2022-01-14 21:34:17.342 WaitForACK: ACK received
2022-01-14 21:34:17.342 LOOP2: Waiting for LOOP2 data
2022-01-14 21:35:00.209 DoLogFile: Writing log entry for 1/14/2022 9:35:00 PM
2022-01-14 21:35:00.209 DoLogFile: max gust: 1
2022-01-14 21:35:00.214 DoLogFile: log entry for 1/14/2022 9:35:00 PM written
2022-01-14 21:35:00.215 Writing today.ini, LastUpdateTime = 1/14/2022 9:35:00 PM raindaystart = 9.42 rain counter = 9.43
2022-01-14 21:35:00.220 Windy: URL = https://stations.windy.com/pws/update/<<API_KEY>>?station=0&dateutc=2022-01-15+05:35:00&winddir=214&wind=0.0&gust=0.4&temp=10.0&precip=0.00&pressure=10.2208&dewpoint=7.4&humidity=84
2022-01-14 21:35:00.221 http://www.pwsweather.com/pwsupdate/pwsupdate.php?ID=KCASANMA28&PASSWORD=********&dateutc=2022-01-15+05%3A35%3A00&winddir=214&windspeedmph=0.1&windgustmph=1.0&humidity=84&tempf=50.0&rainin=0.00&dailyrainin=0.01&baromin=30.185&dewptf=45.4&softwaretype=Cumulus%20v3.14.1&action=updateraw
2022-01-14 21:35:00.221 Updating CWOP
2022-01-14 21:35:00.362 Windy: ERROR - An error occurred while sending the request.
2022-01-14 21:35:00.439 PWS Response: OK: <html lang="en">
<head>
    <title>PWS Weather Station Update</title>
</head>
<body>
Data Logged and posted in METAR mirror.
</body>
</html>
2022-01-14 21:35:00.458 Sending user and pass to CWOP
2022-01-14 21:35:03.461 Sending: N6OL>APRS,TCPIP*:@150535z3734.83N/12218.93W_214/000g001t050r000p001P001h84b10218eCumulusDsVP
2022-01-14 21:35:06.469 End of CWOP update
2022-01-14 21:36:00.171 *** Data input appears to have stopped
2022-01-14 21:37:00.147 *** Data input appears to have stopped
2022-01-14 21:38:00.188 *** Data input appears to have stopped

What *usually* happens is this:

2022-01-14 21:32:39.288 SendLoopCommand: Starting - LPS 2 1
2022-01-14 21:32:39.288 WakeVP: Not required
2022-01-14 21:32:39.289 SendLoopCommand: Sending command LPS 2 1,  attempt 1
2022-01-14 21:32:39.298 SendLoopCommand: Wait for ACK
2022-01-14 21:32:39.298 WaitForACK: Wait for ACK
2022-01-14 21:32:39.339 WaitForACK: ACK received
2022-01-14 21:32:39.339 LOOP2: Waiting for LOOP2 data
2022-01-14 21:32:39.374 LOOP2: Data packet is good
2022-01-14 21:32:39.374 LOOP2: Data - 4C-4F-4F-14-01-FF-7F-E6-75-C3-02-36-F5-01-01-FF-D6-00-00-00-00-00-01-00-D6-00-FF-7F-FF-7F-2E-00-FF-54-FF-32-00-32-00-FF-7F-00-00-FF-FF-7F-00-00-FF-FF-01-00-00-00-00-00-00-00-01-00-02-08-00-C8-FF-D3-75-DB-75-E6-75-FF-09-0F-17-08-0A-0E-0B-09-18-0B-0B-FF-7F-FF-7F-FF-7F-FF-7F-FF-7F-FF-7F-0A-0D-84-B1
2022-01-14 21:32:39.374 LOOP2: 10-min gust: 1
(etc)
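For reference, a "good" packet here presumably means one whose checksum verifies. The sketch below assumes the Davis serial protocol's CRC-CCITT (polynomial 0x1021, initial value 0, CRC bytes sent high byte first), under which running the CRC over the whole 99-byte packet including its two trailing CRC bytes gives zero. The function names are made up; this is not the Cumulus MX code.

```python
def crc16_ccitt(data, crc=0):
    """Bitwise CRC-16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def packet_is_good(hex_dump):
    """Check a dash-separated hex dump, as printed in the debug log lines."""
    packet = bytes.fromhex(hex_dump.replace("-", ""))
    # With the CRC appended high byte first, the CRC of the whole packet is 0.
    return crc16_ccitt(packet) == 0


if __name__ == "__main__":
    payload = b"LOO" + bytes(range(10))           # made-up payload, not a real packet
    framed = payload + crc16_ccitt(payload).to_bytes(2, "big")
    print(packet_is_good(framed.hex("-")))
```

Pasting the hex string from a "LOOP2: Data" line into `packet_is_good` should, under those assumptions, agree with what the log reports.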

There was also an error in there communicating with Windy, but I somewhat suspect that's unrelated. I'd say the root cause is that "Waiting for LOOP2 data" doesn't behave as though it has a timeout.
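As a rough illustration of what a bounded wait could look like (a Python sketch, not the actual Cumulus MX code), the read below gives up once a deadline passes instead of waiting indefinitely. `read_exact` and `ReadTimeout` are hypothetical names, and `read_some` stands in for whatever non-blocking read the serial port offers.

```python
import time


class ReadTimeout(Exception):
    """Raised when the expected number of bytes does not arrive in time."""


def read_exact(read_some, size, timeout_s, poll_s=0.01):
    """Collect exactly `size` bytes from read_some(n), or raise ReadTimeout.

    read_some(n) is assumed to return up to n bytes straight away,
    or b"" when nothing is available yet.
    """
    deadline = time.monotonic() + timeout_s
    buf = bytearray()
    while len(buf) < size:
        if time.monotonic() >= deadline:
            raise ReadTimeout(f"got {len(buf)} of {size} bytes in {timeout_s} s")
        chunk = read_some(size - len(buf))
        if chunk:
            buf.extend(chunk)
        else:
            time.sleep(poll_s)  # nothing yet; avoid spinning the CPU
    return bytes(buf)
```

A caller waiting for the 99-byte LOOP2 packet could catch `ReadTimeout`, log it, and resend the command or reconnect, rather than sitting in the "Waiting for LOOP2 data" state.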
spatula
Posts: 19
Joined: Sun 27 Aug 2017 2:46 pm
Weather Station: Davis Vantage Vue
Operating System: Windows 10

Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

Post by spatula »

mcrossley wrote: Thu 13 Jan 2022 6:51 pm Of course you will still have the underlying problem of the connection being lost periodically, that still needs looking at.
Yeah, sadly I think these Silicon Labs UART/USB chips are not the most reliable things on the planet, but some time this weekend I will locate a new cable and route it directly to the PC, without any hub in the chain, and see if that helps at all. It may not help, but it's good to eliminate the obvious things.
mcrossley
Posts: 12756
Joined: Thu 07 Jan 2010 9:44 pm
Weather Station: Davis VP2/WLL
Operating System: Bullseye Lite rPi
Location: Wilmslow, Cheshire, UK
Contact:

Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

Post by mcrossley »

Thanks, I'll take a look at the LOOP2 handling.
spatula
Posts: 19
Joined: Sun 27 Aug 2017 2:46 pm
Weather Station: Davis Vantage Vue
Operating System: Windows 10

Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

Post by spatula »

Of course after I posted an update to the OOME thread, I got an alarm about data input stopping again :lol:

It looks like the alarm was actually spurious though. Here's the relevant bit from the logs:

2022-01-29 13:16:25.104 SendLoopCommand: Starting - LOOP 50
2022-01-29 13:16:25.104 WakeVP: Not required
2022-01-29 13:16:25.105 SendLoopCommand: Sending command LOOP 50,  attempt 1
2022-01-29 13:16:25.105 SendLoopCommand: Wait for ACK
2022-01-29 13:16:25.105 WaitForACK: Wait for ACK
2022-01-29 13:16:25.165 WaitForACK: ACK received
2022-01-29 13:16:25.166 LOOP: Data - 1: 4C-4F-4F-EC-00-5F-09-FD-75-CB-02-35-36-02-04-03-24-00-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-47-FF-FF-FF-FF-FF-FF-FF-00-00-FF-FF-7F-00-00-FF-FF-00-00-21-00-B2-03-00-00-00-00-00-00-FF-FF-FF-FF-FF-FF-FF-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-16-03-06-0C-CB-02-C2-06-0A-0D-7C-19
2022-01-29 13:16:25.166 LOOP: 1 - Data packet is good
2022-01-29 13:16:25.232 LOOP: Data - 2: 4C-4F-4F-EC-00-5F-09-FD-75-CB-02-35-36-02-04-03-24-00-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-47-FF-FF-FF-FF-FF-FF-FF-00-00-FF-FF-7F-00-00-FF-FF-00-00-21-00-B2-03-00-00-00-00-00-00-FF-FF-FF-FF-FF-FF-FF-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-16-03-06-0C-CB-02-C2-06-0A-0D-7C-19
2022-01-29 13:16:25.232 LOOP: 2 - Data packet is good
2022-01-29 13:16:25.337 PWS Response: OK: <html lang="en">
<head>
    <title>PWS Weather Station Update</title>
</head>
<body>
Data Logged and posted in METAR mirror.
</body>
</html>
2022-01-29 13:16:25.356 SendEmail: Waiting for lock...
2022-01-29 13:16:25.356 SendEmail: Has the lock
2022-01-29 13:16:25.386 SendEmail: Sending email, to [redacted], subject [Cumulus MX Alarm], body ["A Cumulus MX alarm has been triggered.\r\nCumulus has stopped receiving data from y" +
    "our weather station."]...
2022-01-29 13:16:25.497 *** Data input appears to have stopped
2022-01-29 13:16:25.765 SendEmail: Releasing lock...
2022-01-29 13:16:27.064 LOOP: Data - 3: 4C-4F-4F-EC-00-5F-09-FD-75-CB-02-35-36-02-04-03-24-00-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-47-FF-FF-FF-FF-FF-FF-FF-00-00-FF-FF-7F-00-00-FF-FF-00-00-21-00-B2-03-00-00-00-00-00-00-FF-FF-FF-FF-FF-FF-FF-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-16-03-06-0C-CB-02-C2-06-0A-0D-7C-19
2022-01-29 13:16:27.064 LOOP: 3 - Data packet is good
2022-01-29 13:16:29.060 LOOP: Data - 4: 4C-4F-4F-EC-00-5F-09-FD-75-CB-02-35-36-02-04-03-24-00-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-FF-47-FF-FF-FF-FF-FF-FF-FF-00-00-FF-FF-7F-00-00-FF-FF-00-00-21-00-B2-03-00-00-00-00-00-00-FF-FF-FF-FF-FF-FF-FF-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-16-03-06-0C-CB-02-C2-06-0A-0D-7C-19

As far as I can tell from the log, data input did not actually stop: it got packet 2, which was fine, immediately reported that data input had stopped, then kept reading data after that. The console also continues to report that the data is up to date. It made it all the way through packet 50 after this, then started a new batch of 50 right after that. Everything appears to be continuing normally, and data in the dashboard is updating as you'd expect.

Could this be a side effect of the LOOP2 changes in build 3162 to catch port errors, i.e., it recovered silently from a read error but sent an alarm anyway?
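To show what I'd expect the alarm condition to depend on, here is a toy watchdog in Python that only reports a stall when no good packet has arrived for a set period. It is a sketch under my own assumptions (the 60-second threshold and the class name are invented), not a description of how MX decides to raise the alarm.

```python
class DataWatchdog:
    """Flags a stall only when no good packet has been seen for stall_after_s."""

    def __init__(self, stall_after_s=60.0):
        self.stall_after_s = stall_after_s
        self.last_good = None  # time of the most recent good packet, in seconds

    def packet_received(self, now):
        """Record that a good packet arrived at time `now`."""
        self.last_good = now

    def is_stalled(self, now):
        """True if data has been seen before and has since gone quiet too long."""
        if self.last_good is None:
            return False  # nothing received yet; treat startup separately
        return now - self.last_good > self.stall_after_s


if __name__ == "__main__":
    wd = DataWatchdog()
    wd.packet_received(now=0.0)
    print(wd.is_stalled(now=0.3))   # a fraction of a second after a good packet
    print(wd.is_stalled(now=61.0))  # more than a minute of silence
```

In the log above the alarm fired about a quarter of a second after packet 2 was accepted, which a check like this would not treat as a stall.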
spatula
Posts: 19
Joined: Sun 27 Aug 2017 2:46 pm
Weather Station: Davis Vantage Vue
Operating System: Windows 10

Re: Data Input Appears to Have Stopped - Davis VantagePro2 with a USB connection

Post by spatula »

After building a new PC, I discovered incidentally that the Western Digital Security software causes massive problems for all USB devices on the system. Since getting rid of it for good, I'm pleased to report this problem has gone away entirely.

If you have similar problems, check whether you have the Western Digital security tools installed on your system (they come along with the auto-unlocker). If you do, stop the service when you don't need it, or uninstall it altogether. It's really harmful to system stability.