
Too many open files

Posted: Thu 25 Feb 2021 2:00 pm
by stewartwlewis
Hi
I have CumulusMX release b3107 running on a raspberry pi 3 model B using Buster with Mono 5.18.
CumulusMX runs fine for weeks, then I get a "Too many open files" error, followed by more and more errors of the same kind and System.IO errors, until nothing works properly.
I replaced the SD card with a new 32GB one, so there is plenty of space (29GB free), moved CumulusMX to running as a service, and made sure it is running an up-to-date OS, Mono and CumulusMX version, but I still get the same error.
My weather site is running on GoDaddy shared services and I am using less than 2% of the allowed storage and files. I also have a different RPi on the same network which updates another database (for my garden data), also running on GoDaddy. This updates every 2 minutes, and those logs show no interruptions in service during the same period CumulusMX has the "Too many open files" errors.
I understand this is an OS error even though it looks like an FTP error in the logs.
Any help on how to fix this issue is welcome. Should I just reboot the RPi every day or week?

I have attached the relevant logs, both run and FTP. The first error happened at 2021-02-24 23:30:17.72.

Thanks

Re: Too many open files

Posted: Thu 25 Feb 2021 3:00 pm
by mcrossley
It does look like the FTP uploads start the problem, but the MX code does seem to close both the local and remote files correctly, even on error. After a while it affects the local files as well, so it appears the failed FTP is also leaving the local file open despite MX requesting it be closed. I'm not sure what to suggest, to be honest.

As an aside, your AWEKAS uploads are failing due to being rate limited; unless you have a web file or Pro account, the minimum upload period is 5 minutes.

Re: Too many open files

Posted: Thu 25 Feb 2021 4:11 pm
by galfert
AWEKAS allows a 1-minute upload interval if you opt in to the free Plus account; otherwise you are limited to 5-minute uploads. There is also a higher, paid tier called StationsWeb that allows 15-second uploads and also gives you a nicer custom website.

To recap, AWEKAS has three service levels:
- StationsWeb: paid subscription that allows 15-second uploads and a custom website
- Plus: free, but you must opt in to allow data use according to the terms; 1-minute uploads and other features*
- Standard: free, with 5-minute uploads; your data will not be used for commercial purposes

*Details on free Plus account:
https://www.awekas.at/for2/index.php?th ... -features/

Re: Too many open files

Posted: Thu 25 Feb 2021 7:46 pm
by stewartwlewis
Thanks for the AWEKAS information - I will fix that. I thought they just limited connections when busy!
On the "Too many open files" issue, all the research I have done suggests this is an OS error report, not a standard FTP error report. When the FTP command executes, I presume it writes a temporary file? Could that be where the error comes from? I guess I will reboot periodically and see if that gets rid of the issue.
Thanks for looking at this.

Re: Too many open files

Posted: Thu 25 Feb 2021 8:42 pm
by mcrossley
Each FTP operation has to open the local file to read the contents.

Something else you could investigate is increasing the number of files a process is allowed to open. How you do this, and how you check how many files the Cumulus process currently has open, varies a bit by Linux distribution, but a bit of Googling will help. My copy of Cumulus seems to run with the number of files open concurrently in the low 20s, with a soft limit of 1024. But I do not use FTP.
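For anyone wanting to check this on their own system, a minimal sketch (it uses the shell's own PID, $$, as a stand-in; substitute the Mono/CumulusMX PID from `ps` for the real check):

```shell
pid=$$                                # substitute the CumulusMX/Mono PID here
fd_count=$(ls /proc/$pid/fd | wc -l)  # each entry under fd/ is one open descriptor
echo "process $pid has $fd_count FDs open"
# The limits the target process actually runs under are in /proc/<pid>/limits
grep "open files" /proc/$pid/limits
```

Note that `ulimit -Sn` only shows the current shell's limit; `/proc/<pid>/limits` shows what the target process really inherited.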

Re: Too many open files

Posted: Fri 26 Feb 2021 5:47 pm
by stewartwlewis
Thanks Mark; after lots of Googling, this is where I got to.
I looked at the open files for the CumulusMX process and there were none! After a bit of head scratching I realised CumulusMX runs under Mono, and Mono does all the interaction with the OS. Checking open files under Mono, there are currently 185: some are clearly for CumulusMX, some for OS interaction, but the vast majority are for the TCP protocol, and that cannot be right. Watching through today, the number of open files just increases. They come in blocks, but looking at the FTP and runtime logs there is nothing odd at the same time as these sockets/files are left open. Wherever they come from, rebooting sets the number back to zero.
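To see whether the leak is ongoing, the FD count can be sampled over time; a sketch, again with $$ standing in as a placeholder for the Mono PID (1008 in my case):

```shell
pid=$$   # substitute the Mono PID here
# Take three samples of the FD count, one per second; a steadily rising
# count points to a leak rather than a brief burst of activity
samples=$(for i in 1 2 3; do ls /proc/$pid/fd | wc -l; sleep 1; done)
echo "$samples"
# Count how many of the open FDs are sockets - leaked TCP connections show up here
sock_count=$(ls -l /proc/$pid/fd | grep -c socket)
echo "sockets: $sock_count"
```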
I can't find anything for Mono that suggests an explanation, so unless anybody has a bright idea, or I have missed something obvious (very likely), I guess it's a case of re-installing Mono, then if that doesn't work, a full OS rebuild - or just chicken out and reboot every night!

I have attached a list of the open "files" (names and dates) and the runtime and FTP logs. I haven't fixed AWEKAS yet!

________
In case anybody has the same problem, this is what I did so far. Open files are called file descriptors, or FDs.
Get the Cumulus process numbers with:
$ ps aux | grep CumulusMX
which gave me:

root 1008 24.5 11.9 212436 113508 ? Sl Feb25 334:29 /usr/bin/mono /usr/lib/mono/4.5/mono-service.exe -d:/home/pi/CumulusMX CumulusMX.exe -service
pi 3839 0.0 0.0 7480 516 pts/0 S+ 11:17 0:00 grep --color=auto CumulusMX

Two PIDs: the first (1008) is Mono running CumulusMX; the second (3839) is just the grep command itself, which is why there were no files listed under it for me.

The Mono service is running as root, and I found it easier to change to root to check what was going on, so:
$ sudo su

You can display the open files using lsof and the PID from above:
# lsof -a -p 1008
This works, but you need to install lsof using apt if you are using the Lite version of the OS.
Also, this gives you detail including size, but not the date opened. To get that information I had to change to the relevant directory using:
# cd /proc/1008/fd

then

# ls -l | less

to list the files, and

# ls -l | wc -l

to count them.

To check the hard and soft limits on how many files your session can have open, use:
# ulimit -Hn
# ulimit -Sn
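If the soft limit turns out to be the problem, it can be raised. Since CumulusMX runs as a systemd service here, changes via `ulimit` or /etc/security/limits.conf won't reach it; the limit has to go in the unit. A sketch, assuming the service is named cumulusmx.service (adjust to whatever your unit is actually called):

```ini
# /etc/systemd/system/cumulusmx.service.d/override.conf
# (created with: sudo systemctl edit cumulusmx)
[Service]
LimitNOFILE=4096
```

Then `sudo systemctl daemon-reload` and restart the service for it to take effect. This only buys time if FDs are genuinely leaking, but it tells you whether the limit is what's being hit.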


To check the number of file handles currently open system-wide:
# cat /proc/sys/fs/file-nr | awk '{print $1}'

To check the maximum number of open files the kernel allows:
# cat /proc/sys/fs/file-max
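Putting those last two together gives a quick headroom check (a sketch; these paths are standard on Linux):

```shell
# Compare system-wide open file handles against the kernel maximum
open_now=$(awk '{print $1}' /proc/sys/fs/file-nr)
max=$(cat /proc/sys/fs/file-max)
echo "$open_now of $max file handles in use"
```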

I found the relationship between the Unix per-process soft/hard limits and the system-wide numbers way beyond me, so I stopped here!