Welcome to the Cumulus Support forum.

Latest Cumulus MX V3 release 3.28.6 (build 3283) - 21 March 2024

Cumulus MX V4 beta test release 4.0.0 (build 4017) - 17 March 2024

Legacy Cumulus 1 release v1.9.4 (build 1099) - 28 November 2014 (a patch is available for 1.9.4 build 1099 that extends the date range of drop-down menus to 2030)

Download the Software (Cumulus MX / Cumulus 1 and other related items) from the Wiki

[Implemented] ExportSQL observations

A Forum to archive Cumulus MX development suggestions that have been rejected or solved by other means.
Post Reply
User avatar
HansR
Posts: 5870
Joined: Sat 20 Oct 2012 6:53 am
Weather Station: GW1100 (WS80/WH40)
Operating System: Raspberry OS/Bullseye
Location: Wagenborgen (NL)
Contact:

[Implemented] ExportSQL observations

Post by HansR »

I am experimenting a bit with the SQL tables and tools and techniques specifically for performance and fetching. Until now I have the following observations:
  • Correction: naming of the monthfiles.
    The argument Monthly of ExportSQL does not automatically take all monthly files to import (as it should) because the name is created by a date formatting function and the language is apparently default set to GB (my machine is NL). As a result the names don't match and it stops at/after the first file. Workaround: batch file with the command for each file.
  • Research: acceleration.
    The whole process is very slow. It takes more than 2 hours to import a roughly 150.000 records (20/sec). If you have to rebuild some time with larger series than I have at the moment (I have 20 months) this seems not practical. Could this be accelerated? I remember from my Oracle days records could be dumped in blocks and other optimising actions, a bit like here; Or use even a Bulk insert straight from file (I have never done that btw). Any other ideas? Here are two more links [1] & [2].
    Currently I am running for 3 hours and not even halfway my 20 files or so. NOTE: This is a RPi 3B+ with MariaDB on localhost.
  • Uncertain:
    I notice that the wind direction is modified into the compass texts (e.g. WNW). With that you loose the degrees with which it is originally logged. That means the data is modified and I have to calculate it back - e.g. creating the windrose - which can't be done. The direction in degrees is still required I think. NOTE: I checked and I think the degrees value is still in the table so the text is an addition. Is this correct?
Anyway.... Maybe I add more notes while experience progresses.
Hans

https://meteo-wagenborgen.nl
CMX build 4017+ ● RPi 3B+ ● Raspbian Linux 6.1.21-v7+ armv7l ● dotnet 8.0.3
Mapantz
Posts: 1774
Joined: Sat 17 Dec 2011 11:55 am
Weather Station: Davis Vantage Pro2
Operating System: Windows 11 x64
Location: Dorset - UK
Contact:

Re: ExportSQL observations

Post by Mapantz »

When you say 150,000 records, do you means rows?

I don't know much about Pi's but I would have thought it would generally be slow anyway - SD card read and write speeds are pants.
Image
User avatar
HansR
Posts: 5870
Joined: Sat 20 Oct 2012 6:53 am
Weather Station: GW1100 (WS80/WH40)
Operating System: Raspberry OS/Bullseye
Location: Wagenborgen (NL)
Contact:

Re: ExportSQL observations

Post by HansR »

Mapantz wrote: Wed 10 Feb 2021 8:33 pm When you say 150,000 records, do you means rows?
Yes
Mapantz wrote: Wed 10 Feb 2021 8:33 pm I don't know much about Pi's but I would have thought it would generally be slow anyway - SD card read and write speeds are pants
That is a very wrong attitude, it simply is non optimized insertion. Read the links i gave, it can be a lot faster no matter how slow the device.
Hans

https://meteo-wagenborgen.nl
CMX build 4017+ ● RPi 3B+ ● Raspbian Linux 6.1.21-v7+ armv7l ● dotnet 8.0.3
Mapantz
Posts: 1774
Joined: Sat 17 Dec 2011 11:55 am
Weather Station: Davis Vantage Pro2
Operating System: Windows 11 x64
Location: Dorset - UK
Contact:

Re: ExportSQL observations

Post by Mapantz »

HansR wrote: Wed 10 Feb 2021 8:50 pm That is a very wrong attitude, it simply is non optimized insertion. Read the links i gave, it can be a lot faster no matter how slow the device.
I don't agree with that. The links you provided mentioned nothing about SD cards.

Assuming that's what you're using? Then there's going to be limitations.

If you are using an SD card, I think that's unwise for a database to be stored on, period.
Image
User avatar
HansR
Posts: 5870
Joined: Sat 20 Oct 2012 6:53 am
Weather Station: GW1100 (WS80/WH40)
Operating System: Raspberry OS/Bullseye
Location: Wagenborgen (NL)
Contact:

Re: ExportSQL observations

Post by HansR »

I am not saying sd cards are ideal storage here, i am saying insertion can and should be more efficient.
It's not a discussion about sd cards, it's about amount of data and how to handle that.
I have 45.000 records per month - and I won't be the only one - that should be enough to think about the subject.

Speed is relative, irrelevant to the device.
Hans

https://meteo-wagenborgen.nl
CMX build 4017+ ● RPi 3B+ ● Raspbian Linux 6.1.21-v7+ armv7l ● dotnet 8.0.3
User avatar
HansR
Posts: 5870
Joined: Sat 20 Oct 2012 6:53 am
Weather Station: GW1100 (WS80/WH40)
Operating System: Raspberry OS/Bullseye
Location: Wagenborgen (NL)
Contact:

Re: ExportSQL observations

Post by HansR »

OK, just continuing this thread with speed observations and timing.

I have a non optimised RPi 3B+ with a 16Gb SD card
I have a write optimised Synology NAS 214+ with 2x3TB drives, configured in one volume with SHR.

Both systems run MariaDB 10.

The RPi fills the database with ExportSQL.exe in 43 minutes (+/- 2 minutes) per monthly logfile
The NAS fills the database with ExportSQL.exe in 10 minutes (+/- 1.5 minutes) per monthly logfile

So the NAS is only 4.3 times faster than the RPi localhost. That is not as large a difference as I thought.
And my guess is that it can be more accelerated but that would be with changing insertion techniques and the acceleration would be the same roughly on both systems.
I'll keep you posted.
Hans

https://meteo-wagenborgen.nl
CMX build 4017+ ● RPi 3B+ ● Raspbian Linux 6.1.21-v7+ armv7l ● dotnet 8.0.3
User avatar
HansR
Posts: 5870
Joined: Sat 20 Oct 2012 6:53 am
Weather Station: GW1100 (WS80/WH40)
Operating System: Raspberry OS/Bullseye
Location: Wagenborgen (NL)
Contact:

Re: ExportSQL observations

Post by HansR »

OK, again continuing this thread with speed observations and timing.

I added the following experiments:
  1. Using a remote MySQL database at my provider;
  2. Modifying ExportMySQL.exe for multiple row import (strangely the name Export actually imports the logfile data into SQL ;) )
  3. Running timing tests to compare buffersizes on import speed for the external database
  4. Doing the timing exercise on the local SD-card MAriaDB database
Results:
  1. The remote database had database fills with ExportSQL in 16 minutes (+/- 2 minutes) per monthly logfile.
    (compare with the database fills in my previous post)
  2. I modified ExportMySQL to create the SQL queries for multiple record insert according to this article. This had only few modifications to the code. In a second stage I applied a modification to the StringBuilder to prevent buffer enlargement while creating the insert array, in general I minimised buffer manipulation to a minimum.
  3. Timing results (for the remote database, so compare with the above figure of 16 minutes) are as follows:
    10 records at a time: 2 minutes average per monthly logfile
    100 records at a time: 37 seconds average per monthly logfile
    1000 records at a time: 30 seconds average per monthly logfile
  4. So then it became interesting what the code change meant for an SD-card , therefore I turned back to the localhost system (compare with the 43 minutes in prev. post)
    100 records at a time: 5 minutes +/- 1 minute per monthly logfile
    1000 records at a time: 2 min 30 +/- 30 seconds per monthly logfile
I don't think the above needs additional comment and I think I proved my point.
Buffer size of a 100 records at a time seems OK although clearly on SD cards the larger arrays are another two times as fast. You may want to make the buffer size user modifiable through the command line.

Although filling the database is not a regular exercise, it needs to be done at least once and because the log files are the only data of CMX with guaranteed (?) integrity and completeness you might want to redo the database creation. Also if the database disconnects, power outage or other disasters, you may need to refill or add data from the logfile.

I'll make the modifications available.
Hans

https://meteo-wagenborgen.nl
CMX build 4017+ ● RPi 3B+ ● Raspbian Linux 6.1.21-v7+ armv7l ● dotnet 8.0.3
User avatar
mcrossley
Posts: 12689
Joined: Thu 07 Jan 2010 9:44 pm
Weather Station: Davis VP2/WLL
Operating System: Bullseye Lite rPi
Location: Wilmslow, Cheshire, UK
Contact:

Re: ExportSQL observations

Post by mcrossley »

Thanks for looking at this Hans. The name ExportSQL confused me too, I guess Steve meant it was exporting from MX to SQL.
User avatar
HansR
Posts: 5870
Joined: Sat 20 Oct 2012 6:53 am
Weather Station: GW1100 (WS80/WH40)
Operating System: Raspberry OS/Bullseye
Location: Wagenborgen (NL)
Contact:

Re: ExportSQL observations

Post by HansR »

mcrossley wrote: Tue 16 Feb 2021 12:20 pm Thanks for looking at this Hans. The name ExportSQL confused me too, I guess Steve meant it was exporting from MX to SQL.
You're welcome. And indeed, name giving is a matter of inclination and starting point ;)
Hans

https://meteo-wagenborgen.nl
CMX build 4017+ ● RPi 3B+ ● Raspbian Linux 6.1.21-v7+ armv7l ● dotnet 8.0.3
Post Reply