Proposal for Air Quality storage schema

HansR
Posts: 5870
Joined: Sat 20 Oct 2012 6:53 am
Weather Station: GW1100 (WS80/WH40)
Operating System: Raspberry OS/Bullseye
Location: Wagenborgen (NL)
Post by HansR »

mcrossley wrote: Mon 08 Feb 2021 11:49 am
HansR wrote: Mon 08 Feb 2021 8:40 am And beyond that: nobody reacted to my remark above:
HansR wrote: Tue 02 Feb 2021 10:28 am A device-independent Air Quality data model is needed before adding more devices of that type.
Do I have to interpret this lack of reaction as 'the data model will just be expanded with every AQ device'? That seems weird, as the meteo stations do share tables and fields. Multiple meteo stations require multiple instances of CMX. Should multiple AQ stations not require the same?
It does! But implementing this will probably mean dropping device-specific data and, like the weather data, just storing common attributes. If anybody wants to kick off a discussion on the attributes to store, then that could be beneficial.
If we can do this in this thread (otherwise split it off into a new one), I'll kick it off (constructive comments only ;) ):
  1. I agree that device-specific data is not required (unless anybody can convince me otherwise). Like temperature, a PM or CO2 concentration (or NOx, or whatever chemical you wish to measure) is just a measurement.
  2. Differences between devices (e.g. a GW1000 has PM and CO2, a PurpleAir has only PM) are only reflected in the validity of the data and in whether they are displayed.
  3. The number of AQ devices must be limited (e.g. no more than two or three).
  4. The storage interval must be limited, as for the meteo data (1, 5, 10, 20, 30 minutes). Storing a value for every measurement every minute is pretty extreme and hardly necessary for historical data. So the monthly AQ data can be reduced. Otherwise, consider compressing the data once the month is complete (and decompressing it when required).
  5. Derivatives like nowcast, averages, etc. can be calculated or supplied by the device. That should be handled by the device driver.
  6. It needs to be decided which derivatives will be carried by CMX (just like all the (weird) temperature derivatives). As CMX is useful anywhere, the data model should not be the limit; the limit should be the user's choice of what to display (on a page or in a graph).
  7. If history is not supported: don't bother, a reboot of the sensor just restarts the measurements and averages.
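
The storage interval in point 4 could be sketched roughly as follows. This is only an illustration, not how CMX does it: the function name `downsample` is hypothetical, and using a plain mean per interval is an assumption (the post does not say how minute values should be reduced). It only handles intervals that divide an hour evenly, such as the 1, 5, 10, 20 and 30 minutes mentioned above.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def downsample(readings, interval_minutes):
    """Average (timestamp, value) readings into fixed-width time buckets.

    Hypothetical helper: assumes interval_minutes divides 60 evenly.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        # Floor the timestamp to the start of its interval.
        floored = ts.replace(minute=ts.minute - ts.minute % interval_minutes,
                             second=0, microsecond=0)
        buckets[floored].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Ten one-minute PM2.5 readings (made-up values) reduced to two 5-minute rows.
minute_pm25 = [(datetime(2021, 2, 8, 12, m), float(m)) for m in range(10)]
five_minute_pm25 = downsample(minute_pm25, 5)
```

With a 5-minute interval this stores one fifth of the rows that minute-by-minute logging would.
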
So in summary I would make it as generic as possible:
  1. Use one AirQuality table for PM (for up to n devices), with device support either in the row length or as a record identifier.
  2. Use one AirQuality table for chemicals (for up to n devices), typically four: CO2, NO2, NO3 and a fourth (SO2, organics, etc.). Device support as above.
  3. Device support does not mean that a distinction is made between devices, only that you know which device measured what. Especially when using SQL, it is very easy to make sub-selections like this.
  4. Store the current (minute) PM values, the hourly, 3 hr and 24 hr averages and the nowcast values at the same frequency as the meteo values.
  5. Store gas concentrations at the same frequency as the meteo values.
  6. Make these values accessible for graphing (selection by the user) just like the meteo values (this could be extended).
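
To make the summary a bit more concrete, here is a minimal sketch of the "device as a record identifier" variant using SQLite from Python. All table and column names (`aq_pm`, `aq_chem`, `device_id`, `pm25_nowcast`, ...) are assumptions made up for this example, not an existing CMX schema, and only a couple of the chemical columns are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE aq_pm (
    logged_at    TEXT    NOT NULL,  -- timestamp at the logging interval
    device_id    INTEGER NOT NULL,  -- record identifier: which device measured this
    pm25         REAL,              -- NULL when the device has no such value
    pm25_1h      REAL,
    pm25_3h      REAL,
    pm25_24h     REAL,
    pm25_nowcast REAL,
    PRIMARY KEY (logged_at, device_id)
);
CREATE TABLE aq_chem (
    logged_at TEXT    NOT NULL,
    device_id INTEGER NOT NULL,
    co2       REAL,
    no2       REAL,
    PRIMARY KEY (logged_at, device_id)
);
""")

# Made-up sample rows: device 1 has PM and CO2, device 2 has PM only.
conn.executemany(
    "INSERT INTO aq_pm (logged_at, device_id, pm25) VALUES (?, ?, ?)",
    [("2021-02-08 12:00", 1, 8.5),
     ("2021-02-08 12:00", 2, 9.1),
     ("2021-02-08 12:05", 1, 8.9)],
)
conn.execute(
    "INSERT INTO aq_chem (logged_at, device_id, co2) VALUES (?, ?, ?)",
    ("2021-02-08 12:00", 1, 415.0),
)

# Sub-selection per device, as in point 3.
device_1_pm = conn.execute(
    "SELECT logged_at, pm25 FROM aq_pm WHERE device_id = ? ORDER BY logged_at",
    (1,),
).fetchall()
device_2_chem_rows = conn.execute(
    "SELECT COUNT(*) FROM aq_chem WHERE device_id = ?", (2,)
).fetchone()[0]
```

A device without a given sensor simply has no rows (or NULL columns) for it, which matches the idea that device differences show up only in the validity of the data.
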
Hans

https://meteo-wagenborgen.nl
CMX build 4017+ ● RPi 3B+ ● Raspbian Linux 6.1.21-v7+ armv7l ● dotnet 8.0.3
jon_iz
Posts: 86
Joined: Sat 02 Jan 2016 10:10 pm
Weather Station: Davis VP2+, WLL & Airlink
Operating System: Win 10 64bit / RPi Buster
Location: Nantwich, UK
Post by jon_iz »

Should there also be an entry for AQI in addition to the PM values, or is this calculated on the fly?
Post by HansR »

Personally, I don't think AQI values are required in the database. AQI can be calculated on the fly when asked for (just like all derivatives, btw), so it is a matter of taste and choice (and of speed/response time). As they are already in the database, maybe just leave them.

On the other hand: if there are errors in the calculations, it is very difficult to recover from them once they are in the database, and they will probably remain there forever (unless some kind of reconciliation function gets implemented to correct database errors). And as AQI is very much a national thing, and you may even want to change the AQI to another definition (what if Brexit gets annulled and you rejoin the EU? ;) ), I would advise doing it on the fly. It saves disk space as well.
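
As a rough sketch of what "on the fly with a swappable definition" could look like: the index is interpolated linearly from a breakpoint table that is passed in as a parameter, so changing the national definition means changing a table, not the stored data. The function name is hypothetical. The first table follows the US EPA PM2.5 breakpoints as they stood around the time of this thread (they have been revised since) and should be checked against the official definition before use. The second table is entirely made up. Rounding the concentration to one decimal is a simplification.

```python
# (conc_low, conc_high, index_low, index_high), concentrations in ug/m3.
# Assumption: US EPA PM2.5 table as of 2012; verify before relying on it.
US_EPA_PM25_2012 = [
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 350.4, 301, 400),
    (350.5, 500.4, 401, 500),
]

# Entirely made-up scale, only to show that the definition can be swapped.
MADE_UP_SCALE = [(0.0, 500.0, 0, 10)]


def index_from_concentration(conc, breakpoints):
    """Linearly interpolate an index value; None if conc is outside the table."""
    conc = round(conc, 1)  # so a value cannot fall between two rows
    for c_lo, c_hi, i_lo, i_hi in breakpoints:
        if c_lo <= conc <= c_hi:
            return round((i_hi - i_lo) / (c_hi - c_lo) * (conc - c_lo) + i_lo)
    return None


us_index = index_from_concentration(35.5, US_EPA_PM25_2012)
other_index = index_from_concentration(250.0, MADE_UP_SCALE)
```

Only the concentrations need to be stored. A bug in the table or the formula is then fixed in code, with nothing to repair in the database.
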
mcrossley
Posts: 12689
Joined: Thu 07 Jan 2010 9:44 pm
Weather Station: Davis VP2/WLL
Operating System: Bullseye Lite rPi
Location: Wilmslow, Cheshire, UK
Post by mcrossley »

The problem with on-the-fly calculations is that when reading historic data, for graphs say, you then have to calculate the AQI maybe hundreds of thousands of times, every time you read the data. I'm looking to the future and assuming a decade or two of data.
Post by HansR »

mcrossley wrote: Wed 10 Feb 2021 2:00 pm The problem with on-the-fly calculations is that when reading historic data, for graphs say, you then have to calculate the AQI maybe hundreds of thousands of times, every time you read the data. I'm looking to the future and assuming a decade or two of data.
Agreed, that would not be a great feature. It is a bit like a chess program searching all positions: not a brilliant method, because it involves a lot of unnecessary calculations.

So, well, don't select such a long period. Seriously: who is interested in a graph of decades of compressed AQ concentrations and AQI data? If you offer the concentrations and the user zooms in on a high concentration, you might offer AQI when the period is short enough. So calculation on demand, I would say, whatever technique is used.
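
A minimal sketch of that "calculation on demand" idea: concentrations are always returned, and the index is only computed when the selected period is short enough. The names `series_for_graph`, `MAX_INDEX_SPAN` and `placeholder_index` are hypothetical, the 31-day cut-off is an arbitrary assumption, and the index function is a stand-in rather than a real AQI formula.

```python
from datetime import datetime, timedelta

MAX_INDEX_SPAN = timedelta(days=31)  # hypothetical cut-off; tune for responsiveness


def series_for_graph(rows, start, end, index_fn):
    """Return (timestamp, concentration, index) for rows within [start, end].

    The index is computed only for short periods; for longer ones it is None,
    so a decades-long graph never triggers the per-row calculation.
    """
    with_index = (end - start) <= MAX_INDEX_SPAN
    return [
        (ts, conc, index_fn(conc) if with_index else None)
        for ts, conc in rows
        if start <= ts <= end
    ]


def placeholder_index(conc):
    """Stand-in for a real index calculation (not an actual AQI formula)."""
    return round(conc * 4)


# Made-up daily concentrations for 60 days.
rows = [(datetime(2021, 2, 1) + timedelta(days=d), 10.0 + d) for d in range(60)]
zoomed_in = series_for_graph(rows, datetime(2021, 2, 1), datetime(2021, 2, 3),
                             placeholder_index)
zoomed_out = series_for_graph(rows, datetime(2021, 2, 1), datetime(2021, 3, 31),
                              placeholder_index)
```

Zoomed in on three days, each row carries an index. Zoomed out to two months, only the concentrations are filled in.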

Yes, it requires a different way of thinking, but the series are getting longer and longer (for meteo they already easily surpass 12 years, and I know of one user who has more than 20). If you do that with AQ data too, you have to get more intelligent than just displaying everything. Even fetching 20 years of data for a graph of all concentrations and AQI data would be quite a task.

So when looking to the future, assuming the series gathered will be passed on to the next generations of weather enthusiasts, we have to become more intelligent anyway.