mcrossley wrote: ↑Mon 09 Mar 2020 9:16 am
There is a difference because the sample size for each month is slightly different - they have different numbers of days each, therefore each month should be weighted if it is to exactly match the sum(all days)/count(all days) calculation - again assuming samples for all days.

I understand what you are saying, but I disagree and think it is not true (as I have shown, the final average for the period is the same by whichever calculation).
It is somewhat confusing, but weighting is not required because we are not sampling the month: we are using all days and all months, with the value of each day. Only when a month or year is not yet finished do we get discrepancies. Weighting is required when we sample a population in ratios that differ from those of the full population.
The weight of the month is irrelevant because we are dealing with a daily estimator of the average temperature. Since we have no continuous temperature measurement with a corresponding integration method to arrive at a true mean, we use discrete measurements and take their statistical mean as the estimator for the day, which then becomes a population value for the month. The mean (average) of the month is calculated over a full population of numbers (the day estimates), regardless of whether the month has 28 or 31 days. I do not see a role for the weight of a month, or even how to use a weight in subsequent calculations on a full population. As I have shown, the average of two months is mathematically identical to the average of all days included. Again, we are not sampling here.
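For anyone who wants to check the arithmetic themselves, here is a minimal Python sketch. The daily values are invented purely for illustration (they are not station data), and the function names are hypothetical and have nothing to do with Cumulus itself. It computes the average of a 31-day and a 28-day month in three ways: over all days pooled together, as the plain mean of the two monthly means, and as the mean of the monthly means weighted by the number of days.

```python
from statistics import fmean

# Invented daily mean temperatures (deg C), NOT real station data:
# a 31-day month followed by a 28-day month.
january = [2.0 + 0.1 * d for d in range(31)]
february = [5.0 + 0.2 * d for d in range(28)]
months = [january, february]


def pooled_mean(months):
    """Sum of all daily values divided by the count of all days."""
    all_days = [t for month in months for t in month]
    return sum(all_days) / len(all_days)


def mean_of_monthly_means(months):
    """Plain (unweighted) mean of the monthly means."""
    return fmean([fmean(month) for month in months])


def weighted_mean_of_monthly_means(months):
    """Mean of the monthly means, each weighted by its number of days."""
    total_days = sum(len(month) for month in months)
    return sum(fmean(month) * len(month) for month in months) / total_days


print("pooled over all days      :", round(pooled_mean(months), 4))
print("plain mean of month means :", round(mean_of_monthly_means(months), 4))
print("day-weighted month means  :", round(weighted_mean_of_monthly_means(months), 4))
```

The pooled and the day-weighted figures agree by construction (up to floating-point rounding), since weighting a monthly mean by its day count just gives back that month's sum. Swapping in your own daily values shows directly how the plain mean of monthly means compares for a given period.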
mcrossley wrote: ↑Mon 09 Mar 2020 9:16 am
With Cumulus we cannot assume that we have a full data set - indeed for the current year we never will, so I think the only sensible approach is to average by day rather than month. Best we can do, and there is so much annual variation I don't think it matters too much in the scheme of things anyway, maybe once we have been running Cumulus for 200 years...!

Agreed. We have an incremental dataset and we calculate up to yesterday, which is a known population of days. So in the end (at the end of the year) everything is OK.
(But what if I am on holiday and my station fails, and I come back only after three weeks: are those 20 days taken into the calculation? Never mind.)