Monthly detail log file
Posted: Mon 23 Aug 2010 1:27 am
by serowe
I know this is in the C1 forum but it relates to this version and possibly C2.
Can consideration be given to breaking up the detailed monthly log file from the way it is currently written to disk? For instance, this weekend the size of the file exceeded 2 MB and, at that size, it causes problems if you are importing it in any way into a SQL database.
Whether it is broken up by size or on a daily basis (i.e. each day gets its own detailed log) - the continued increase in size is going to present a few more problems before the end of the month.
At this stage I have had to break the file around Aug 18th to ensure the second part doesn't go over the MySQL limit of 2 MB.
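[Editor's note: a manual split like the one described can be automated with a few lines of PHP. This is only a sketch, assuming the date is the first comma-separated field of each log line in dd/mm/yy format; the file names are hypothetical.]
Code: Select all
```php
<?php
// Sketch: split a Cumulus monthly log into two parts around a cut-off day,
// so each part stays under the import size limit. Assumes the date is the
// first field in dd/mm/yy format; file names here are examples only.
$cutOffDay = 18;
$in   = fopen('Aug10log.txt', 'r');
$out1 = fopen('Aug10log_part1.txt', 'w');
$out2 = fopen('Aug10log_part2.txt', 'w');
while (($line = fgets($in)) !== false) {
    $day = (int) substr($line, 0, 2);   // "dd" from dd/mm/yy
    fwrite($day <= $cutOffDay ? $out1 : $out2, $line);
}
fclose($in);
fclose($out1);
fclose($out2);
?>
```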
Re: Monthly detail log file
Posted: Mon 23 Aug 2010 9:12 am
by steve
No chance at all, I'm afraid. Changing the way the data is stored would mean a rewrite of large areas of Cumulus. 2MB is tiny. What problem does going over that size cause you with MySQL?
Re: Monthly detail log file
Posted: Mon 23 Aug 2010 10:45 am
by serowe
MySQL fails with an internal error (500) when trying to import files that are too large. I'm trying to look at a way of automating splitting the file, but the default installation of MySQL started barfing on Sunday's file. I can go and modify my installation of MySQL (because it is hosted on my own server), but anyone who tries running it on a hosted server may have problems.
Re: Monthly detail log file
Posted: Mon 23 Aug 2010 10:56 am
by mcrossley
You could always write a script (local or PHP) to parse the file and insert the data into the appropriate table. No file size limits then? That is how I import into MySQL - but my file size is still small.
Re: Monthly detail log file
Posted: Mon 23 Aug 2010 10:59 am
by steve
This sounds to me like a problem with your MySQL import code rather than anything to do with Cumulus. How are you actually doing the import? You can't just be feeding a Cumulus log file in, there has to be something generating SQL statements. Presumably that "something" is trying to do a whole file at once? The 2MB limit you mention is presumably the maximum size of an SQL command, rather than anything to do with the size of the Cumulus files?
Edit: Or does MySQL itself actually have an import facility where it parses a text file and creates table entries from that?
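[Editor's note: MySQL does have such a facility - LOAD DATA INFILE parses a delimited text file and inserts the rows directly, without building per-row SQL statements in a script. A sketch of invoking it from PHP, with hypothetical database, table and file names; the server must be configured to allow LOCAL infile.]
Code: Select all
```php
<?php
// Sketch: let MySQL parse the log file itself (names are hypothetical).
// LOCAL makes the client read the file; the server must permit this.
$link = mysql_connect('localhost', 'user', 'password');
mysql_select_db('weather', $link);
$sql = "LOAD DATA LOCAL INFILE 'Aug10log.txt'
        INTO TABLE monthly_log
        FIELDS TERMINATED BY ','
        LINES TERMINATED BY '\\n'";
if (!mysql_query($sql, $link)) {
    echo 'LOAD DATA failed: ' . mysql_error($link);
}
mysql_close($link);
?>
```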
Re: Monthly detail log file
Posted: Mon 23 Aug 2010 11:07 am
by serowe
mcrossley suggests the approach I was thinking of taking.
The import script is the one daj posted here a while ago - it has been working every day until today (I was away over the weekend so the last day I actually imported was Friday).
I thought maybe there was a bug in the script, but when I split the file (1st to 18th in part 1 and 19th to current in part 2) both parts import without any problem, so based on that I believe I can reasonably safely discount the script itself. (A note on that script: this is the one I re-wrote to include the FreeWx data files, and I had a similar problem with some of them - they routinely reached 1.8 to 2 MB, but the ones over 2.1 MB all failed and I had to split them to import. Because they only had to be imported once - unlike the Cumulus logs - I was OK with the work of splitting them.)
The script reads the file and places each line record into an array element and this is then processed as a REPLACE statement.
Re: Monthly detail log file
Posted: Mon 23 Aug 2010 11:20 am
by steve
serowe wrote:The script reads the file and places each line record into an array element and this is then processed as a REPLACE statement.
Right - so it is the limit on the size of an SQL statement. It seems to me that it wouldn't be too hard to modify the script to break the statements up to keep the size down.
Having looked at the script, unless I am misunderstanding, this script is already the sort of thing that Mark is suggesting. It just needs modifying to cope with larger files.
Re: Monthly detail log file
Posted: Mon 23 Aug 2010 11:27 am
by serowe
Well, yes - and no. The SQL statement is processed record by record as the script reads through the array, so the statement is built up as a variable thus:
Code: Select all
$StartOfReplaceSQL="REPLACE INTO " . $table_name . " (LogDate,HighWindGust,HWindGBear,THWindG,MinTemp,TMinTemp,MaxTemp,TMaxTemp,MinPress,TMinPress,MaxPress,TMaxPress,MaxRainRate,TMaxRR,TotRainFall,AvgTemp,TotWindRun,HighAvgWSpeed,THAvgWSpeed,MaxHumid,TMaxHumid,MinHumid,TMinHumid,MaxDewPoint,TMaxDewPoint,MinDewPoint,TMinDewPoint,MaxIndoor,TMaxIndoor,MinIndoor,TMinIndoor,MinWindChill,TMinWindChill,MaxHeatIndex,TMaxHeatIndex)";
This bit is no problem
The script then replaces the values into each array element in a WHILE loop as it goes from record to record, and then each record is replaced with:
Code: Select all
// $insert holds the fully-built REPLACE statement for the current record
if (! mysql_query($insert))
{
echo "<br />failed to insert data -- " . $insert . "<br />";
}
Then the code loops back, goes to the next array element, replaces it with the actual data in the SQL statement, stores it, loops, replaces, stores and so on. So from the SQL side the statements are actually quite small, but the file it is READING the data from is getting larger and larger every day.
I can turn off the mysql_query($insert) statements, replace them with echoes of the data it would be inserting, and the script runs to completion without any errors; it is only when I try to process the 2+ MB file into the table that it fails - but, again, only if the file is processed as a whole. Split, no problems; single/complete file and it fails.
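[Editor's note: one variation worth trying is to read the log a line at a time with fgets() rather than loading the whole file into an array first, so memory use stays flat as the file grows. A sketch reusing $StartOfReplaceSQL from the script above; the field handling is abbreviated and the file name is hypothetical, and whether memory is actually the culprit here is not established.]
Code: Select all
```php
<?php
// Sketch: process the log one line at a time instead of building an
// array of the whole file first. Each record still gets its own small
// REPLACE statement, as in the original script.
$fh = fopen('Aug10log.txt', 'r');
while (($line = fgets($fh)) !== false) {
    $fields = explode(',', rtrim($line));
    $insert = $StartOfReplaceSQL . " VALUES ('"
            . implode("','", array_map('mysql_real_escape_string', $fields))
            . "')";
    if (!mysql_query($insert)) {
        echo '<br />failed to insert data -- ' . $insert . '<br />';
    }
}
fclose($fh);
?>
```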
Re: Monthly detail log file
Posted: Mon 23 Aug 2010 11:52 am
by steve
Right, I understand - so it is doing a separate REPLACE for each entry in the file. But after a certain number of REPLACE commands have been sent, it falls over. I wonder if doing mysql_unbuffered_query() instead of mysql_query() would help. But I know very little about MySQL so I'm just guessing.
Re: Monthly detail log file
Posted: Mon 23 Aug 2010 12:05 pm
by serowe
Have to be honest, I am still trying to get my head around SQL (give me DOS programs any day!) And yes - something like this is another concept I have read about and will have to look at.
Back to the original question though - what will C2 be doing in terms of log files? I know you have mentioned using SQL, but will the data be readily accessible inside a SQL database (that is, older, historical data) or will it be written to external log files like the daily and monthly files currently are?
Re: Monthly detail log file
Posted: Mon 23 Aug 2010 12:07 pm
by mcrossley
I've not had a look in detail, but is it a commit size limit you are hitting? Does the script commit all the inserts at the end, or are the inserts committed as it goes along?
Re: Monthly detail log file
Posted: Mon 23 Aug 2010 12:11 pm
by steve
The data in C2 will all be in an SQLite database - if I can only get the structure right.
But I've realised that people are doing things external to Cumulus with dayfile.txt (for example) for various web sites, so I guess I am going to have to provide an option to generate Cumulus 1 format text files as well.
Re: Monthly detail log file
Posted: Mon 23 Aug 2010 12:13 pm
by steve
mcrossley wrote:I've not had a look in detail, but is it a commit size limit you are hitting? Does the script commit all the inserts at the end, or are the inserts committed as it goes along?
That was the sort of thing I was trying to get at. It doesn't do an explicit "commit"; it does lots of REPLACEs in mysql_query() and then just does a mysql_close() at the end - presumably that is the point where the commit gets done?
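[Editor's note: if uncommitted work were the issue, it would only apply to transactional tables such as InnoDB; MyISAM tables have nothing to commit. An explicit periodic commit is easy to add - a sketch, where $records and $insert stand in for the loop in the script quoted above.]
Code: Select all
```php
<?php
// Sketch: commit explicitly every N records rather than relying on the
// connection close to flush everything. Only relevant for transactional
// tables such as InnoDB; $records/$insert are placeholders for the loop
// in the import script.
mysql_query('SET autocommit = 0');
$count = 0;
foreach ($records as $insert) {
    mysql_query($insert);
    if (++$count % 500 == 0) {
        mysql_query('COMMIT');   // flush a batch
    }
}
mysql_query('COMMIT');           // flush the remainder
?>
```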
Re: Monthly detail log file
Posted: Mon 23 Aug 2010 12:23 pm
by serowe
SQL structure - may I suggest that you get YOUR structure set up the way YOU want it, and let those of us who are already posting data to a SQL table handle the conversion? The reason I say this is that, if for example you take the SQL tables daj is using, even he has differing field names in use. For example, his historic program has some fields prefixed with 'uk', but the fields he provided in ImportCumulusFiles have *similar* field names WITHOUT the 'uk' prefixes.
It would be a lot simpler for us to write a one-off conversion/copy program than for you to worry about trying to please everyone (which I think I can guarantee you wouldn't be able to do).
The only thing I would suggest is that you make provision for almost EVERYTHING you possibly can, and then some - e.g. humidity values, which are not included in the files currently being generated but which are obviously included in data from other programs and have been asked for. That way they can be easily imported without losing old data, but don't compromise what you are doing/saving.
After all - you (ie Cumulus) only have to create the SQL tables once.
Re the 'COMMIT' - I will admit I hadn't even considered this being an issue. My old programming (as I said, primarily DOS with very heavy use of Clipper databases) ALWAYS provided a COMMIT because of networking, OS issues and to ensure data integrity, so I always used to do this without giving it a second thought. I'll have a look into this tomorrow and see if it is the cause of these 500 failures. It's about 2230 here, so I'm not going to start tonight (I need to veg out and watch TV for an hour or so now to 'relax' the mind.)
Re: Monthly detail log file
Posted: Mon 23 Aug 2010 12:43 pm
by mcrossley
Another possibility is a script execution time limit on your host?