Confirm the field separator
Posted: Fri 05 Feb 2010 10:21 am
by gemini06720
steve wrote:Comma-separated means different things in different countries, sadly. The files are still called "CSV" files, even though the "C" is not strictly a comma. I guess I should say they are "list-separator separated".
Steve, are the files produced by Cumulus, such as the 'dayfile.txt' and the 'Feb10log.txt', comma delimited or can that separator change or be changed by the user's computer?
If I was to write a program and was going to output a file, I would probably use the comma as a separator as it is so common - and I would not expect that the separator be changed by anything, not even the operating software...
Though, I understand that the separator for the date can and will be changed...
I just needed either a confirmation or a correction to my understanding of how the Cumulus files are outputted...
Re: a PHP import script for DayFile and Monthly log files
Posted: Fri 05 Feb 2010 10:26 am
by daj
Re: Confirm the field separator
Posted: Fri 05 Feb 2010 11:14 am
by gemini06720
Thank you David ... I had missed that post...
Now I know not to use Delphi to produce real 'comma delimited lists'...
Re: Confirm the field separator
Posted: Fri 05 Feb 2010 11:38 am
by steve
gemini06720 wrote:Thank you David ... I had missed that post...
That was actually the post you quoted from.
gemini06720 wrote:Now I know not to use Delphi to produce real 'comma delimited lists'...
http://en.wikipedia.org/wiki/Comma-separated_values
"Fields are separated by commas (although in locales where the comma is used as a decimal point, the semicolon is used instead as a delimiter, inducing some drawbacks when CSV files are exchanged e.g. between France and USA)"
Re: Confirm the field separator
Posted: Fri 05 Feb 2010 1:12 pm
by gemini06720
Thank you both for the information.
I guess a new separator standard (other than the comma or semicolon) must be established so that CSV files can be easily exchanged between all countries of the world and not be influenced by what the operating system does or does not do...

Re: Confirm the field separator
Posted: Fri 05 Feb 2010 2:56 pm
by TNETWeather
You can easily use commas in data for a CSV file if they are enclosed.
Below is perfectly usable:
"2009-01-02","22:22","32,4","17,6", ....
or
'2009-01-02','22:22','32,4','17,6', ....
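As a rough illustration of the enclosed-field idea above, here is a minimal Python sketch (Python is my choice here, not something used in this thread). It relies only on the standard csv module; the sample lines are the ones shown above with the trailing "...." dropped, and the variable names are made up for the example.

```python
import csv
import io

# The double-quoted sample line from above (trailing "...." omitted).
line_dq = '"2009-01-02","22:22","32,4","17,6"\n'
# The single-quoted variant needs quotechar set explicitly.
line_sq = "'2009-01-02','22:22','32,4','17,6'\n"

# csv.reader treats a comma inside an enclosed field as data, not as a separator.
fields_dq = next(csv.reader(io.StringIO(line_dq)))
fields_sq = next(csv.reader(io.StringIO(line_sq), quotechar="'"))

print(fields_dq)
print(fields_sq)
```

Both calls should give the same four fields, with the decimal commas left intact inside the values.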
BTW I have for years used a complex delimiter for flat file databases...
field1:::field2:::field3 ..
I've never run into an issue where ::: was in any data element. Extracting the data is simple in most languages.
Perl:
($field1,$field2,$field3) = split(/:::/,$instring);
PHP:
list($field1,$field2,$field3) = preg_split('/:::/',$instring);
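For comparison, the same split can be sketched in Python. This is only an illustration under my own assumptions: the helper names join_fields and split_fields are hypothetical, and the check that refuses a field containing the delimiter is an extra safeguard I added, not something described in this post.

```python
DELIM = ":::"

def join_fields(fields):
    """Build one flat-file record, refusing any field that contains the delimiter."""
    for field in fields:
        if DELIM in field:
            raise ValueError("field contains the delimiter: %r" % field)
    return DELIM.join(fields)

def split_fields(line):
    """Split one record back into its fields, ignoring the trailing newline."""
    return line.rstrip("\n").split(DELIM)

# A single colon inside a field (such as a time) does not clash with ':::'.
record = join_fields(["2010-02-05", "10:21", "3,4"])
parsed = split_fields(record + "\n")
print(record)
print(parsed)
```

One edge case worth knowing: a field that ends with a single colon would blur into the delimiter that follows it, so the check above is a minimum rather than a guarantee.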
Re: Confirm the field separator
Posted: Fri 05 Feb 2010 3:45 pm
by gemini06720
Kevin, thank you very much for your suggestions - I have been using the ' | ' (space,bar,space) as a delimiter/separator - do you foresee any problems finding that delimiter within any data element?
Or, maybe I should use your suggested ':::' delimiter...

Re: Confirm the field separator
Posted: Fri 05 Feb 2010 3:52 pm
by steve
I wanted the Cumulus data files to be as simple and 'standard' as possible, easily imported into software such as Excel. Excel does allow any character as delimiter, but only a single character.
Re: Confirm the field separator
Posted: Fri 05 Feb 2010 7:33 pm
by TNETWeather
gemini06720 wrote:Kevin, thank you very much for your suggestions - I have been using the ' | ' (space,bar,space) as a delimiter/separator - do you foresee any problems finding that delimiter within any data element?
Or, maybe I should use your suggested ':::' delimiter...

We are kind of talking about different things here.
Cumulus Standpoint
These should be simple. I don't think the format should change, but the format should be documented so that a remote program knows what format they are in.
I have never seen an example log file that has comma decimal formatted info in it, so I'm not really sure what they look like. I would assume (and could be wrong) that the delimiter is something other than a comma then???
I would expect anything that uses these types of logs to adjust based on what the data is rather than Cumulus.
One of the advantages of using a standard raw format, regardless of what the user chooses, is that it makes this easier to deal with, but that is a Cumulus 2 thing.
Creating new files on web server
In this case, I have always used a non-standard ::: delimiter since the use of a vertical bar in the past has been problematic. I have been using the ::: delimiter in server-side generated data files for 20 years. These files are NOT intended to be used in a spreadsheet though.
For something like the daylog files... I would use this over SQL any day, since there simply isn't enough data to need SQL unless you plan on doing a bunch of unique queries. The flat file is fast enough to load, and any report generated could be static and generated once a day, since that is how often the data changes (or is added to).
Then the data could be used anywhere php is allowed and even if SQL is not.
Re: Confirm the field separator
Posted: Fri 05 Feb 2010 7:36 pm
by steve
TNETWeather wrote:I have never seen an example log file that has comma decimal formatted info in it, so I'm not really sure what they look like. I would assume (and could be wrong) that the delimiter is something other than a comma then???
Yes - as in my Wikipedia quote above; countries which use a decimal comma tend to use a semicolon for the separator.
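A reader script that has to cope with both conventions could look something like the Python sketch below. It is a sketch under stated assumptions: the two sample lines are invented for illustration and are not real Cumulus output, the separator is passed in explicitly rather than detected, and read_rows and to_float are hypothetical helper names.

```python
import csv
import io

def read_rows(text, delimiter):
    """Parse delimited text using an explicitly stated field separator."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

def to_float(value):
    """Accept either a decimal point or a decimal comma."""
    return float(value.replace(",", "."))

# Invented sample lines (NOT real Cumulus output): same readings, two locales.
point_style = "05/02/10,10:21,3.4,17.6\n"   # comma separator, decimal point
comma_style = "05/02/10;10:21;3,4;17,6\n"   # semicolon separator, decimal comma

point_row = read_rows(point_style, ",")[0]
comma_row = read_rows(comma_style, ";")[0]

# Columns 2 onward are treated as numeric readings in this made-up layout.
point_values = [to_float(v) for v in point_row[2:]]
comma_values = [to_float(v) for v in comma_row[2:]]
print(point_values, comma_values)
```

Passing the separator in, rather than guessing it, keeps the importer predictable as long as the file format is documented.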
Re: Confirm the field separator
Posted: Sat 06 Feb 2010 12:50 am
by gemini06720
Kevin, in my reply to you, I was not writing about the changeable Cumulus delimiter format - I have been creating an all inclusive extended flat text file to replace the rather limited Cumulus produced 'realtime.txt' - the new file contains almost all the tags (except for 3 tags) produced by Cumulus.
I wanted to choose/use a delimiter/separator that would not be affected by any operating system, thus my use of the vertical bar '|' but I think I will follow your 20 years of experience and replace the bar with the triple colon.
Thanks again for your advice.