
Separators in Custom Logs

Posted: Fri 03 Feb 2023 3:57 pm
by HansR
Would it be possible to force the separators to be the ones used in the other logfiles?

Currently it is the user's choice, although it would be easy to replace that choice with the separator dictated by the locale: the separators after the date and the time already follow it, and the separators between the webtags can easily be replaced (e.g. comma to semicolon). For now it is governed only by the help text:

Code:

It is important to use the same data separator as CMX uses for the other data files.
but the user is free to do as he likes.

It would avoid a lot of errors and confusion.

Re: Separators in Custom Logs

Posted: Fri 03 Feb 2023 6:10 pm
by mcrossley
The problem is there are many separators depending on locale. If you can produce a foolproof way of parsing text for them then I'd be happy to implement it.

Re: Separators in Custom Logs

Posted: Fri 03 Feb 2023 6:20 pm
by freddie
mcrossley wrote: Fri 03 Feb 2023 6:10 pm The problem is there are many separators depending on locale. If you can produce a foolproof way of parsing text for them then I'd be happy to implement it.
Have a dynamic parser token regex based on locale?
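
A minimal sketch of what a separator-driven check could look like, in Python for illustration only (this is not CMX code). The web tag shape `<#name ...>`, the names `WEBTAG`, `build_content_pattern` and `uses_separator`, and passing the separator in as an argument instead of reading it from the locale are all assumptions. It only covers content that consists purely of web tags; free text is rejected.

```python
import re

# Assumed web tag shape: "<#name>" with optional parameters, e.g. "<#temp dp=1>".
WEBTAG = r"<#\w+(?:\s[^>]*)?>"


def build_content_pattern(separator: str) -> re.Pattern:
    """Build a regex for web tags joined by exactly `separator`."""
    # Escape the separator: characters such as "|" or "." are regex metacharacters.
    sep = re.escape(separator)
    return re.compile(rf"(?:{WEBTAG})(?:{sep}(?:{WEBTAG}))*")


def uses_separator(content: str, separator: str) -> bool:
    """True if `content` is only web tags separated by `separator`."""
    return build_content_pattern(separator).fullmatch(content) is not None


print(uses_separator("<#temp>;<#hum>;<#rfall>", ";"))  # True
print(uses_separator("<#temp>,<#hum>,<#rfall>", ";"))  # False
```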

Re: Separators in Custom Logs

Posted: Fri 03 Feb 2023 6:55 pm
by mcrossley
The problem is that the users define all the text and separators apart from those after the date/time. If they use the locale separator there isn't an issue.

Re: Separators in Custom Logs

Posted: Fri 03 Feb 2023 7:20 pm
by HansR
mcrossley wrote: Fri 03 Feb 2023 6:10 pm The problem is there are many separators depending on locale. If you can produce a foolproof way of parsing text for them then I'd be happy to implement it.
But how do you determine the separator (between the fields) in the standard and daily logfile?
That is exactly the one you should use in the custom logs, isn't it? I don't care about all other possible separators, just the one already selected and used by CMX.

Why should you leave that choice of separator to the user?

Re: Separators in Custom Logs

Posted: Fri 03 Feb 2023 7:40 pm
by freddie
HansR wrote: Fri 03 Feb 2023 7:20 pm
mcrossley wrote: Fri 03 Feb 2023 6:10 pm The problem is there are many separators depending on locale. If you can produce a foolproof way of parsing text for them then I'd be happy to implement it.
But how do you determine the separator (between the fields) in the standard and daily logfile?
That is exactly the one you should use in the custom logs, isn't it? I don't care about all other possible separators, just the one already selected and used by CMX.

Why should you leave that choice of separator to the user?
That's what I was getting at in my own clumsy way. I thought all separators were based on locale - but what I was getting at was that there are recognised separators for the standard logs, so they should be used for consistency.

Re: Separators in Custom Logs

Posted: Fri 03 Feb 2023 10:16 pm
by mcrossley
The separators under my control, the first two (after the date and after the time), are the same as those used everywhere else.

The rest of the line is entered by the user. I have no control over that; it is free text. It can contain whatever the user wants: normally web tags separated by the character that the help text instructs you to keep the same as in your other files. But it could also contain fields that are text.

The only way I could control that would be by providing separate fields and the user has to enter the text for every column into each field individually. That would be very cumbersome and potentially use vast amounts of screen real estate.

Do we have a problem here? Have users been creating custom log files with separators different from the locale default? I have not spotted any related posts.

Re: Separators in Custom Logs

Posted: Sat 04 Feb 2023 12:33 am
by HansR
mcrossley wrote: Fri 03 Feb 2023 10:16 pm The separators under my control, the first two (after the date and after the time), are the same as those used everywhere else.

The rest of the line is entered by the user. I have no control over that; it is free text. It can contain whatever the user wants: normally web tags separated by the character that the help text instructs you to keep the same as in your other files. But it could also contain fields that are text.

The only way I could control that would be by providing separate fields and the user has to enter the text for every column into each field individually. That would be very cumbersome and potentially use vast amounts of screen real estate.
I would think that once the user has entered the string, it would be easy, before storing it, to check that it contains valid webtags and in the same pass to check that the separator is the one used after the date and time (being the one used elsewhere in the standard logs).

The user info is stored as one string in the inifile, not as fields, so I don't understand what you say about fields.
mcrossley wrote: Fri 03 Feb 2023 10:16 pm Do we have a problem here? Have users been creating custom log files with separators different from the locale default? I have not spotted any related posts.
Well, not sure if 'we' have a problem, but I have.

As it is, it is possible not only to have different separators within one string, but also different separators over time if the user changes them along the way. I don't know what technique you use (or expect to be used) for reading the file, but at some point the content parameter needs to be split on the separators to isolate the webtags for further processing and interpretation. As you say, it is free text, which makes handling it generically a punishment.

If CMX made the separator consistent (or demanded no separator at all) by inserting it before storing the string, that would make life a little lighter, bearable almost.

What could be the reason to give the user so much freedom here? Even totally free text without webtags would be legal?

I give a legal example here:
My tag is:

Code:

IntervalEnabled1=1
IntervalFilename1=press
IntervalContent1=Humpy dumpy test without tags and now two at the end <#tempidally> <#hum> point.
IntervalIdx1=4
Leading to the following output:

Code:

03-02-23;17:15;7.1,84,0.00
03-02-23;17:20;7.1,84,0.00
03-02-23;17:25;7.2,83,0.00
04-02-23;01:30;Humpy dumpy test without tags and now two at the end  92 point.
What could be the argument to permit this?

[EDIT:]Reading this post again: it might be just as easy to select the syntactically legal webtags from the garbage without thinking about the separators at all. That leaves just the question: why such freedom?
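
A minimal sketch of that idea in Python, for illustration only: pick out anything shaped like a web tag and ignore the surrounding text and separators. The tag shape `<#name ...>` and the names `WEBTAG` and `extract_webtags` are assumptions. The check is purely syntactic and says nothing about whether CMX actually knows the tag.

```python
import re

# Assumed web tag shape: "<#name>" with optional parameters, e.g. "<#temp dp=1>".
WEBTAG = re.compile(r"<#\w+(?:\s[^>]*)?>")


def extract_webtags(content: str) -> list[str]:
    """Return every web-tag-shaped token in `content`, in order of appearance."""
    return WEBTAG.findall(content)


content = "Humpy dumpy test without tags and now two at the end <#tempidally> <#hum> point."
print(extract_webtags(content))  # ['<#tempidally>', '<#hum>']
```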

Re: Separators in Custom Logs

Posted: Sat 04 Feb 2023 11:20 am
by SamiS
I see the point that Hans is making, but there is another way of looking at this. To my mind, a custom log user should basically be able to do "anything that his/her use case requires"; otherwise the log isn't very custom anymore. With a guesstimate of maybe even 2000 CMX users, we have no idea how creatively custom logs are or aren't used.

Disclaimer: I have not yet used custom logs so my knowledge of their possibilities is thin. I just felt the need to say out loud that things are not always so straightforward.

Re: Separators in Custom Logs

Posted: Sat 04 Feb 2023 11:23 am
by mcrossley
Drafted whilst the @SamiS post came in...

They could want to include field descriptions, current conditions, units, quoted, not quoted, escape characters - people do all sorts!
I guess we could limit it to only using web tags, which would make things slightly simpler, but that seems overly restrictive.

As I say I am happy to implement a check, but please produce me a parser that works for all separators.

The idea of separate fields in the user input is so I could concatenate them using the locale separator before saving.
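
Roughly what that concatenation step could look like, sketched in Python for illustration only. The function name `join_fields` and the rule of rejecting a field that itself contains the separator are assumptions, not existing CMX behaviour.

```python
def join_fields(fields: list[str], separator: str) -> str:
    """Join per-column entries with one separator before saving."""
    for field in fields:
        # A separator inside a field would silently add a column when the
        # line is read back, so refuse it here (an assumed policy).
        if separator in field:
            raise ValueError(f"field {field!r} contains the separator {separator!r}")
    return separator.join(fields)


print(join_fields(["<#temp>", "<#hum>", "<#rfall>"], ";"))  # <#temp>;<#hum>;<#rfall>
```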

For the log files I do check for the separator in use, but the first couple of fields are a known number of characters, so checking what is between them is easy. For the custom log the width of each field is unknown.
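
The fixed-width check can be sketched as follows, in Python for illustration only. It assumes an 8-character date and a 5-character time, as in the sample output earlier in this thread (`03-02-23;17:15;...`); the name `detect_separator` and both width parameters are hypothetical.

```python
def detect_separator(line: str, date_width: int = 8, time_width: int = 5) -> str:
    """Read the field separator from the fixed positions after date and time."""
    second_pos = date_width + 1 + time_width
    if len(line) <= second_pos:
        raise ValueError("line is too short to contain date, time and two separators")
    first, second = line[date_width], line[second_pos]
    # Both positions should hold the same character if the widths are right.
    if first != second:
        raise ValueError(f"inconsistent separators: {first!r} and {second!r}")
    return first


print(detect_separator("03-02-23;17:15;7.1,84,0.00"))  # ;
```

The same trick does not carry over to the user-defined part of a custom log line, because nothing fixes the width of those fields.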

Devising a regex for csv is hard - do a search - for example (and this only accepts comma separators!)...

Code:

re_valid = r"""
# Validate a CSV string having single, double or un-quoted values.
^                                   # Anchor to start of string.
\s*                                 # Allow whitespace before value.
(?:                                 # Group for value alternatives.
  '[^'\\]*(?:\\[\S\s][^'\\]*)*'     # Either Single quoted string,
| "[^"\\]*(?:\\[\S\s][^"\\]*)*"     # or Double quoted string,
| [^,'"\s\\]*(?:\s+[^,'"\s\\]+)*    # or Non-comma, non-quote stuff.
)                                   # End group of value alternatives.
\s*                                 # Allow whitespace after value.
(?:                                 # Zero or more additional values
  ,                                 # Values separated by a comma.
  \s*                               # Allow whitespace before value.
  (?:                               # Group for value alternatives.
    '[^'\\]*(?:\\[\S\s][^'\\]*)*'   # Either Single quoted string,
  | "[^"\\]*(?:\\[\S\s][^"\\]*)*"   # or Double quoted string,
  | [^,'"\s\\]*(?:\s+[^,'"\s\\]+)*  # or Non-comma, non-quote stuff.
  )                                 # End group of value alternatives.
  \s*                               # Allow whitespace after value.
)*                                  # Zero or more additional values
$                                   # Anchor to end of string.
"""

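For comparison, a different technique from the regex above: Python's standard `csv` module takes the delimiter as a parameter, so one reader can handle comma, semicolon or any other single-character separator. This is a sketch for illustration only; the helper name `split_line` and the sample line are made up. By default the module understands double-quoted fields, not the single-quoted or backslash-escaped values that the regex accepts.

```python
import csv


def split_line(line: str, delimiter: str = ",") -> list[str]:
    """Split one log line into fields, honouring double-quoted values."""
    reader = csv.reader([line], delimiter=delimiter, quotechar='"')
    return next(reader)


print(split_line('03-02-23;17:15;"Light rain; breezy";7.1', delimiter=";"))
# ['03-02-23', '17:15', 'Light rain; breezy', '7.1']
```
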
Re: Separators in Custom Logs

Posted: Sat 04 Feb 2023 11:35 am
by mcrossley
Nice quote...
I am the author of jquery-CSV, the only javascript based, fully RFC-compliant, CSV parser in the world. I have spent months tackling this problem, speaking with many intelligent people, and trying a ton of different implementations including 3 full rewrites of the core parser engine.

Re: Separators in Custom Logs

Posted: Sat 04 Feb 2023 11:59 am
by HansR
OK. Thank you @SamiS and @mcrossley for some insight and for opening up a csv/javascript world unknown to me, as I really never intended to open a csv file through javascript. However, this exchange of thoughts led me to believe that my narrow perception of the logfile is indeed wrong. So I will accept anything in the user definition and will just extract the webtags (i.e. <#...> kind of character sequences) and use those for my purpose, leaving the rest in the bin.

And thanks Mark for that interesting world called jQuery-csv. It really is programming history (another quote of those days: jQuery-csv is an artifact of a simpler time (ie 2012) when the JS library ecosystem was still very underdeveloped.). I must have been sleeping while it was happening.

Thanks for this enlightenment, but I propose to leave everything the way it is 8-)

Re: Separators in Custom Logs

Posted: Sat 04 Feb 2023 11:59 am
by mcrossley
How about a bit more help text for now, something like...
[Attachment: Screenshot 2023-02-04 115816.png]

Re: Separators in Custom Logs

Posted: Sat 04 Feb 2023 1:06 pm
by HansR
That would clarify more, yes.

Re: Separators in Custom Logs

Posted: Sat 04 Feb 2023 3:24 pm
by access-mdb
At the risk of being pedantic (who, me?) I think you mean postpend rather than prepend. You've added the date to the end of the string. The pedantry is that you could have used append, which means/meant add at the end, but has drifted to mean just added.
I've just Googled this and these are neologisms and can generate some heat (split infinitives anyone).
</pedant>