UTF-8 (and HTML 5)
Posted: Wed 23 Apr 2014 11:11 am
Some questions for the experts out there. As I understand it, we should all really be using utf-8 encoding for our web pages now, and probably HTML 5 also. When Cumulus processes a page to replace the web tags with real data, it always saves the page using ANSI encoding (on most Western-locale Windows systems this is Windows-1252, which is close to iso-8859-1). Problems caused by this seem to come up quite often on the forum.
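To illustrate the kind of problem that comes up, here is a minimal Python sketch (purely for illustration, not Cumulus code). It assumes "ANSI" means Windows-1252 (`cp1252`) and uses a made-up temperature string; it shows what happens when the bytes are saved in one encoding and read back as the other.

```python
# Illustrative sample text containing a non-ASCII character (the degree sign).
text = "21.5 °C"

# Assumption: "ANSI" here is Windows-1252, as on most Western-locale Windows systems.
ansi_bytes = text.encode("cp1252")   # degree sign becomes the single byte 0xB0
utf8_bytes = text.encode("utf-8")    # degree sign becomes the two bytes 0xC2 0xB0

# A page saved as ANSI but declared as utf-8: the lone 0xB0 byte is invalid
# utf-8, so the reader substitutes the replacement character U+FFFD.
ansi_read_as_utf8 = ansi_bytes.decode("utf-8", errors="replace")

# A page saved as utf-8 but declared as ANSI: each of the two bytes is shown
# as its own character, giving the familiar stray "Â".
utf8_read_as_ansi = utf8_bytes.decode("cp1252")

print(repr(ansi_read_as_utf8))
print(repr(utf8_read_as_ansi))
```

Plain ASCII text is identical in both encodings, so only pages containing characters such as the degree sign or accented letters are affected.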
So, question 1 - how big an issue is this currently?
It's quite a small change for me to get Cumulus to save ALL processed files in utf-8 instead of ANSI. But if I do this, and change the standard templates to utf-8, this would presumably mean that anyone not currently using utf-8 for their own non-standard pages would have to change them, yes?
Question 2 - would this approach be unacceptable?
With a bit more effort, I could provide an option to save processed files in utf-8: one setting for the 'standard' files, and a setting for each 'extra' file. I would still change the supplied standard files to utf-8, but people who are using customised versions of the standard files would need the option to turn it off for standard files.
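As a rough sketch of how such an option might behave (in Python, for illustration only; this is not how Cumulus is implemented), the idea is one flag covering the standard files plus a per-file flag for each extra file. The names `STANDARD_FILES`, `use_utf8_for` and `save_processed`, and the example file names, are all hypothetical, and "ANSI" is again assumed to mean Windows-1252.

```python
# Hypothetical list of 'standard' file names, for illustration only.
STANDARD_FILES = {"index.htm", "today.htm"}


def use_utf8_for(filename, utf8_standard, utf8_extra):
    """Decide whether a processed file should be saved as utf-8.

    utf8_standard: one setting covering all standard files.
    utf8_extra:    dict mapping each extra file name to its own setting.
    """
    if filename in STANDARD_FILES:
        return utf8_standard
    # Extra files without an explicit setting keep the old ANSI behaviour.
    return utf8_extra.get(filename, False)


def save_processed(path, text, use_utf8):
    """Write processed page text as utf-8 or as Windows-1252 ("ANSI")."""
    encoding = "utf-8" if use_utf8 else "cp1252"
    # errors="replace" writes "?" for any character cp1252 cannot represent,
    # instead of raising an exception part-way through the file.
    with open(path, "w", encoding=encoding, errors="replace", newline="") as f:
        f.write(text)
```

For example, `use_utf8_for("index.htm", True, {})` selects utf-8 for a standard file, while an extra file with no entry in `utf8_extra` would still be saved as ANSI.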
Question 3 - is doing it this way worth the extra effort?
Question 4 - should the utf-8 files have a BOM or not?
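For anyone unsure what the BOM actually is: it is just three extra bytes (EF BB BF) at the very start of the file. A small Python sketch of the difference, for illustration only, using Python's standard `utf-8` and `utf-8-sig` codecs and a made-up one-line page:

```python
import codecs

# Illustrative first line of a page.
text = "<!DOCTYPE html>"

without_bom = text.encode("utf-8")      # plain utf-8, no BOM
with_bom = text.encode("utf-8-sig")     # same bytes, prefixed with the BOM

# The only difference is the three-byte prefix EF BB BF.
bom_is_only_difference = with_bom == codecs.BOM_UTF8 + without_bom

# A reader that does not expect a BOM sees it as an invisible U+FEFF
# character ahead of the first real character in the file.
read_naively = with_bom.decode("utf-8")

# The "utf-8-sig" codec strips a BOM if present and accepts files without one.
read_with_sig_a = with_bom.decode("utf-8-sig")
read_with_sig_b = without_bom.decode("utf-8-sig")

print(with_bom[:3].hex(), bom_is_only_difference, repr(read_naively[0]))
```

So the practical question is whether anything that reads the files (editors, scripts, PHP includes and so on) would trip over those three leading bytes.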
I'm thinking that I would convert the standard files to HTML 5 at the same time. Not to use any of the new facilities, just to make them compatible.
Question 5 - is this going to cause any serious issues? Everyone should be using a browser which supports HTML 5 now, yes? Particularly as the pages wouldn't be using any new stuff?