HTML Entities in XSL
This is an open discussion with 10 replies, filed under XSLT.
At the moment I’ve fixed the issue in CKEditor. formatter.ckeditor.php
As far as I know, only the characters <, > and & need to be written as an entity. Other characters such as é, ï, ♫, →, ¶, ©, etc. can be used in the normal way if your encoding is UTF-8.
Correct me if I’m wrong!
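A minimal sketch of that point, using Python's standard library rather than XSLT (the sample string is made up): escape() from xml.sax.saxutils touches only &, < and >, and the other UTF-8 characters pass through an XML parser untouched.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical sample text mixing markup characters with non-ASCII ones.
text = "Tom & Jerry <é, ♫, ©>"

# escape() rewrites only &, < and >; everything else is left alone.
escaped = escape(text)

# Wrap the escaped text in an element and parse it back.
parsed = ET.fromstring("<p>" + escaped + "</p>").text

print(escaped)
print(parsed == text)
```

Inside attribute values the quote character that delimits the attribute also needs escaping, which this sketch does not cover.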
I’ve had sooooo many issues with character encoding and entities with XML recently. Character encoding is definitely the black arts of the modern day.
I think, however, that it may have been more down to the outputting system and the ingestion system that had the issues. I reckon you’re right @kanduvisla, although I would always prefer to use entities myself in XML ‘just in case’…
I also used to ‘entity’ my special characters in previous CMSes / template engines I used, so é would become &eacute;, but since XSLT throws an error when you use &eacute;, I figured that it would ‘entity’ itself automatically or output the correct header information to your browser (which I believe is the case right now).
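The error comes from the XML parser rather than from XSLT as such: a stylesheet is an XML document, and XML predefines only five named entities (&amp;, &lt;, &gt;, &quot;, &apos;). A rough illustration with Python's built-in parser, where try_parse is just a helper name made up for this sketch:

```python
import xml.etree.ElementTree as ET

def try_parse(xml_text):
    """Return the element text, or the parser's error message as a string."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError as err:
        return "error: " + str(err)

# &eacute; is an HTML entity name that plain XML does not know about.
named = try_parse("<p>caf&eacute;</p>")

# A numeric character reference needs no declaration.
numeric = try_parse("<p>caf&#233;</p>")

# &lt; is one of the five names XML predefines.
predefined = try_parse("<p>a &lt; b</p>")

print(named)
print(numeric)
print(predefined)
```

The first call reports an undefined entity, while the other two parse fine.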
I had a (not so) funny issue lately when a client of mine used HTML pages generated by Symphony to use as a newsletter (copy-pasting the HTML into the newsletter, so to speak). But since the HTML was UTF-8, most e-mail clients that use another encoding (ISO-8859-1, for example) would show question marks or wrong characters where the special characters should be.
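For anyone curious what that looks like at the byte level, here is a small sketch (plain Python, sample string made up). It reads UTF-8 bytes as ISO-8859-1, then shows one possible workaround: encoding to ASCII with numeric character references, which displays the same regardless of the charset the receiving client assumes.

```python
# Hypothetical newsletter snippet containing two non-ASCII characters.
text = "café ©"

# What the page actually sends: UTF-8 bytes.
utf8_bytes = text.encode("utf-8")

# What a client sees if it wrongly assumes ISO-8859-1.
misread = utf8_bytes.decode("iso-8859-1")

# Workaround: replace every non-ASCII character with a numeric reference.
ascii_safe = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

print(misread)
print(ascii_safe)
```

Whether a given mailing service shows garbled letters or question marks depends on how it handles the mismatch; this only reproduces the garbled-letters case.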
I had a (not so) funny issue lately when a client of mine used HTML pages generated by Symphony to use as a newsletter
You simply shouldn’t do this. Using desktop email clients to send HTML newsletters has too many problems. You should consider using a PHP framework to manage the sending/encoding stuff. My Email Newsletters extension uses the Swift Mailer library, for example.
Well, it wasn’t actually copy/pasting, but with an external service called Your Mailing List Provider. They can ‘fetch’ a html-page and send it. But I’ll definitely look into your extension sometime.
FWIW, I believe XSLT allows you to use entity numbers in place of names, for example &#8220; instead of &ldquo;.
They can ‘fetch’ a html-page and send it.
Well, obviously they can’t. If your page is valid, they should simply do the encoding stuff right.
:-)
FWIW, I believe XSLT allows you to use entity numbers in place of names
This is actually the best method to use in my experience with XSLT. We have a £2.2 million CMS at work that had issues with ‘word’-based entities (a bit shoddy, really), which was one of the character encoding issues I mentioned earlier: it just didn’t get them. I had plenty of question marks in my content output ;)
We’ve now standardised to UTF-8 and numbered entities, UTF-16 in some cases though due to the wide multinational nature of our business.
Named entities can cause headaches with XSLT & XML. Here are some links to get around those headaches:
Simply Using HTML entities in XSL
Some advanced methods for using Entities and XSLT
Thank you all for your replies. For the moment I’m considering the issue solved, though I think later on I’ll just have to do that from the XSL side, by adding another utility template with all the HTML entities I can get; otherwise it’s just such a headache.
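One way such a utility could work is to declare the entity names in a DOCTYPE internal subset, so the XML parser resolves them itself. The sketch below only builds that DOCTYPE and checks it with Python's built-in parser; entity_doctype is a made-up helper name, and whether your XSLT processor reads the declarations the same way is an assumption to verify.

```python
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

def entity_doctype(root_name, names):
    """Build a DOCTYPE whose internal subset declares the given HTML entities.

    Do not pass amp or lt here: XML already predefines them, and
    redeclaring them this way would not be well-formed.
    """
    decls = "".join(
        '<!ENTITY %s "&#%d;">' % (name, name2codepoint[name]) for name in names
    )
    return "<!DOCTYPE %s [%s]>" % (root_name, decls)

# Declare three HTML entity names, then use them in the document body.
doctype = entity_doctype("p", ["eacute", "copy", "nbsp"])
doc = doctype + "<p>caf&eacute;&nbsp;&copy;</p>"
text = ET.fromstring(doc).text

print(doctype)
print(text)
```

In a stylesheet the same DOCTYPE would sit before the root xsl:stylesheet element, with the root name changed to match.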
I’m sorry in advance if a similar thread is already present, though I’ve searched 6 pages of this section and didn’t find anything, so here it is.
I’m new to XSLT and as far as I can tell XSLT and HTML entities (e.g. ‘&nbsp;’, ‘&lt;’, ‘&le;’) don’t play nice.
So, I’m wondering, how can I fix that? I know XSLT accepts entities when referenced by their character number (&#xx;), so I thought I could do a search-and-replace prior to outputting the data, but maybe there is a better way?
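For reference, the search-and-replace idea could look roughly like this. It is a sketch in Python rather than in the CMS's own PHP, named_to_numeric is a made-up function name, and the lookup table is Python's html.entities.name2codepoint. The five names XML predefines are left alone, as is anything that is not a known HTML entity.

```python
import re
from html.entities import name2codepoint

# Names every XML parser already understands; no need to rewrite these.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

def named_to_numeric(markup):
    """Replace HTML named entities with numeric character references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave predefined or unknown names as-is
        return "&#%d;" % name2codepoint[name]
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, markup)

result = named_to_numeric("a&nbsp;&le;&nbsp;b &amp; caf&eacute; &bogus;")
print(result)
```

An unknown name such as &bogus; is passed through unchanged, so it would still trip the XML parser later; that seems preferable to silently dropping it.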