XSLT and HTML5
This is an open discussion with 175 replies, filed under XSLT.
It's the other way round -- I'd send a pull request to bauhouse, so the "original" repo would be up-to-date as soon as he pulls it in.
I still believe we should come up with some way of listing preferred/latest extensions.
There's an ongoing discussion in the working groups. I think there will be solutions in the near future to somehow "classify" extensions. But we can't, of course, pervert the benefits of distributed source code management. There will always be forks of extensions, and it's good if they are publicly available. But I would always use the "original" repo unless there is an explicit reason not to do so.
Well, we should really call Nick's repo the "original", I think. But if Nick wants to abandon it, I'll pick it up. Now that the extension has been updated with michael-e's change and it has a README file, it might be worth adding this extension to the Downloads section. But I didn't want to do it, since Nick is the creator of the extension.
Before I tag this version as 1.2.3, does anyone know whether this extension works with anything lower than 2.0.7?
On a side note: How are people testing extensions for earlier versions of Symphony? It seems like a time sink as an extension developer to have every version of Symphony installed to test extensions against each version.
It should work on any version that has the appropriate delegate, which should be all versions.
On a side note: How are people testing extensions for earlier versions of Symphony
I don't usually retrospectively test, so leave it as "Unknown", unless I know I'm using delegates or accessors that I know do not exist in previous versions, in which case I mark a "No".
I don't mind who owns it really — this is going to be a more important topic well into the future, so whoever is responsible needs to maintain it actively.
This might be a foolish question, but what should the xsl:output be for this to work correctly?
Right now, I'm using this: <xsl:output method="xml" doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN" doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" omit-xml-declaration="yes" encoding="UTF-8" indent="yes" />
But the <html> element isn't being replaced. I assume this is because there isn't a lang="en" or similar, so it's not actually finding the right string?
Thanks, Nick. I've updated the requirements to refer to Symphony 2 in the README file and tagged the extension as version 1.2.3.
I'm happy to own the extension, if you like, since I'm actively using it and expect to be maintaining it.
@XBleed, this is the xsl:output I am using:
<xsl:output method="xml" doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN" doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" omit-xml-declaration="yes" encoding="UTF-8" indent="yes" />
And the match template starts with the following:
<xsl:template match="/"> <html lang="en"> <head>
So, if I omit the lang attribute, you're right that the html element doesn't get properly replaced, but still includes the xmlns attribute.
<html xmlns="http://www.w3.org/1999/xhtml">
I guess that's a bug. :-(
Here I go actively maintaining.
Ah, manually setting lang="en" does the trick just fine. I think that's okay, maybe just specify that in the ReadMe? Unless of course you'd prefer it to rewrite it either way.
Also, I hate to be nitpicky here, but what about being able to add classes to <html> without having them get overwritten? To be more specific, a couple of scripts that are gaining popularity with the HTML5 crowd want class="no-js" added to <html> for fallback functionality. Would that be pretty difficult to check for?
Edit: This is great, by the way! I was just using the legacy compat xsl:output before, this solution is much better. I have indenting back! Thanks for the progress on this, everyone.
Sorry, Nick, for being on the wrong track here—I really thought that this was Stephen's extension.
Well, now it is, probably...
@XBleed, I've been struggling to remind myself how regular expressions work so I can get this right. The regex on the html element was far too destructive. It obliterated any attributes you might have wanted to add to it.
I updated the HTML5 Doctype extension to 1.2.4, so you should be able to use classes on the html element now. Now, all that happens is the removal of the xmlns attribute for XHTML and the xml:lang attribute.
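For anyone curious what such a non-destructive replacement might look like, here is a minimal sketch in Python. It is not the extension's actual code (the extension is PHP and I haven't copied its patterns); the function name to_html5 and all three regular expressions are made up for illustration. The idea is to swap the doctype, then remove only xmlns and xml:lang from the html start tag and leave every other attribute alone.

```python
import re

# Hypothetical patterns, written for this sketch only.
XHTML_DOCTYPE = re.compile(r'<!DOCTYPE\s+html\s+PUBLIC[^>]*>', re.IGNORECASE)
HTML_OPEN_TAG = re.compile(r'<html\b[^>]*>', re.IGNORECASE)
# Only the default namespace and xml:lang are dropped; prefixed
# declarations such as xmlns:og="..." do not match and are kept.
UNWANTED_ATTRS = re.compile(r'\s+(?:xmlns|xml:lang)\s*=\s*"[^"]*"')


def to_html5(markup):
    """Swap the XHTML doctype for the HTML5 one and clean the <html> tag."""
    markup = XHTML_DOCTYPE.sub('<!DOCTYPE html>', markup, count=1)
    # Rewrite only the first <html ...> start tag, attribute by attribute.
    return HTML_OPEN_TAG.sub(
        lambda m: UNWANTED_ATTRS.sub('', m.group(0)), markup, count=1
    )


page = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml" class="no-js" lang="en">\n'
    '<head></head>\n</html>'
)
result = to_html5(page)
print(result)
```

Because the attributes are removed individually instead of replacing the whole tag, the result no longer depends on lang being present, and class="no-js" survives.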
Beautiful, man. Works great!
See This discussion
update: Don't click the link, you will enter a never ending thread-loop ;-)
:-)
Hi Zimmen. I am aware of the possibility to add a "legacy Compat" doctype. The problem is, however, that this will break your HTML when used with an output method of xml.
There are various issues (see this post in this thread) such as breaking 'empty' elements (e.g. textarea and script).
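The empty-element problem is not specific to XSLT; any XML serializer behaves this way. As a rough illustration (using Python's standard xml.etree module as a stand-in, not the XSLT processor Symphony uses), the same tree can be written out with the xml and the html output methods:

```python
import xml.etree.ElementTree as ET

# A well-formed fragment containing elements without content.
fragment = ET.fromstring(
    '<div><textarea></textarea><script src="a.js"></script><br /></div>'
)

# XML output: empty elements are collapsed to self-closing tags.
as_xml = ET.tostring(fragment, encoding='unicode', method='xml')
# HTML output: textarea and script keep their end tags, br gets none.
as_html = ET.tostring(fragment, encoding='unicode', method='html')

print(as_xml)
print(as_html)
```

A browser parsing the page as text/html ignores the trailing slash on a self-closing textarea or script and treats it as an unclosed start tag, which is why the xml output method breaks such pages.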
Recently I've come to think that it's probably better to simply use HTML syntax when serving documents as HTML. Sounds logical right :-) The thing is: We are so used to the XHTML syntax, and HTML5 'allows it' that it needs some getting-used-to…
By the way: the recommendation of the HTML Spec is simple: Use the syntax that matches your mime-type. If you serve the document as text/html, use HTML. If you serve a document as application/xhtml+xml, use XHTML...
(All of us XHTML-syntax-lovers do realize we're sending 'invalid HTML' right? :-P)
I know this is an age-old Can of Worms™ 'sensitive' topic with a lot of strongly voiced opinions (regarding 'good coding habits/standards' etc). I don't understand all aspects of it completely (This HTML stuff is hard!) nor do I think it very beneficial to really go into the discussion re: "HTML vs. XML on the Web" (these are some old interesting links if you want to get into it.)
However: many people (myself included) on this forum are (I believe) trying to achieve something that seems impractical. The thing is: creating, serving and managing Polyglot documents (XHTML docs parsable as both HTML and XML) is very, very, hard. See this SO thread.
Although Polyglot documents seem wonderful in theory I do not think many of us Symphonians will ever a) correctly create them and b) use them (as in: parsing HTML documents as XML). My guess is that very few of us use MathML etc. Besides the IE factor (duh): in public websites there are simply too many issues regarding user-generated content etc.
I apologise for this huge, bulky, geektastic post. Back to the topic: XHTML5 in Symphony.
The only way to use XHTML syntax in an HTML5 document seems to be to use the post-parsing hack that simply replaces the Doctype, Charset definition etc. Otherwise, as you've found out, the document is seen as X(H)ML and you will run into issues with e.g. empty elements.
My feeling is that we should probably steer away from hacking XHTML-syntax-HTML (5) into our XSL templating process and simply use HTML syntax (and therefore use output method html).
This approach will probably save us quite a lot of hassle. The Indentation Issue remains, however, something I find very annoying. Maybe some Symphonians smarter than me can think of a solution.
Can't say I disagree with your pragmatism! I've enjoyed the journey we've gone through in experimenting though, we've probably covered all options by now :-)
The Indentation Issue remains, however, something I find very annoying. Maybe some Symphonians smarter than me can think of a solution.
Without post-processing I don't think there is a solution. As far as I'm aware this lies in libxslt itself: it won't indent any output when the type is html, regardless of whether you're trying to output HTML 4 or 5. Allen has explained this eloquently elsewhere (although I forget where), but the conundrum remains: how do you correctly programmatically indent when the output itself isn't valid XML? I imagine that's why libxslt doesn't even try, as it doesn't make sense to.
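The conundrum is easy to reproduce with any XML-based pretty-printer. A small sketch, assuming Python's standard xml.dom.minidom as the indenter (the helper name indent_markup is made up, and this is not what the XSLT processor does internally): XHTML syntax can be re-indented, while HTML syntax with an unclosed br cannot even be parsed.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def indent_markup(markup):
    """Return indented markup, or None when it is not well-formed XML."""
    try:
        document = minidom.parseString(markup)
    except ExpatError:
        return None
    return document.documentElement.toprettyxml(indent='  ')


xhtml_syntax = '<div><p>text</p><br /></div>'
html_syntax = '<div><p>text</p><br></div>'

print(indent_markup(xhtml_syntax))  # indented markup
print(indent_markup(html_syntax))   # None: <br> is never closed
```

So an XML-based indenter only helps while the markup is still in XHTML syntax; for real HTML syntax you need an HTML-aware tool.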
I've enjoyed the journey we've gone through in experimenting though, we've probably covered all options by now :-)
Me too. I've learned quite a bit. The indentation issue indeed seems obvious now. The sensible thing to do would be to not bother with the output indentation (sniff :( ).
Another option would be post-parsing, using Tidy (?) and/or output buffering or something? All seem to be a lot of hassle with probable performance drawbacks…
I've tried but have yet to perfect it. Running every page through Tidy seems like overkill just to get well-indented markup though. Maybe just enabled for debugging... but then you have a browser DOM inspector for that. So the pragmatist in me says there's no point, you may as well compress the markup as much as possible. But I agree that not having that control takes a little getting used to.
I've been reading this discussion and still can't decide which is the preferred method for using HTML5. I want to be able to have Open Graph and other RDFa xmlns namespaces in the html element. Validation is a must and I would like code indentation. I have run across a few different approaches here here and here. Which would you guys recommend for what I'm trying to do?
I don't know about the others, but my template requires the HTML5 Doctype extension and definitely does not indent the markup.
Symphtml5 requires the html5 doctype extension as well, but it does still indent the code.
Concerning RDFa, I've looked at the HTML+RDFa 1.1 draft, but was unable to find a valid solution for the two. However, I'm definitely not an expert on the matters at hand. From what I understand, neither solution has solved this issue. Though, I suppose I could be wrong.
With validation and opengraph being must haves, I can only recommend not going with html5 at the moment. :/
A side note regarding the performance implications of the HTML5 Doctype extension:
I did some quick tests on a virtual server (which in this case can deliver approximately 14 pages per second, using 10 concurrent keep-alive requests), and there was no measurable slow-down (using the latest version of this extension from bauhouse's repo).
I also sent a pull request to nickdunn's repository to include bauhouse's latest commits.
@nickdunn thanks for the swift reply. I am quite convinced I will not be able to improve your efforts so will focus on finding the best hack available ;-)
I would like to use plain HTML syntax (no />) but I'd rather use nicely indented code. This means I will have to continue to use output method='xml' and XHTML syntax.
@michael-e yeah: this is a recurring issue with Symphony extensions in general: how to find the current or best/preferred (versions of) extensions. With the many discussions about Symphony and HTML5 there are at least 4 versions available of the post-parsing hacks and a lot of different 'Boilerplate' master templates. I still believe we should come up with some way of listing preferred/latest extensions.
FMI: what would be the result of you preparing a pull request? How would that make your version the (clear!) preferred one?