XSLT and HTML 5 problems
by Pascal Opitz on December 9 2008, 17:47
Sometimes I get really annoyed by how little control XSLT gives you over which target formats are supported and what output it generates.
I'm trying to use a canvas tag together with excanvas. The problem is that excanvas hooks into onreadystatechange, and will therefore run before the ondomready event that jQuery offers.
Which means I either have to use inline JS and generate the canvas tags via JS in order to produce valid HTML 4, or I have to use the HTML 5 doctype, in which case I can simply write the canvas tag straight into the markup.
The problem is: XSLT 1.0 has no support for generating the HTML 5 doctype, and the output encoding meta tag that it insists on adding is not valid in HTML 5 either. Any ideas, anyone?
UPDATE
Quite a fruitful discussion in the comments.
So for anyone else reading this: the bottom line is that it is possible to create HTML 5 even with existing XSLT technology.
The first issue we discussed was the doctype. HTML 5 in its current draft caters for generation with XSLT by providing a fallback doctype:
<!DOCTYPE html PUBLIC "XSLT-compat">
The other issue was the meta tag with the charset attribute, which HTML 5 introduces to declare the character set:
<meta charset="..." />
It is just not possible to generate exactly that with libxslt, because libxslt forcefully replaces it with an HTML 4 style meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
This is not a problem though, as the old meta tag, with http-equiv in its encoding declaration state, is a valid declaration of the character set, too.
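As a rough sketch of the setup described above, the stylesheet below uses the output element quoted in the comments (method="html" together with doctype-public="XSLT-compat"). The page content is a hypothetical test page, not the stylesheet used on this site, and the serialization described in the code comments is what was reported for libxslt in this discussion; other processors may behave differently.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- doctype-public without doctype-system: the html output method
       is expected to write the XSLT-compat fallback doctype -->
  <xsl:output method="html" doctype-public="XSLT-compat" />

  <!-- Hypothetical test page; any input document will do -->
  <xsl:template match="/">
    <html>
      <head>
        <!-- No meta charset is written here: libxslt is reported to add
             its own http-equiv Content-Type meta tag inside head -->
        <title>HTML 5 test</title>
      </head>
      <body>
        <h1>HTML 5 doctype and canvas test</h1>
        <div class="canvascontainer"><canvas></canvas></div>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

With libxslt installed, something like xsltproc page.xsl input.xml (both file names are placeholders) should be enough to inspect the generated doctype and meta tag.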
Comments
by Sérgio Carvalho on December 9 2008, 20:59 #
yep. pretty upsetting. This was one of the options I had in mind, that could solve my problem:
Now I gotta say, none of these options makes me happy, but I guess inline js is the most reasonable, and probably the most transparent and least hacky thing to do.
It's a bit weird though, that XSLT parsers don't give you that control over the meta tag that defines the output encoding.
I can 'hack' the doctype thing by using CDATA and xsl:text, and the doctype only comes up when it is defined in the output element. But forcefully inserting the meta tag, even though the output element has no encoding specified? That's just plain weird, and makes me wish for some flag in the processor to turn that off.
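For readers who haven't seen it, the doctype 'hack' looks roughly like the sketch below. It assumes the processor honours disable-output-escaping, which XSLT 1.0 does not require, so treat it as an illustration rather than something guaranteed to work everywhere. The page content is a placeholder.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- No doctype-public or doctype-system here, so the processor
       is not asked to write a doctype of its own -->
  <xsl:output method="html" />

  <xsl:template match="/">
    <!-- The doctype is written as raw text; the CDATA section only
         spares us from escaping the angle brackets by hand -->
    <xsl:text disable-output-escaping="yes"><![CDATA[<!DOCTYPE html>]]></xsl:text>
    <html>
      <head>
        <title>HTML 5 test</title>
      </head>
      <body>
        <div class="canvascontainer"><canvas></canvas></div>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

This only addresses the doctype. It does nothing about the meta tag that the processor inserts into head.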
I had a look at the libxslt mailing lists though, and couldn't find anything that would let me do that. XSL pros: is there a way to fine-tune libxslt?
Finally, as to whether or not it was stupid to make HTML 5 not an XML derivative: there is a serialization of HTML 5 that is XML, but it has to be served with the content type application/xhtml+xml. As with XHTML 1, some people consider it harmful if it's served with content type text/html.
by Pascal Opitz on December 9 2008, 21:17 #
by Anup on December 10 2008, 11:09 #
Anup, the forceful generation of the meta tag is something libxslt does. PHP merely acts as a proxy.
Here is the output tag I used in XSL:
by Pascal Opitz on December 10 2008, 12:55 #
by Anup on December 10 2008, 17:12 #
Here's something that should help:
First of all, I couldn't believe that the doctype actively breaks XML and SGML without an alternative, and, tadaaa, there is a way for exactly this issue:
The doctype legacy string, only to be used in conjunction with xslt.
With that one down, I tried the slightly amended xslt demo from php.net (I used <xsl:output doctype-public="XSLT-compat" method="html" /> as output tag), saved the output to a file and validated it, and, apparently, everyone's happy! 2 warnings (html 5 validation being experimental and the legal legacy doctype being automatically replaced), but definitely no deal-breaker.
I can't quite see where you had an issue with the content-type. I've seen some blurb about only accepting the first content-type that comes up (which might be your server's HTTP header, which of course I didn't have); but I don't know if the W3C validator has any beef with that.
by Matthias Willerich on December 12 2008, 11:08 #
by Pascal Opitz on December 12 2008, 11:14 #
I'm quoting mainly from this mailing list thread (ideally read the whole thing) from January 2007, where someone had exactly the same problem as you.
The w3c recommendation says:
"If there is a HEAD element, then the html output method should add a META element immediately after the start-tag of the HEAD element specifying the character encoding actually used."
The xsltproc you're using interprets the "should" as "must". I can't find any information hinting that this has changed since; only several discussions about it and one suggested fix I don't quite understand. I guess all that's left is to bring this up with the makers of this library, or patch your library yourself (er, maybe not).
by Matthias Willerich on December 12 2008, 16:18 #
Interesting. I hardly ever validate by URL or file upload, but usually use the copy paste thing of the W3C validator, and rely on it sniffing the doctype.
<!DOCTYPE html PUBLIC "XSLT-compat">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>HTML 5 test</title>
</head>
<body>
<h1>HTML 5 doctype and canvas test</h1>
<div class="canvascontainer"><canvas></canvas></div>
</body>
</html>
The above input, validated by copying and pasting the source, fails when I don't select the HTML 5 doctype from the dropdown in the advanced options. DOH!
by Pascal Opitz on December 12 2008, 17:44 #
by Ian Hickson on December 12 2008, 21:06 #
by Shawn Medero on December 12 2008, 21:20 #
Thanks Ian and Shawn. I updated the article copy to reflect the discussion.
Btw: I am still wondering whether, according to a correct reading of HTML 5, the charset attribute is the only correct way to declare the charset in the markup, or whether the old HTML 4 style declaration will be a valid fallback?
by Pascal Opitz on December 12 2008, 23:28 #
I guess the current draft answers my question:
by Pascal Opitz on December 12 2008, 23:39 #
by Jeroen Pulles on December 18 2008, 12:01 #
Jeroen: Thanks for your input, but I think you slightly misunderstood what we were discussing. We were merely discussing the issues that one faces when trying to generate HTML 5 from XSLT. Nothing else.
I think by now most people are aware of the "XHTML as text/html is harmful" opinion, and I am sure most people will have their own point of view about this. I definitely have read it with care, and decided on the roadmap for this blog and other sites.
What kind of issues I have to be aware of when generating markup from XSLT when I want to generate XHTML is a different story.
I personally find the <xsl:preserve-space /> element is a great help to achieve what you want. Also there's the xml:space attribute, which can be set to preserve.
by Pascal Opitz on December 18 2008, 14:59 #
I know I'm a little late with this comment, but I'm using the following:
Which for me produces the html5 <!DOCTYPE html>. I'm processing this through libxslt (via both Python and PHP).
by Phillip Oldham on January 6 2011, 08:25 #