Processing the output buffer with XSLT
by Pascal Opitz on July 24 2005, 18:07
The output buffer
Most PHP programmers will have run into an error when trying to send a redirect after an echo or some other output.
The error usually looks like this:
Warning: Cannot add header information - headers already sent by (output started at /directory/to/starting_file.php:XXX) in /directory/to/calling_file.php on line XX
For many of you, fixing this error will have been the only use for ob_start. It makes the error go away immediately, because the output is held in a buffer instead of being sent, so headers can still be set afterwards. But ob_start is just one of a whole family of functions known as “Output Control Functions”, which provide a sophisticated toolkit for controlling and manipulating the output generated by PHP.
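As a small illustration of that toolkit, here is a minimal sketch of two common uses: holding output in a buffer so that a later header() call still works, and capturing echoed output into a string with ob_get_clean(). The helper names captureOutput and redirectAfterOutput are made up for this example, and the redirect target is whatever you pass in.

```php
<?php
// Hypothetical helper: run $fn and return whatever it echoed as a string.
function captureOutput(callable $fn): string
{
    ob_start();                 // start buffering; nothing is sent to the client
    try {
        $fn();
        return ob_get_clean();  // fetch the buffer contents and close the buffer
    } catch (Throwable $e) {
        ob_end_clean();         // discard the partial output before re-throwing
        throw $e;
    }
}

// Hypothetical helper: echo first, then redirect, without the header warning.
function redirectAfterOutput(string $target): void
{
    ob_start();
    echo "Saving your changes...";   // stays in the buffer, not yet sent
    header('Location: ' . $target);  // still allowed: no output has left PHP
    ob_end_clean();                  // drop the body, we are redirecting anyway
}

$greeting = captureOutput(function () {
    echo "Hello from the buffer";
});
// $greeting now holds "Hello from the buffer"; nothing has been printed.
```

Without the ob_start() call in redirectAfterOutput, the echo would reach the client first and the header() call would trigger the warning quoted above.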
The callback function
The most powerful part of this set of functions is definitely ob_start and its optional parameter, the callback function. This callback is called when the buffer is flushed or cleaned, or at the end of the script. It receives the buffered output as its argument, and whatever it returns is sent in its place. Using this, it's easy to generate output and then, for example, clean it up with HTML Tidy, escape it, replace parts of it or replace all of it.
To show what I mean I'll provide a little class-based script as an example:
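<?php
class examplePage {
    // __construct replaces the PHP 4 style examplePage() constructor,
    // which PHP 8 no longer treats as a constructor.
    function __construct() {
        // Register parseOutput as the callback for this buffer.
        ob_start(array($this, 'parseOutput'));
        echo $this->getExampleXML();
    }

    // PHP passes the buffered output to the callback as its first argument.
    function parseOutput($buffer) {
        $str = "<pre>" . htmlentities($buffer) . "</pre> is the XML string we get from getExampleXML()";
        return $str;
    }

    function getExampleXML() {
        $str = "<root><test>Teststring</test></root>";
        return $str;
    }
}

$example = new examplePage();
?>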
As you can see, the content produced by the echo is processed afterwards by the parseOutput method, which escapes it and adds the surrounding text in one go.
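The callback does not have to be a class method, and it does not have to wrap the whole output. Here is a minimal sketch of the "replace parts of it" case, in which a closure swaps placeholders in the buffered output. The {{name}} placeholder syntax, the function names and the $values array are assumptions made for this example, not anything PHP defines.

```php
<?php
// Hypothetical data for the example.
$values = ['title' => 'Output buffering', 'author' => 'Jane'];

// Build a callback that replaces {{key}} placeholders in the buffered output.
function makePlaceholderCallback(array $values): callable
{
    return function (string $buffer) use ($values): string {
        $result = preg_replace_callback(
            '/\{\{(\w+)\}\}/',
            function (array $m) use ($values): string {
                // Escape known values; leave unknown placeholders untouched.
                return isset($values[$m[1]])
                    ? htmlspecialchars($values[$m[1]])
                    : $m[0];
            },
            $buffer
        );
        return $result ?? $buffer;   // on a regex error, send the output as-is
    };
}

// Hypothetical helper so the result can be inspected as a string:
// the outer buffer collects whatever the inner callback returns.
function renderTemplate(string $template, array $values): string
{
    ob_start();                                  // outer buffer
    ob_start(makePlaceholderCallback($values));  // inner buffer with callback
    echo $template;
    ob_end_flush();   // runs the callback and passes its result outwards
    return ob_get_clean();
}

$html = renderTemplate('<h1>{{title}}</h1> by {{author}} {{unknown}}', $values);
// $html is '<h1>Output buffering</h1> by Jane {{unknown}}'
```

The nested ob_start() calls are only there to make the result testable. In a real page a single ob_start() with the callback would be enough.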
Layered applications
This alone is a very powerful tool that can be used in pretty much every application that generates output with PHP, but we can push it one step further.
We'll use XML as an intermediate application layer. The callback function will then process the whole output and render it through an XSL transformation.
<?php
// Note: the xslt_* functions belong to the Sablotron-based XSLT extension of PHP 4.
class examplePage {
    // PHP 4 style constructor (a method named after the class).
    function examplePage() {
        ob_start(array($this, 'parseOutput'));
        echo $this->getExampleXML();
    }

    // Callback: transforms the buffered XML with the example stylesheet.
    function parseOutput($buffer) {
        $this->xslt = xslt_create();
        // Hand both documents to the processor as named argument buffers.
        $this->arguments['/_xml'] = $buffer;
        $this->xmlDoc = 'arg:/_xml';
        $this->arguments['/_xsl'] = $this->getExampleXSL();
        $this->xslDoc = 'arg:/_xsl';
        return xslt_process($this->xslt, $this->xmlDoc, $this->xslDoc, NULL, $this->arguments);
    }

    function getExampleXML() {
        $str = "<root><test>Teststring</test></root>";
        return $str;
    }

    function getExampleXSL() {
        $str = '<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
        Test: <xsl:value-of select="//test" />
    </xsl:template>
</xsl:stylesheet>
';
        return $str;
    }
}

$example = new examplePage();
?>
And here we go - dynamic processing of the XML-based application output. This is obviously a raw example, and it needs integration in whatever framework you use, but hopefully you can see the power and flexibility of this technique.
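The xslt_* functions used above come from the Sablotron-based XSLT extension of PHP 4. On PHP 5 and later, the XSL extension with its XSLTProcessor class does the same job. What follows is a sketch of the same idea under that assumption, not a drop-in replacement: the class name XsltPage, the render method and the xsl:output line are additions for this sketch, and it expects the dom and xsl extensions to be loaded.

```php
<?php
class XsltPage
{
    private $xsl;

    public function __construct(string $xsl)
    {
        $this->xsl = $xsl;
    }

    // Output callback: receives the buffered XML, returns the transformed text.
    public function render(string $buffer): string
    {
        // Collect parser errors internally so they cannot leak into the output.
        $previous = libxml_use_internal_errors(true);
        $xmlDoc = new DOMDocument();
        $xslDoc = new DOMDocument();
        $result = false;
        if ($buffer !== '' && $xmlDoc->loadXML($buffer) && $xslDoc->loadXML($this->xsl)) {
            $processor = new XSLTProcessor();
            if ($processor->importStylesheet($xslDoc)) {
                $result = $processor->transformToXml($xmlDoc);
            }
        }
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
        // Fall back to the untransformed XML if anything went wrong.
        return is_string($result) ? $result : $buffer;
    }
}

$xsl = <<<'XSL'
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">Test: <xsl:value-of select="//test"/></xsl:template>
</xsl:stylesheet>
XSL;

$page = new XsltPage($xsl);

ob_start();                        // outer buffer, only used to inspect the result
ob_start(array($page, 'render'));  // inner buffer runs the transformation
echo "<root><test>Teststring</test></root>";
ob_end_flush();
$transformed = ob_get_clean();
// $transformed should contain "Test: Teststring"
```

Returning the original buffer on failure is a design choice for this sketch. It keeps a broken stylesheet from producing an empty page, at the cost of showing raw XML.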
Outlook
So what could this be useful for?
In my opinion this could give some web applications a whole new twist. One use for the techniques described would be to move the presentation-related rendering into a step that runs after the application has produced its output. Your application would be built to emit XML into the PHP output buffer, and a separate method, maybe even a separate class, would take that output and transform it into the right format.
The advantages are immediately obvious. Output rendering would become a reusable module and without it the application would still output W3C-compliant XML code (if you did everything right, that is).
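That separation can be sketched in a few lines. In this hypothetical setup the application function only echoes XML, and a renderer is plugged in, or left out, from the outside. All names here (runApplication, productListApp, HtmlListRenderer) are invented for the example, and SimpleXML stands in for a full XSL transformation to keep it short.

```php
<?php
// The application layer: knows nothing about presentation, just emits XML.
function productListApp(): void
{
    echo '<products><product>Tea</product><product>Coffee</product></products>';
}

// A separate, reusable presentation class.
class HtmlListRenderer
{
    public function render(string $xml): string
    {
        $previous = libxml_use_internal_errors(true);
        $doc = simplexml_load_string($xml);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
        if ($doc === false) {
            return $xml;   // not parseable: pass the output through untouched
        }
        $items = '';
        foreach ($doc->product as $product) {
            $items .= '<li>' . htmlspecialchars((string) $product) . '</li>';
        }
        return '<ul>' . $items . '</ul>';
    }
}

// Run the application with an optional renderer and return the final output.
function runApplication(callable $app, ?callable $renderer = null): string
{
    ob_start();            // outer buffer collects the final result
    ob_start($renderer);   // null means no callback: the XML passes through
    $app();
    ob_end_flush();
    return ob_get_clean();
}

$asXml  = runApplication('productListApp');
$asHtml = runApplication('productListApp', array(new HtmlListRenderer(), 'render'));
// $asXml  is the raw XML echoed by productListApp()
// $asHtml is '<ul><li>Tea</li><li>Coffee</li></ul>'
```

The application code is identical in both calls. Only the callback handed to ob_start() changes.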
And again, this is just one way to use the callback function. Together with regular expressions or applications like Tidy, you could ensure that the output of dynamic data is valid. This could be useful for anyone who uses variables to pass HTML content into XSL templates.
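As a sketch of that validation idea, the callback below checks whether the buffered output is well-formed XML before it is sent, and substitutes an error document if it is not. It uses PHP's DOM extension rather than Tidy, which is an assumption made to keep the example light on dependencies. The function names and the fallback markup are made up.

```php
<?php
// Output callback: only lets well-formed XML through.
function ensureWellFormedXml(string $buffer): string
{
    $fallback = '<error>Output was not well-formed XML</error>';
    if (trim($buffer) === '') {
        return $fallback;   // nothing to parse
    }
    $previous = libxml_use_internal_errors(true);  // keep parser warnings quiet
    $doc = new DOMDocument();
    $ok = $doc->loadXML($buffer);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $ok ? $buffer : $fallback;
}

// Hypothetical helper to show the effect as a string.
function checkedOutput(string $text): string
{
    ob_start();                        // outer buffer collects the result
    ob_start('ensureWellFormedXml');   // inner buffer validates on flush
    echo $text;
    ob_end_flush();
    return ob_get_clean();
}

$good = checkedOutput('<a><b/></a>');   // passes through unchanged
$bad  = checkedOutput('<a><b></a>');    // replaced by the fallback document
```

A well-formedness check is weaker than validation against a schema or the kind of repair Tidy performs, but it catches unescaped markup before it reaches an XSL step.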
Comments
by Mike Stenhouse on July 24 2005, 18:08 #
by Pascal Opitz on July 25 2005, 01:28 #
by Mike Stenhouse on July 25 2005, 17:02 #
ob_implicit_flush(true);
Will automatically flush any echo you do in that script.
Be aware that there are lots of well-documented bugs with browsers and flush. You sometimes have to pad the first echo with some blank data to force the browser to display it.
http://uk2.php.net/flush
by sermad on September 16 2005, 05:29 #
The goal is to provide an HTML interface while using a class to talk via the Jabber protocol…
by Toucouleur on February 12 2006, 14:25 #
Thanks for the props btw, but right now my head is spinning with other things and I have no time at all to write anything. But I’ll be back on track soon if I can come up with anything smart enough :)
by Pascal Opitz on June 19 2006, 18:21 #
That is, the method called by the output buffer is the View layer, Model is model (as normal), but the controller part is simply Apache doing what it normally does best. None of this silly single point of access nonsense!
So, for example, a request to ben.com/res/products.php with an Accept header of “text/html” renders the output as HTML! I have even taken this a step further and used it to include images (using the Accept header to select the most appropriate output type): the model data is converted to XML, which is then transformed via XSL to SVG, which is then run through ImageMagick convert.
Apache should be the Controller.
P.S. Love this article! More please!
by Ben Davies on June 19 2006, 11:46 #
Great for things like tables of data: request as HTML, get an HTML file with a table of data, or request a PNG and get a nice PNG graph of the same data. The exact same code is called to collect and process the data for both calls.
Seriously, separating the View layer using the output buffer is really really inspired! Dude, I’m surprised no one else has thought of this, it really does give you a nice clean separation.
Looking forward to more :)
by Ben Davies on June 20 2006, 06:33 #
by Rakshi on September 22 2006, 04:12 #