XHTML Validation with the W3C validator and PHP
by Pascal Opitz on January 4 2009, 12:25
Amongst other changes, I am working on moving this blog over to application/xhtml+xml as its content type. Of course this calls for much stricter validation before content can go live, otherwise users will be confronted with a broken page. The W3C validator and Zend_Http_Client make validation in PHP easy.
People who remember my validation shell script from last year (cough) already know that there is a SOAP-like response format available from the W3C validator. Like other people, I am unhappy that the whole thing is not a proper SOAP endpoint, but merely the same script returning a SOAP envelope when the POST parameter 'output' is set to 'soap12'.
This is very unfortunate. I wasn't able to use Zend_Soap_Client to construct the request, since it wraps the passed parameters into a SOAP envelope as well, which the validator doesn't interpret.
Instead I used Zend_Http_Client to do a POST request, which works neatly, but requires processing the SOAP response as an XML document. Below is an example of a validation controller that validates a URL and handles the SOAP response:
<?php
class Admin_ValidationController extends Zend_Controller_Action
{
    public function indexAction()
    {
        // Fetch the markup of the page that should be validated.
        $url = $this->_request->getParam('url');
        $client = new Zend_Http_Client($url);
        $response = $client->request();
        $fragment = $response->getBody();

        // POST the markup to the validator and ask for the SOAP 1.2 output.
        $client = new Zend_Http_Client('http://validator.w3.org/check');
        $client->setParameterPost('fragment', $fragment);
        $client->setParameterPost('output', 'soap12');
        $validator_response = $client->request('POST');
        $soap_response = $validator_response->getBody();

        // Parse the SOAP envelope as a plain XML document.
        $xml = new DOMDocument();
        @$xml->loadXML($soap_response);
        $xpath = new DOMXPath($xml);
        $xpath->registerNamespace("m", "http://www.w3.org/2005/10/markup-validator");

        $elements = $xpath->query("//m:errorcount");
        if ($elements->length === 0) {
            // No errorcount element: this is not the expected SOAP response,
            // so don't report the page as valid.
            $this->view->message = 'Could not read the validator response for ' . $url . '.';
            return;
        }

        // Collect the error messages if the validator reported any.
        $error_str = '';
        if ((int) $elements->item(0)->nodeValue > 0) {
            $errors = $xpath->query("//m:errors/m:errorlist/m:error/m:message");
            foreach ($errors as $node) {
                $error_str .= $node->nodeValue . "\n";
            }
        }

        if (!empty($error_str)) {
            $this->view->message = $error_str;
        } else {
            $this->view->message = 'Validation of ' . $url . ' passed without errors.';
        }
    }
}
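To try the XML handling without hitting the validator, the parsing part of the controller above can be pulled into a small function and fed a canned response. This is only a sketch: extractValidationErrors() is a hypothetical helper name, and the sample envelope is hand-written and trimmed down to the elements the XPath queries above rely on, so a real validator response will contain more than this.

```php
<?php
// Hypothetical helper: returns the error messages from a validator SOAP response.
function extractValidationErrors($soapResponse)
{
    if ($soapResponse === '') {
        throw new RuntimeException('Validator response is empty.');
    }

    $xml = new DOMDocument();
    // Collect libxml parse errors internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $loaded = $xml->loadXML($soapResponse);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        throw new RuntimeException('Validator response is not well-formed XML.');
    }

    $xpath = new DOMXPath($xml);
    $xpath->registerNamespace('m', 'http://www.w3.org/2005/10/markup-validator');

    $messages = array();
    foreach ($xpath->query('//m:errors/m:errorlist/m:error/m:message') as $node) {
        $messages[] = trim($node->nodeValue);
    }
    return $messages;
}

// Hand-written sample (an assumption, not a captured validator response).
$sample = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Body>
    <m:markupvalidationresponse xmlns:m="http://www.w3.org/2005/10/markup-validator">
      <m:validity>false</m:validity>
      <m:errors>
        <m:errorcount>2</m:errorcount>
        <m:errorlist>
          <m:error><m:message>required attribute "alt" not specified</m:message></m:error>
          <m:error><m:message>end tag for "p" omitted</m:message></m:error>
        </m:errorlist>
      </m:errors>
    </m:markupvalidationresponse>
  </env:Body>
</env:Envelope>
XML;

foreach (extractValidationErrors($sample) as $message) {
    echo $message, "\n";
}
```

Throwing on an unparsable response, rather than returning an empty list, keeps a validator outage from looking like a clean validation result.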
No Zend_Http_Client?
For people who cannot or don't want to use Zend Framework at all (or who need a facility to encode POST parameters as multipart form data), it may be worth having a look at the cURL functions in PHP. They provide another easy interface for HTTP and even FTP requests. A possible snippet could look like this:
$params = array(
    'fragment' => '<html />',
    'output'   => 'soap12',
);
$url = 'http://validator.w3.org/check';

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $params); // passing an array makes cURL send multipart/form-data
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, '');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 30);

// With CURLOPT_RETURNTRANSFER set, curl_exec() returns the response body, or false on failure.
$response_body = curl_exec($ch);
if ($response_body === false) {
    print curl_error($ch);
} else {
    echo $response_body;
}
// Release the handle in both cases.
curl_close($ch);
In this example I didn't include the handling of the SOAP response, but you can easily grab that from the previous example.
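A note on the multipart remark in the cURL snippet: cURL chooses the encoding based on what is handed to CURLOPT_POSTFIELDS. An array is sent as multipart/form-data, while a URL-encoded string is sent as application/x-www-form-urlencoded. The sketch below makes that choice explicit; buildValidatorPostOptions() is a hypothetical helper name, and it needs the cURL extension to be loaded for the CURLOPT_* constants.

```php
<?php
// Hypothetical helper: builds the cURL options for a POST to the validator.
function buildValidatorPostOptions($fragment, $multipart)
{
    $params = array(
        'fragment' => $fragment,
        'output'   => 'soap12',
    );

    return array(
        CURLOPT_URL            => 'http://validator.w3.org/check',
        CURLOPT_POST           => true,
        // Array: multipart/form-data. String: application/x-www-form-urlencoded.
        CURLOPT_POSTFIELDS     => $multipart ? $params : http_build_query($params, '', '&'),
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 30,
    );
}

// Show the URL-encoded body that would be sent for a tiny fragment.
$options = buildValidatorPostOptions('<html />', false);
echo $options[CURLOPT_POSTFIELDS], "\n";
```

The returned array can be applied in one go with curl_setopt_array($ch, $options) before calling curl_exec($ch).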
Happy validating everyone!
Comments
by Gavin McNamee on June 16 2009, 21:21 #
And before you guys ask: yes, I am still serving as text/html for now, but I did change the DOCTYPE to XHTML 1.0 Strict. In the future I am planning to use content negotiation to serve up the right content type.
by Pascal Opitz on January 4 2009, 12:53 #
Hi Guys
How do I make my W3C validator validate a public website? "Private IPs = yes" only allows me to validate local sites. Any help will be appreciated.
by spoko on November 11 2009, 14:17 #