Tristan Louis is a colleague and insightful analyst. Over the weekend, he took a look at the top 20 sites according to Alexa and ran them through the W3C HTML validator to see who is playing by the rules and who still has some catching up to do.
Surprisingly, MSN.com was the only site among the top 20 to pass completely. Amazon had the most page errors, more than 500 of them, along with more than 100 warnings, “showing that disregard for standard compliance does not seem to have an impact on economic performance,” he says in his blog post.
Most of the top 20 sites have adopted the UTF-8 character encoding, which supports multiple languages by default.
While the W3C validator isn’t the last word (or even the first word) on HTML5 accuracy, as we have covered before here, it is an interesting comparative metric.
Louis then went on to examine the code of 11 top Web 2.0 companies to see how they compared. All of them are using UTF-8, and all of them had errors with the validator. Only five of the 11 sites have made the transition to HTML5, with the rest using XHTML or HTML 4. As he says, “It looks like there is still much room for improvement in the world of HTML validation.”
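For readers who want to try a similar spot check on their own pages, here is a minimal sketch of the two page-level facts mentioned above: which document type a page declares and which character encoding it advertises. It is not Louis's method, and the helper names `classify_doctype` and `declared_charset` are made up for illustration. It only inspects the markup with regular expressions, so it is no substitute for running a page through the W3C validator.

```python
import re

# Hypothetical helpers for illustration only; not taken from Louis's post.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+([^>]*)>", re.IGNORECASE)
CHARSET_RE = re.compile(
    r"<meta[^>]+charset\s*=\s*[\"']?\s*([A-Za-z0-9_\-]+)", re.IGNORECASE
)


def classify_doctype(html: str) -> str:
    """Roughly bucket a page by its DOCTYPE declaration."""
    match = DOCTYPE_RE.search(html)
    if match is None:
        return "none"
    # Collapse whitespace and lowercase so comparisons are simple.
    decl = " ".join(match.group(1).split()).lower()
    if decl == "html":
        return "HTML5"  # the short HTML5 doctype: <!DOCTYPE html>
    if "xhtml" in decl:
        return "XHTML"
    if "html 4" in decl:
        return "HTML 4"
    return "other"


def declared_charset(html: str):
    """Return the charset named in a <meta> tag, lowercased, or None."""
    match = CHARSET_RE.search(html)
    return match.group(1).lower() if match else None


if __name__ == "__main__":
    sample = '<!DOCTYPE html><html><head><meta charset="UTF-8"></head></html>'
    print(classify_doctype(sample), declared_charset(sample))
```

The sketch simplifies on purpose: it ignores the HTTP `Content-Type` header, which can also set the encoding, and it files any unusual doctype under "other".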
Note: For Scott Fulton’s take on this issue and further discussion with Louis about HTML5 validation, see his article here.