TechCrunch and Search Engine Land are reporting this morning that Yahoo! will now be indexing Semantic Web and Microformats markup from around the web and will use that information to display more structured search results. Here is the Yahoo! post about the news.
We asked last month how vulnerable Google is in search. Leveraging standards-based structured data may be the most obvious way to improve on the search industry's current best practices. As Tim Berners-Lee said just weeks ago, the time for the semantic web is now.
What Does This Mean?
Here's one example of what that could mean. Today, a web service would have to work very hard, scouring the internet, to discover all the book reviews written on various sites by friends of mine who live in Europe. That would be so hard that probably no one would try it. The suite of technologies Yahoo! is moving to support will make such searches trivial. Once publishers start including things like hReview, FOAF and GeoRSS in their content, Yahoo!, and other sites leveraging Yahoo! search results, will be able to easily ask what it is we want to do with those book reviews. Say hello to a new level of innovation.
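To make that concrete, here is a minimal sketch in Python of what the "book reviews by my friends in Europe" query could look like once the structured data has already been extracted. Everything in it is an assumption for illustration: the `Review` record, the sample people and coordinates, and especially the crude bounding box standing in for "Europe". It is not how Yahoo! does it, only a picture of why the query becomes simple when reviewer, item and location are labeled fields rather than free text.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """Hypothetical record combining fields a crawler might extract."""
    reviewer: str   # from hReview "reviewer"
    item: str       # from hReview "item" (here, a book title)
    rating: float   # from hReview "rating"
    lat: float      # from a GeoRSS point attached to the reviewer
    lon: float


# Very rough bounding box standing in for "Europe" (an assumption, not a standard):
# (min_lat, max_lat, min_lon, max_lon)
EUROPE_BOX = (35.0, 71.0, -10.0, 40.0)


def in_box(lat, lon, box):
    """Return True if the point lies inside the latitude/longitude box."""
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


def friends_reviews_in_europe(reviews, friends):
    """Keep reviews written by someone in `friends` and located in the box."""
    return [
        r for r in reviews
        if r.reviewer in friends and in_box(r.lat, r.lon, EUROPE_BOX)
    ]


# Made-up sample data: the friend list plays the role of FOAF "knows" links.
my_friends = {"alice", "bob"}
all_reviews = [
    Review("alice", "Weaving the Web", 5.0, 52.52, 13.40),    # Berlin, friend
    Review("bob", "Weaving the Web", 3.0, 41.88, -87.63),     # Chicago, friend
    Review("carol", "Weaving the Web", 4.0, 48.86, 2.35),     # Paris, not a friend
]

matches = friends_reviews_in_europe(all_reviews, my_friends)
print([r.reviewer for r in matches])
```

The hard part today is producing records like these from unlabeled pages. Once publishers label the fields themselves, the filtering step is a few lines.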
This has been really geeky stuff for a long time, with little market traction and a whole lot of promises from academic research and outlying innovators. That will now change.
The basic idea behind Semantic Web technology is that by signaling what kind of content you are publishing on an item-by-item or field-by-field basis, publishers can help make the meaning of their text readable by machines. If machines are able to determine the meaning of the content on a page, then our human brains don't have to waste time determining, for example, which search results go beyond containing our keywords and actually mean what we are looking for.
Publishers will now be able to clearly designate content on a page as related to other specific content, as business card information, as a calendar event, as a review, or as one of many other types of content. It will make Yahoo! a lot smarter and should shake up the world of Search Engine Optimization and web publishing, a lot.
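As a rough illustration of what "machine readable" means here, the sketch below uses Python's standard-library `html.parser` to pull business card fields out of a page fragment marked up with hCard class names (`vcard`, `fn`, `org`, `locality`). The sample HTML, the person in it, and the `HCardParser` and `extract_hcards` names are all made up for this example, and the parser handles only a small subset of hCard. A real indexer would be far more thorough.

```python
from html.parser import HTMLParser

# Hypothetical page fragment. The class names come from the hCard microformat;
# the person and values are invented for this example.
SAMPLE_HTML = """
<div class="vcard">
  <span class="fn">Jane Example</span>,
  <span class="org">Example Books</span>,
  <span class="adr"><span class="locality">Lisbon</span></span>
</div>
"""

# Only a small subset of hCard properties, to keep the sketch short.
HCARD_FIELDS = {"fn", "org", "locality"}

# Elements with no closing tag; skipped so the open-element stack stays balanced.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class HCardParser(HTMLParser):
    """Collects text found inside elements whose class is a known hCard field."""

    def __init__(self):
        super().__init__()
        self.cards = []    # one dict per vcard element found
        self._stack = []   # per open element: "vcard", a field name, or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if "vcard" in classes:
            self.cards.append({})
            self._stack.append("vcard")
            return
        field = next((c for c in classes if c in HCARD_FIELDS), None)
        self._stack.append(field)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Record text only when directly inside a field element within a vcard.
        if "vcard" in self._stack and self._stack[-1] in HCARD_FIELDS:
            card = self.cards[-1]
            name = self._stack[-1]
            card[name] = card.get(name, "") + data.strip()


def extract_hcards(html):
    """Return a list of dicts, one per vcard found in the HTML string."""
    parser = HCardParser()
    parser.feed(html)
    return parser.cards


print(extract_hcards(SAMPLE_HTML))
```

The point is that the publisher's class names, not guesswork about the surrounding text, tell the program which string is the name, which is the organization and which is the city.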
Who Does the Markup?
Many observers of the Semantic Web, including us at times, have argued that it's unrealistic to expect web publishers to mark up their own content, and that a more realistic path to market for technologies based on semantics is to build applications that can parse the semantics out of other people's content from outside.
In my interview with Mark Zuckerberg last week, for example, the Facebook CEO expressed a lack of interest in participating in the Semantic Web. I didn't publish it in the interview, but he indicated that such a move, if it were going to happen at all, would be up to a third-party site organizing information via the Facebook Platform. He will probably change his tune now, as adding hCard support to Facebook public profiles will be a no-brainer. Other publishers will be faced with similar questions.
Semantic web markup will quickly become standard practice, though, for all CMS/publishing systems, and we'll wonder what we ever did without it or why it seemed so hard.
Google Will Soon Follow
This move by Yahoo! will likely be followed up by Google; it's just too much opportunity for any search engine to pass up. Semantic markup is like a content-level site map, and site maps are something all the search engines have already agreed on a standard for. Semantic web technology is next. There will be big job opportunities, more than there are for SEO in the short term, for people who can help publishers implement Semantic Web markup retroactively and into the future.
The Semantic Web was one of a handful of topics that we identified as key themes for the coming year in our RWW Toolkit for 2008. Check that toolkit out for resources you can use to follow this important topic as it unfolds.