My first post for ReadWriteWeb (nearly a year ago) started with the premise that search was “game over”, that Google had won and the only opportunity left was (re)search – i.e. what one does after the basic search. Unfortunately, none of the search start-ups since then has made a dent in Google’s relentless march towards search market dominance. In this article, we outline 11 search trends that may change that.
The proposition that launched countless search start-ups was: “If we can get just 1% of the search market, we will have a very valuable business”. That may be true, but getting 1% has proved elusive. It has been an all-or-nothing game.
That may be about to change.
It is possible that Google will not be beaten by one big competitor, but instead pecked at by thousands of tiny start-ups using a new outsourced infrastructure.
But before getting to that punchline, here is my 11-point recap of the search market:
1. Disambiguation is (still) not enough motivation to switch. All those learned PhDs with backgrounds in natural language search and AI, explaining that words like “paris” and “apple” have multiple meanings that Google cannot parse from a single search, massively miss the point. The average user figured that out long ago and either enters multiple words or refines the query based on the first set of results. Natural language search – which is complex to code and expensive to process – is a classic “hammer to crack a nut” solution.
2. Webmaster push-back and basic economics will accelerate the trend towards an outsourced crawler market. Webmasters won’t accept a proliferation of crawlers, as some of them may be malicious and all of them impact performance to some degree. Google, Yahoo and Microsoft (GYM) will always be accepted because they drive enough SEO traffic, but marginal crawlers will struggle. Basic economics means that only a very small number of players will be able to afford the giant server farms needed to index the whole Web. The YM parts of GYM (as well as Amazon) will increasingly offer their infrastructure to anybody who can build value on top.
3. Yahoo SearchMonkey may have arisen from desperation, but we may also be witnessing a “Linus moment”. SearchMonkey is the most well-defined entry into the outsourced crawler market. It comes from Yahoo’s recognition that it is too late to beat Google in a head-to-head battle, so it could be dismissed as a sign of desperation. However, I prefer to see it as a “Linus moment”: that point in time when Linus Torvalds simply said “here is what I have done so far, anybody who can take it to the next step is welcome to try”. To be truly disruptive, Yahoo may need to open this up even more than it has to date.
4. There will be many more attempts to monetize Wikipedia. Well-funded search ventures such as Powerset have retreated to the much narrower goal of searching Wikipedia. Freebase also uses Wikipedia as its core data. Walking around the RPI Web Science Research Initiative, I could see many interesting R&D experiments coming out of academia, all of which used Wikipedia as a base. Wikipedia has just enough structure and normalization to be useful. Above all, the History feature makes “data provenance” possible, and that is critical for trust.
5. Core search is still getting funded. This is not what one would expect in what is by any definition a consolidated market with one mighty big gorilla sitting on top. Look at Blekko getting $2m without even a prototype to show the world. Are the investors nuts? Possibly, but they include some pretty smart guys like Marc Andreessen, and the founder Rich Skrenta is clearly a smart guy (his blog is a good read). Or look at Cuill, which got $25m as recently as April. Maybe they are idealists tilting at windmills. Maybe they know something that the rest of us don’t. Only time will tell. These new entrants will eschew hype, which they know does nothing to drive adoption.
6. Image search is another “hammer to crack a nut”. Searching images, video and audio is one of those “non-trivial” computer science projects that great engineers love to tackle. However, great investors should steer clear. It is hard to code and incredibly expensive to process. The competition is tagging (see next point), which is classic “just good enough and improving all the time at virtually no cost” – and that is impossible to beat.
7. Tagging is quietly but massively disruptive. The fact that thousands of webmasters and bloggers tag their content so that it can be found by Google is Google’s secret weapon. But it could get turned against them. A small incentive to be found by other search engines will change tagging behavior. This is likely to play out in lots of vertical niches, where a small change in tagging behavior can make a huge difference in findability, and that can make a big difference to both buyers and sellers. Whether people use RDF or Microformats or some other de facto vertical standard will continue to be the subject of much debate, but the format itself is not the issue. The human drive to tag (to order one’s world) is deep and strong, and it has financial motivations as well.
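To see why tagged content is so cheap for a niche engine to exploit, here is a minimal sketch of a vertical crawler harvesting microformat-style class attributes instead of parsing free text. The property names follow the hCard vocabulary, but the harvester class, the sample HTML and the markup itself are my own invention for illustration – not anything a particular engine actually runs:

```python
from html.parser import HTMLParser

class MicroformatHarvester(HTMLParser):
    """Collect text inside elements whose class matches a known vocabulary."""

    # hCard-style property names (a hypothetical vertical could use its own)
    VOCAB = {"fn", "locality", "category"}

    def __init__(self):
        super().__init__()
        self._capture = None   # property name currently being captured
        self.properties = {}   # property name -> list of harvested values

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "").split()
        hits = self.VOCAB.intersection(classes)
        if hits:
            self._capture = hits.pop()

    def handle_data(self, data):
        if self._capture:
            self.properties.setdefault(self._capture, []).append(data.strip())
            self._capture = None

# Invented sample page fragment, tagged the way a webmaster chasing
# findability might tag it.
html = ('<div class="vcard"><span class="fn">Rich Skrenta</span> lives in '
        '<span class="locality">San Francisco</span></div>')
h = MicroformatHarvester()
h.feed(html)
print(h.properties)  # {'fn': ['Rich Skrenta'], 'locality': ['San Francisco']}
```

No AI, no natural language parsing – the webmaster has already done the disambiguation, which is exactly why a small tagging incentive goes such a long way.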
8. Whitelists are a good way to kill spam. Spam is the big problem for search as well as email, and whitelists work well for both. In search this is done by a site that uses something like Google Custom Search Engine (or SearchMonkey) to define which sites to search within a defined domain. Even if that means defining 1,000 sites and adding new ones every day, that is well within the range that a single human curator can manage within a single market domain. The human curator deletes any spam sites manually.
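As a sketch of how little machinery this takes: the curated domain list and the result URLs below are invented, but the whole trick is a membership test per result against the curator’s list.

```python
from urllib.parse import urlparse

# Hypothetical curator-maintained whitelist for one vertical niche.
WHITELIST = {"wikipedia.org", "readwriteweb.com", "nytimes.com"}

def is_whitelisted(url, whitelist=WHITELIST):
    """Accept a result only if its host is a whitelisted domain or a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in whitelist)

# Invented raw results; the spam entry never reaches the searcher.
results = [
    "http://en.wikipedia.org/wiki/Search_engine",
    "http://spam-pills.example/cheap",
    "http://www.readwriteweb.com/archives/search.php",
]
clean = [r for r in results if is_whitelisted(r)]
print(clean)
```

The spam fight becomes an editorial task (adding and deleting domains) rather than an algorithmic arms race, which is what makes a single human curator viable.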
9. P2P search could still be a long-term disrupter and Microsoft’s route back to relevance. The only way to do search without putting all the Web’s pages into one server farm is via P2P. I have written about Faroo’s attempt here. It relies on .Net, and this may be Microsoft’s card to play – but only if Vista gets real traction. This is a real long shot, but an intriguing one.
10. There is tons of great data inside relational databases that is quite easy to search. It is the HTML layer that is getting in the way. As more sites learn how to expose their structured, relational databases as Web Services APIs, a lot more data will be available that does not rely on word search on HTML pages.
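To make the contrast concrete, here is a minimal sketch – the table schema, rows and endpoint function are all invented for illustration – of a site answering a structured query straight from its relational store and returning JSON, the way a web-service API would, instead of leaving a crawler to word-search rendered HTML:

```python
import json
import sqlite3

# Stand-in for the site's relational database (schema and data invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, category TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("MacBook", "laptop", 1299.0),
     ("ThinkPad", "laptop", 999.0),
     ("iPod", "audio", 249.0)],
)

def api_search(category, max_price):
    """What a hypothetical web-service endpoint might return for a structured query."""
    rows = conn.execute(
        "SELECT name, price FROM products WHERE category = ? AND price <= ?",
        (category, max_price),
    ).fetchall()
    return json.dumps([{"name": n, "price": p} for n, p in rows])

print(api_search("laptop", 1000))  # [{"name": "ThinkPad", "price": 999.0}]
```

A query like “laptops under $1,000” is a one-line WHERE clause against the database, but nearly hopeless as a word search over the HTML pages rendered from that same data.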
11. It’s the Adwords, stupid! All the search wizardry doesn’t matter a hoot if the monetization is not done right. There is plenty of motivation out there. Sellers want cheaper search words to buy. Publishers want a bigger piece of the pie. Buyers/searchers may even want cash back (we will see if Microsoft’s crude tactic, lambasted in the Blogosphere, makes it in the real world).
Conclusion
Most of these trends point in the direction of search as infrastructure feeding thousands of innovators in niche markets – a long tail approach, in other words. Google will play in this infrastructure game – they already do with Google Custom Search – but it is vendors such as Yahoo, Microsoft and Amazon, with equally deep pockets and much more to lose from total Google dominance, who will be the disrupting innovators in this next phase of the search market.
Image credit: davemc500hats