Google Custom Search: Setting The Bar For Vertical Search Engines

Google already dominates the web search market, holding roughly 55% to 65% of it depending on who you ask.
The company’s flagship product has been responsible for its phenomenal growth, and everyone knows
that Google made its fortune by tying its genius search algorithm to advertising. It is perhaps less
well known, however, that the web giant has opened its search engine for use on any web site, by any service. Dubbed Google Custom Search Engine,
or CSE, the product exposes the API behind the world’s most powerful search engine. Why is Google offering this API?
How can it be used? And what is the connection to vertical search? We explore the what, how, and why of Google CSE in this post.

The Basics of Google Custom Search

You can think of Google Custom Search as a filter over the main Google search engine.
This is a bit of a simplification, but it is a good way to initially wrap your head around the concept. By creating
a filter, CSE allows its users to restrict search results to particular sites that match
URL patterns or keywords.
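To make the "filter" idea concrete, here is a minimal Python sketch of restricting a list of result URLs to those that match a set of URL patterns. It only illustrates the concept; it is not how Google implements CSE. The function names, the pattern syntax (shell-style wildcards) and the example domains are all assumptions made for this illustration.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse


def matches_engine(url, patterns):
    """Return True if the URL's host and path match any include pattern.

    Patterns are shell-style wildcards such as "musicmag.example/reviews/*"
    (an assumed syntax for this sketch, not Google's own).
    """
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # treat www.site and site as the same host
    target = host + parsed.path
    return any(fnmatch(target, pattern) for pattern in patterns)


def filter_results(urls, patterns):
    """Keep only the URLs that fall inside the custom engine's site list."""
    return [url for url in urls if matches_engine(url, patterns)]
```

Given a general result list, `filter_results` would drop an encyclopedia or retail page and keep only pages under the listed review sites, which is roughly the effect a custom engine has on ordinary search results.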

The resulting engines can be searched via API or a search box that users can place on their
web sites. The monetization strategy for Google is straightforward, and not surprisingly it is based on ads. Unless used for academic purposes, custom search engine results display contextual ads just like the regular search engine results do. Creators of custom search engines can earn a cut of the ad revenue by linking their engine to an AdSense account.
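As a rough illustration of searching an engine via API, the sketch below builds a request URL for Google's present-day Custom Search JSON API and pulls the links out of a response. The endpoint and the `key`, `cx`, `q` and `num` parameters belong to that current API, which may differ from the interface available when this post was written. The helper names are hypothetical, and the key and engine ID are placeholders.

```python
from urllib.parse import urlencode

# Endpoint of the current Custom Search JSON API.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, engine_id, query, num=10):
    """Build a request URL for one page of custom search results."""
    params = {"key": api_key, "cx": engine_id, "q": query, "num": num}
    return API_ENDPOINT + "?" + urlencode(params)


def extract_links(response):
    """Return (title, link) pairs from a decoded JSON response.

    The response carries its results in an "items" list; the key is
    absent when nothing matched, so default to an empty list.
    """
    return [(item["title"], item["link"]) for item in response.get("items", [])]
```

Fetching the URL (for example with `urllib.request.urlopen`) and decoding the body with `json.loads` would yield the dictionary that `extract_links` expects. The network call is left out here so the sketch runs without credentials.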

For example, you can create a custom search engine that only
searches one site. Many sites have done that instead of building
their own search solution. You can also restrict the search
to a specific list of sites, in essence creating a vertical search engine, which we will discuss
at length below.

Custom engines can be created and managed using a simple visual interface or, for
more advanced users, an XML file. The UI version is essentially a wizard that prompts the user
to fill in basic information about the search engine, list the sites for the engine to index,
define the look and feel of the search and results pages, and configure other advanced options. You can make the search engine private or have it listed
in Google’s custom search directory. Interestingly, you can invite other people to collaborate with you on
creating your search engine. The process of creating an engine takes just a few minutes, and when you’re done you get a page that looks a lot like Google itself with just a search box.
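For a long site list, the XML route can be scripted. The sketch below generates a small XML document that maps URL patterns to an engine label. The element and attribute names (`Annotations`, `Annotation`, `about`, `Label`, `name`) are modeled on Google's annotations file format, but treat them as assumptions and check them against the current documentation before uploading anything. The label and the example domains are made up.

```python
import xml.etree.ElementTree as ET


def build_annotations(engine_label, site_patterns):
    """Return an XML string that tags each URL pattern with the engine's label.

    Element and attribute names are assumptions modeled on the CSE
    annotations format; verify them against Google's documentation.
    """
    root = ET.Element("Annotations")
    for pattern in site_patterns:
        annotation = ET.SubElement(root, "Annotation", about=pattern)
        ET.SubElement(annotation, "Label", name=engine_label)
    return ET.tostring(root, encoding="unicode")


# Hypothetical label and site list for a music review engine.
xml_text = build_annotations(
    "_cse_musicreviews",
    ["musicmag.example/reviews/*", "blog.reviews.example/*"],
)
```

Generating the file from a plain list keeps the site selection in one place, so adding a new review site becomes a one-line change.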

Custom Search Engine In Action

For this example, we created a search engine for music reviews by telling Google Custom Search to index only sites that feature music reviews.
In a way, this is like teaching Google semantics, because the sites that we hand-pick
contain mostly music reviews. We picked two major types of
sites: music magazines and music review blogs.

We then searched for a recent album by Josh Ritter – “Historical Conquests of Josh Ritter.”
The results from CSE link only to album review pages.

If we were to search Google directly with the exact same phrase, we would not
get just reviews. The matches there would lead to Wikipedia, the artist’s home page, and
album links at various retail sites, all mixed with the review pages. Interestingly, when we added the word ‘review’ to the search, the results from Google were similar to the ones returned by our custom search engine.

Still, the results returned by the specialized engine were more precise and targeted.
The key to good results is a good selection of sites. The more high-quality music review
sites we add to this engine, the better it will perform. It does not need to
be a large number of sites, however. Even our initial set of 20 high-quality sites returned good results for a lot of recent
music albums.

Powering Up Vertical Search

Google Custom Search Engine is a platform for building vertical search engines.
If the engine contained links to electronics sites, would it come close
to Retrevo? Imagine keying every active blog on the Internet into a custom search engine (there is an API, so the process
does not need to be manual). Could that yield a search engine that compares to Technorati or Google’s own Blog Search? The answer is: very likely.
Consider an example of a startup that is doing just that.

Colorado-based Lijit allows people to search the web
through the experiences of other people. One of Lijit’s core ideas
is that each of us is an expert in a particular area. For example, Brad Feld is an expert in venture capital
and investment. When you are looking for quality information about venture capital, it makes sense to ask Brad. Lijit’s
search engine does exactly that by searching through the various pieces of Brad Feld’s online existence, including his blog, del.icio.us
bookmarks, and Facebook profile.

Behind the scenes, Lijit actually creates an instance of Google Custom Search Engine to do the search.
This engine is configured with links to blogs, social network profiles, photos, videos and everything else
that defines a person as a vertical. By leveraging Google’s infrastructure, Lijit has given itself a huge jump start.
If the team had to build its own crawler, that work would likely consume all of its technical effort. Instead,
the team built on top of Google’s offering and focused on presenting the best way to
search through online personal experiences.

Vertical Search Is Reduced To UI

Lijit’s example naturally leads us to this question: What impact does Google CSE
have on the vertical search space? Does it make it a commodity? Not entirely, but it does
commoditize the infrastructure. There is no longer any need to build a custom crawler. Crawling and indexing web sites and other online information is a huge problem that requires a lot of resources, and even if you have them, there is a very real
chance of not getting it right. Look at Microsoft: they still can’t crack it.

So if the infrastructure problem is solved, the innovation is pushed up to the UI level.
How the results are presented is what can make a difference. For example, Retrevo further
clusters the results on its vertical search engine into categories, distinguishing reviews, product manuals, and so on.
It adds semantic understanding not only to the filtering of the underlying sites, but also
to the presentation of the results. Given that filtering can be done using Google CSE, the
innovation is basically in the presentation of the results.
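A toy version of this presentation layer might look like the sketch below, which groups already-filtered results into categories by looking for keywords in the title and snippet. This is not Retrevo's method, which we have not seen. The categories, keyword lists and function names are assumptions chosen only to show where a vertical engine could add value on top of CSE.

```python
from collections import defaultdict

# Hypothetical categories and trigger keywords; a real engine would
# presumably use far richer signals than substring matches.
CATEGORY_KEYWORDS = {
    "reviews": ("review", "rating"),
    "manuals": ("manual", "user guide"),
}


def categorize(result):
    """Return the first category whose keywords appear in the result text."""
    text = (result["title"] + " " + result["snippet"]).lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "other"


def group_results(results):
    """Group result links by category for display in separate sections."""
    groups = defaultdict(list)
    for result in results:
        groups[categorize(result)].append(result["link"])
    return dict(groups)
```

The search itself is untouched here. All of the differentiation happens after the results come back, which is the sense in which vertical search is reduced to UI.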

Conclusion

Google CSE is an interesting piece of web infrastructure. On one hand, it simply
opens up a different use for Google’s core technology. On the other hand, though, it commoditizes
the backend of any vertical search engine. However, we think that it’s more of a blessing than a problem
for the vertical search players, as they can now focus on their core specialty – presentation
of the results in the given domain.

Please share with us interesting examples of Google CSE that you’ve seen online and tell us your thoughts about what Google CSE means for the vertical search space.
