Personalized Search Primer – And Google’s Approach

Guest article by Greg Linden, founder of personalized news service Findory and author of Geeking with Greg.

Google has received much attention, not all of it positive, for its efforts to personalize search.

In this article, I will briefly describe what personalized search is, why Google and other search engines are pursuing it, the approach Google is taking, and some other ways it could be done.

What is personalized search?

Personalized search is showing different search results to different people. Personalized search uses each searcher’s past behavior to try to understand intent and what is relevant to that searcher.

If I search for [java] and you search for [java], and we see different results because of what we did in the past, that is personalized search. The search results are individualized, different for each of us.

It is true that a search for [java] is ambiguous. What do you want when you search for [java]? Are you a programmer looking for the Java documentation from Sun? Are you looking for a summary of the Java programming language from Wikipedia? Are you someone who wants the Java download so you can run a Java applet? Or maybe you are planning a trip to Indonesia?

Your past behavior may help the search engine figure out what you want. If you previously searched about Indonesia, that tells it one thing. If you searched for [java sdk] two days ago, that indicates something else.

Personalized search shows different results to different people based on their past behavior. Personalized search tries to disambiguate intent by using information not only about what you are doing now, but also what you did in the past.

Why do personalized search?

Search engines are trying to make search results more useful. They want to help people find the information they want faster.

Search engines help searchers find what they need faster by trying to put the most relevant results for each search at the top of the page, a process known as relevance ranking.

However, different people are interested in different results. What a geek likes is quite a bit different than what that geek’s mother considers relevant.

Right now, the geek and the geek’s mother see the same results when they search on most search engines. The relevance rank is generic: it tries to order the results by what is most useful to the average user, ignoring individual needs.

The generic relevance rank continues to improve, but each improvement seems to be getting harder and harder to find. At some point, the only way to get further improvements, to help people find what they need faster, is to individualize the relevance rank.


Google gets serious about personalization; Pic by christophercarfi

Even early steps toward personalized search could make a substantial difference. Search engines currently treat each search as independent, so what you just searched for does not matter in terms of what you see on your next search.

But someone who searches for [indonesia] and then [java] likely has different interests than someone who searches for [applet] and then [java]. What you just wanted is often helpful in determining what you want now.

While concerns about the privacy implications of storing and using past behavior are real, personalized search likely is inevitable. Different people have different perceptions of relevance. To help searchers find what they need, to deal with differing intent, different searchers will need to see different search results.

Google Personalized Search

Google Personalized Search uses technology acquired in 2003 from a small startup named Kaltix. A 2002 paper, “Scaling Personalized Web Search”, describes the technique invented by Kaltix.

The basic idea is to create many different relevance ranks, each tailored to the interests of a group of people. When executing a search, Google uses the shards of the index organized for the tastes of people like me to rank my results.
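
To make the idea concrete, here is a minimal Python sketch of blending several topic-specific relevance ranks according to a searcher's interests. It is an illustration only, not Google's or Kaltix's actual implementation. The page names, topic labels, scores, and the personalized_rank function are all made up for the example.

```python
# Hypothetical precomputed scores: one relevance score per page per topic.
# A real engine would compute these offline for its whole index.
TOPIC_SCORES = {
    "sun-java-docs": {"computers": 0.9, "travel": 0.1},
    "java-island-guide": {"computers": 0.1, "travel": 0.8},
    "wikipedia-java": {"computers": 0.5, "travel": 0.5},
}


def personalized_rank(pages, profile):
    """Order pages by a profile-weighted blend of their per-topic scores."""
    def score(page):
        topic_scores = TOPIC_SCORES[page]
        return sum(weight * topic_scores.get(topic, 0.0)
                   for topic, weight in profile.items())
    return sorted(pages, key=score, reverse=True)


# Two made-up searchers issuing the same query, [java].
programmer = {"computers": 0.9, "travel": 0.1}
traveler = {"computers": 0.1, "travel": 0.9}

print(personalized_rank(list(TOPIC_SCORES), programmer))
print(personalized_rank(list(TOPIC_SCORES), traveler))
```

The same three pages come back for both searchers, but in a different order. Only the weights in the profile change; the per-topic scores are shared by everyone, which is what makes this kind of approach cheap to serve.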

How it works was easily visible in an early version of Google Personalized Search. Users checked off boxes corresponding to interests (e.g. “computers” and “architecture”) and then Google would bias all future searches toward those interests. An early version of Google Custom Search (previously known as “site-flavored search”) also was based on Kaltix technology and allowed people to put a search box on their site that would be biased towards a specific category (e.g. “Computers/Internet”).

The current version of Google Personalized Search learns from your search queries. Searchers do not have to do anything explicitly to use it; it is all implicit. The current Google Personalized Search likely is using the same Kaltix technology, building a high-level profile of you, then biasing all of your search results based on your long-term behavior.
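
As a rough sketch of what learning a high-level profile implicitly from search queries might look like, consider the Python below. How Google actually builds its profiles is not public; the keyword-to-topic table and the build_profile function are hypothetical stand-ins for whatever classifier a real system would use.

```python
from collections import Counter

# Hypothetical keyword-to-interest table; a real system would learn this.
KEYWORD_TOPICS = {
    "sdk": "computers", "applet": "computers", "compiler": "computers",
    "flights": "travel", "hotel": "travel", "indonesia": "travel",
}


def build_profile(query_log):
    """Turn a list of past queries into normalized interest weights."""
    counts = Counter()
    for query in query_log:
        for word in query.lower().split():
            topic = KEYWORD_TOPICS.get(word)
            if topic:
                counts[topic] += 1
    total = sum(counts.values())
    if total == 0:
        return {}  # no recognizable history, so no personalization
    return {topic: n / total for topic, n in counts.items()}


profile = build_profile(["java sdk", "applet tutorial", "bali hotel", "c compiler"])
print(profile)
```

The searcher never states an interest. The weights simply accumulate from what was typed, and a searcher with no recognizable history gets an empty profile and therefore the generic results.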

Other ways to do personalized search

Google’s personalized search is not the only way to do personalized search. Google uses high-level profiles learned implicitly from your long-term search history.

Rather than use a high-level profile (e.g. an interest in “computers”), personalized search could be fine-grained – based on your specific actions. For example, specific results you have seen in the past could be featured (known as re-finding). Results related to results you have clicked on in the past could be featured. Results you have seen before on other queries could be hidden. Results for similar or related searches to your past searches could influence your current results.
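
Two of these fine-grained ideas, featuring results you clicked before and hiding results you were shown before but skipped, can be sketched in a few lines of Python. This is one possible combination under simple assumptions, not a description of any search engine's behavior; the function name and page ids are invented for the example.

```python
def rerank_with_history(results, clicked, seen_not_clicked):
    """Apply re-finding and hiding to a ranked list of page ids.

    results: page ids in generic relevance order.
    clicked: set of page ids this searcher clicked in the past.
    seen_not_clicked: set of page ids shown before but never clicked.
    """
    # Re-finding: previously clicked results move to the top.
    refound = [r for r in results if r in clicked]
    # Hiding: drop results the searcher already saw and passed over.
    fresh = [r for r in results
             if r not in clicked and r not in seen_not_clicked]
    return refound + fresh


generic = ["sun-java-docs", "wikipedia-java", "java-island-guide", "java-coffee-shop"]
print(rerank_with_history(
    generic,
    clicked={"java-island-guide"},
    seen_not_clicked={"java-coffee-shop"},
))
```

Unlike a high-level profile, this works on individual results, so one past click changes what that searcher sees the next time the same page comes up.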

Rather than learning implicitly, personalized search could be explicit. For example, you could specify categories of interest, much like the old personalized search. Or, you could explicitly rate web pages. Or, you could explicitly share search results or favorites with friends (like on Yahoo MyWeb).

Rather than using long-term history, personalized search could focus on what you are doing right now. For example, if you refine a search, starting with [indonesia], then [java], the first could influence the second without keeping any long-term summary of your overall interests.
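
A minimal sketch of this session-only approach, in Python, might look like the following. The 30-minute window, the keyword sets attached to each page, and both function names are assumptions made for illustration.

```python
SESSION_WINDOW_SECONDS = 30 * 60  # assumed cutoff for "right now"


def session_context(events, now):
    """Collect words from queries issued within the session window.

    events: list of (timestamp_in_seconds, query) pairs.
    """
    words = set()
    for timestamp, query in events:
        if now - timestamp <= SESSION_WINDOW_SECONDS:
            words.update(query.lower().split())
    return words


def rerank_for_session(results, context):
    """Move pages that share words with the session context upward.

    results: list of (page_id, keyword_set) in generic relevance order.
    sorted() is stable, so pages with equal overlap keep their generic order.
    """
    ranked = sorted(results, key=lambda r: -len(r[1] & context))
    return [page_id for page_id, _ in ranked]


results = [
    ("java-language-docs", {"applet", "programming"}),
    ("java-island-guide", {"indonesia", "travel"}),
]
events = [(1000, "indonesia"), (1060, "java")]

print(rerank_for_session(results, session_context(events, now=1100)))
print(rerank_for_session(results, session_context(events, now=100000)))
```

The first call happens a minute or two after the [indonesia] search, so the travel page moves up. The second call happens long after the session ended, the context is empty, and the generic order returns. Nothing about the searcher is kept beyond the window.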

There may be some disadvantages to the approach Google is using for personalized search. For example, using long-term, high-level profiles means that the search engine can shift results slightly toward general preferences, but it cannot make immediate changes based on what a searcher is doing right now. In particular, it cannot help much when searchers are on a mission, doing a series of related searches, but not finding what they want.

Conclusion

Personalized search allows a search engine to show different people different search results based on their past behavior. It disambiguates intent using information from the past, allowing the engine to cater to differing perceptions of relevance.

Personalized search is an early step from generic search tools towards individualized assistants. Personalized search is part of a shift from information retrieval to information discovery.

One day perhaps, we will have a search engine that not only helps us find the information we seek, but also helps us discover information we could not have found on our own. One day perhaps, a search engine will not only help us find information, but also help us process and understand it.

Greg Linden is the founder of Findory and author of the blog Geeking with Greg.
