New Twitter Gets New Search

As part of its recent UI redesign, Twitter has also made some significant changes to its backend, and today Michael Busch updated the Twitter Engineering Blog with some details about how Twitter has revised search.

Initially, Twitter’s real-time search engine was based on the technology of Summize, a company Twitter acquired in 2008. But since then, Twitter has seen phenomenal growth: over 1,000 Tweets per second and 12,000 queries per second, which adds up to over 1 billion queries per day. The Twitter Engineering Team has been looking for alternatives, as “scaling the old MySQL-based system had become increasingly challenging.”

So Twitter has moved to a new search architecture, adopting the open source search library Lucene.

Despite Lucene’s strengths, it has shortcomings when it comes to real-time search. So Twitter has rewritten parts of Lucene’s internals while still supporting Lucene’s APIs. These changes include:

  • significantly improved garbage collection performance
  • lock-free data structures and algorithms
  • posting lists that are traversable in reverse order
  • efficient early query termination
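
To make the last two items concrete, here is a minimal sketch in Python of how a reverse-traversable posting list and early query termination fit together. It is only an illustration under simple assumptions: the class ToyRealtimeIndex and its methods are hypothetical names, not Twitter's code or Lucene's API, and it ignores concurrency, ranking, and memory layout. The idea is that document IDs are appended in arrival order, so walking a term's posting list from the tail yields the newest Tweets first, and the search can stop as soon as it has enough hits.

```python
from collections import defaultdict
from typing import Dict, Iterator, List


class ToyRealtimeIndex:
    """Hypothetical in-memory inverted index (not Twitter's or Lucene's code)."""

    def __init__(self) -> None:
        # term -> document IDs, appended in arrival order (oldest first)
        self._postings: Dict[str, List[int]] = defaultdict(list)
        self._next_doc_id = 0

    def add(self, text: str) -> int:
        """Index one document and return its ID."""
        doc_id = self._next_doc_id
        self._next_doc_id += 1
        # Naive whitespace tokenization; each term is recorded once per document.
        for term in set(text.lower().split()):
            self._postings[term].append(doc_id)
        return doc_id

    def newest_first(self, term: str) -> Iterator[int]:
        """Walk a posting list from its tail, so recent documents come first."""
        return reversed(self._postings.get(term.lower(), []))

    def search(self, term: str, limit: int) -> List[int]:
        """Return up to `limit` of the most recent matching document IDs."""
        hits: List[int] = []
        for doc_id in self.newest_first(term):
            if len(hits) >= limit:
                break  # early termination: older postings are never visited
            hits.append(doc_id)
        return hits


index = ToyRealtimeIndex()
index.add("new twitter search")           # doc 0
index.add("lucene search")                # doc 1
index.add("hello world")                  # doc 2
index.add("real-time search at twitter")  # doc 3
print(index.search("search", limit=2))    # prints [3, 1]
```

Because results are wanted newest first, the cost of a query in this sketch depends on the number of hits requested rather than on the full length of the posting list.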

This new search architecture is faster and more scalable, and uses only about 5% of the available backend resources, moving towards the engineering team’s goal of building search “to support at least an order of magnitude more load.”

For more information on how Twitter handles other big data challenges, check out the slides from Twitter engineer Kevin Weil’s talk at the Web 2.0 Expo last month:

Analyzing Big Data at Twitter (Web 2.0 Expo NYC Sep 2010)
