Twitter gave its search function a major upgrade, indexing the hundreds of billions of public tweets sent since 2006, the company announced on Tuesday.
Currently, Twitter’s search feature is fairly perfunctory, returning what it calls “real-time” tweets that go back only about a week. The expanded search feature, rolling out to users in the next couple of days, will access an index of “roughly half a trillion documents,” Twitter engineer Yi Zhuang wrote in a blog post. The current cache of every tweet ever tweeted “is more than 100 times larger than our real-time index and grows by several billion Tweets a week.”
Beyond just helping you dig up that tweet about that one show that one time, the expanded archive allows access to all manner of documentation. As the Twitter post points out:
This new infrastructure enables many use cases, providing comprehensive results for entire TV and sports seasons, conferences (#TEDGlobal), industry discussions (#MobilePayments), places, businesses and long-lived hashtag conversations across topics, such as #JapanEarthquake, #Election2012, #ScotlandDecides, #HongKong, #Ferguson and many more.
Twitter engineers took on this mammoth project incrementally over the past two years, a process Zhuang describes in laborious detail in his post.
As with Google or Topsy’s third-party Twitter search engine, you can search by keyword or hashtag and select a time frame or Twitter account to narrow down the results. Zhuang writes that Twitter will continue to fine-tune its in-house search engine.
If you can’t wait for the feature to pop up on your own Twitter account, you can try it out on this test page, which searches public New Year’s tweets posted between Dec. 30, 2006 and Jan. 2, 2007.
Lead image by Anthony Quintano