Nick Halstead, CEO of stream-hacking startup Mediasift, took the stage today with Twitter's Ryan Sarver at the Data 2.0 conference to announce Twitter's second data resale channel partnership. Like a prism to a ray of sunlight, Halstead's service will allow customers to parse the full Twitter fire hose along any of the more than 40 fields of data hidden inside every Tweet, with the addition of augmented data layers from services including Klout (influence metrics), PeerIndex (influence), Qwerly (linked social media accounts) and Lexalytics (text and sentiment analysis). Storage, post-processing and historical snapshots will also be available.
The price? Dirt cheap. Halstead told me after the announcement that customers would be able to apply as many as 10,000 keyword filters to the fire hose for as little as 30 cents an hour. The most computationally expensive filtering Mediasift will offer won't be priced above $8k per year. (Pricing approximate but indicative, Halstead says.) What does this mean? It means that far more developers than ever before will now have a stable, officially approved and very affordable way to access highly targeted slices of data. Twitter just found a way to hand developers an Amazon River's worth of golden tinker-toys, each with more than 40 points of contact, at commodity prices.
You can make a filtering rule, make it public or private, share and comment on other people's rules and more.
Twitter's partnership with bulk data reseller Gnip (announced in November) offered half the fire hose in bulk for a whopping $360k or 5% of the fire hose for $60k per year. Mediasift prices and use cases will be very different. Pricing will be modeled on Amazon's cloud computing services, and each function's cost will be spelled out as the user requests it. For example, geo-filtering is expensive and keyword filtering is cheap. Keyword filtering is done, stable and available now. Storage of the data for post-processing and snapshots of historical data are described as being in alpha stages.
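To put the quoted prices on a common footing, here is a rough back-of-the-envelope sketch in Python. It assumes a Mediasift keyword filter runs around the clock for a full year at the quoted 30 cents an hour, and it reads both Gnip figures as annual prices. Halstead describes the Mediasift numbers as approximate, and the two products are not equivalent (bulk data versus filtered slices), so treat the ratios as indicative only. The function and variable names are illustrative.

```python
# Rough annualized comparison of the prices quoted in this article.
# Assumption: the keyword filter runs continuously, 24 hours a day, 365 days a year.
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_cost_usd(cents_per_hour: int) -> float:
    """Convert an hourly price in cents to a yearly price in dollars."""
    return cents_per_hour * HOURS_PER_YEAR / 100


# Mediasift figures quoted by Halstead (approximate but indicative).
mediasift_keyword_usd = annual_cost_usd(30)  # cheapest keyword filtering
mediasift_ceiling_usd = 8_000                # most expensive filtering, per year

# Gnip figures, read here as annual prices (an assumption).
gnip_half_hose_usd = 360_000     # 50% of the fire hose
gnip_five_percent_usd = 60_000   # 5% of the fire hose

print(f"Mediasift keyword filter, always on: ${mediasift_keyword_usd:,.0f} per year")
print(f"Gnip 5% feed costs about {gnip_five_percent_usd / mediasift_keyword_usd:.0f}x that")
print(f"Gnip 50% feed costs {gnip_half_hose_usd / mediasift_ceiling_usd:.0f}x the Mediasift ceiling")
```

Under these assumptions, an always-on keyword filter comes to $2,628 a year, well below even the 5% Gnip feed.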
Below: This is what a Tweet looks like. Every little message has more than 40 different fields. Mediasift customers will be able to filter the full Twitter fire hose by any of those fields or by data from additional third-party services.
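As a minimal sketch of what filtering by field could look like, the Python below applies keyword rules to Tweet-like dictionaries. It is not Mediasift's rule language or API, which this article does not describe. The helper names (`get_field`, `keyword_rule`, `filter_stream`) and the dotted-path convention are hypothetical, and the sample records use only a handful of fields (`text`, `user.screen_name`, `user.lang`) in the style of Twitter's JSON.

```python
from typing import Any, Callable, Iterable, Iterator

Tweet = dict
Rule = Callable[[Tweet], bool]


def get_field(tweet: Tweet, path: str) -> Any:
    """Look up a dotted path such as 'user.lang'; return None if any part is missing."""
    value: Any = tweet
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def keyword_rule(path: str, keywords: Iterable[str]) -> Rule:
    """Build a rule that passes when a string field contains any keyword as a whole word."""
    wanted = {k.lower() for k in keywords}

    def rule(tweet: Tweet) -> bool:
        value = get_field(tweet, path)
        if not isinstance(value, str):
            return False
        return any(word in wanted for word in value.lower().split())

    return rule


def filter_stream(tweets: Iterable[Tweet], rules: list) -> Iterator[Tweet]:
    """Yield only the tweets that satisfy every rule."""
    for tweet in tweets:
        if all(rule(tweet) for rule in rules):
            yield tweet


# Hypothetical sample data standing in for the fire hose.
sample = [
    {"text": "Loving the Data 2.0 conference", "user": {"screen_name": "alice", "lang": "en"}},
    {"text": "Lunch time", "user": {"screen_name": "bob", "lang": "en"}},
    {"text": "Conference en Paris", "user": {"screen_name": "carol", "lang": "fr"}},
]

rules = [keyword_rule("text", ["conference"]), keyword_rule("user.lang", ["en"])]
matched = [t["user"]["screen_name"] for t in filter_stream(sample, rules)]
print(matched)
```

Each rule targets one field, so combining rules narrows the stream in the way the article describes: a keyword match on the message text plus a constraint on a field of the author's profile.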
"Twitter is moving up the value chain by offering the high-level information that developers want," said ReadWriteWeb contributor and leading social data hacker Pete Warden about the announcement. "Rather than selling commodity information for further processing, this partnership offers a narrow but deep slice."
This is a bet on a future wherein greater value is built by widespread, low-cost access to social data on the part of many and diverse developers. It's not just raw data either; it's really rich. It's the opposite of what the Gnip announcement signalled and what many people have feared: that Twitter would hoard its river of data and sell it only to high bidders.
More on this in the coming days. I'm very excited to start hacking on it all with ReadWriteWeb's team of developers.