Techmeme is on fire this morning with discussion of Rafe Needleman’s CNet post about Twitter’s supposed plans to index the content of links shared over the microblogging service. Ex-Googler turned Twitter exec, Santosh Jayaram, said as much last night, as well as mentioning plans to rank search results by the reputation of the author.
It is really strange that none of the coverage we’ve seen today mentions yesterday’s news that Twitter has picked Bit.ly as its new default URL shortener. Bit.ly indexes the content of links and gathers a whole lot more data. Below are three reasons we’re betting that Twitter will not index the content of links itself but will rely on Bit.ly to do it. Twitter will probably acquire Bit.ly as a result, in exchange for Twitter stock. If not Bit.ly, it will be one of a handful of other third-party companies currently working behind the scenes with Twitter on this kind of search. Twitter is not going to do it all on its own; we’re willing to bet on that. Update: After publishing this post we were sent additional information illustrating just how close the Twitter/Bit.ly relationship already is.
1. Indexing Links is Non-Trivial Work
Many people asked yesterday why Twitter was choosing an outside party at all to shorten its links. Why not do that in-house? The most obvious answer is that it’s very hard work to reliably redirect millions upon millions of links every day. Why should Twitter do it? Bit.ly is redirecting 50 million clicks a week right now, up from only 15 million per week just five weeks ago. Now that the relationship with Twitter is as close as it is (Twitter was the source of only about 50% of the traffic through Bit.ly before), we can expect that number to grow even faster. We hear that last month Bit.ly moved off Amazon’s servers and is quickly adding servers of its own. Update: One trusted industry source speaking on the condition of anonymity now tells us that Bit.ly servers “were moved into Twitter’s racks months ago in preparation for this change.”
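For a rough sense of what that growth means, here is a back-of-the-envelope sketch in Python. It assumes the jump from 15 million to 50 million weekly clicks was steady compound growth over the five weeks, which is our simplification and not something Bit.ly has told us; the function name is ours as well.

```python
# Figures quoted above: weekly click volume five weeks ago and now.
START_CLICKS_PER_WEEK = 15_000_000
CURRENT_CLICKS_PER_WEEK = 50_000_000
WEEKS_ELAPSED = 5


def compound_weekly_growth(start: float, end: float, weeks: int) -> float:
    """Return the constant week-over-week growth rate that turns
    `start` into `end` after `weeks` weeks (assumes steady compounding)."""
    return (end / start) ** (1 / weeks) - 1


rate = compound_weekly_growth(
    START_CLICKS_PER_WEEK, CURRENT_CLICKS_PER_WEEK, WEEKS_ELAPSED
)
print(f"Implied growth: {rate:.1%} per week")
```

Under that assumption the traffic has been growing by a bit more than a quarter every week, which is the kind of load curve a small specialist team has to keep planning servers for.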
Five weeks ago we wrote about Bit.ly receiving venture funding from some of the hottest investors in the web business: investors like Ron Conway, an early Google investor, Mitch Kapor (the founder of Lotus) and rock star startup investor Jeff Clavier. Bit.ly’s former parent company, Betaworks, got money from Tim O’Reilly, one of the fathers of Web 2.0, through O’Reilly AlphaTech Ventures. They took this money in part because they needed it. It’s not easy to do what Bit.ly does. There’s good reason for Twitter not to reproduce that work.
If it makes sense to have a specialist team redirect the links, it makes even more sense to have someone else index the content of those linked pages. Remember, Twitter was started as a group SMS service. They keep it as simple as possible over there and let others do the most complicated work. If there was ever a startup that does not suffer from “Not Built Here” disease (an inability to integrate other people’s work), it’s Twitter.
We believe this is going to be a business deal more than a technology play. Even Santosh Jayaram, the ex-Googler turned Twitter exec who started this whole discussion last night, has an MBA and a resume full of business development jobs. He’s being described as “Operations at Twitter” but his LinkedIn profile lists his current position as VP of Business Operations.
Twitter may be processing its own trends data on site now, but that’s analysis of very short bursts of text and it’s using technology built by a startup the company acquired – Summize. We expect them to do the same thing with linked-page indexing and analysis.
2. Indexing Full Text Is Not That Interesting
One of the reasons that Twitter is so interesting is that with a 140-character limit, every word counts. When it comes to the pages people share links to on Twitter, that’s not necessarily the case. Enter semantic analysis of those pages, something Bit.ly is currently doing with the help of the Reuters Calais system. Bit.ly sends links to Calais and gets back a list of the keywords and concepts that the linked-to pages are actually about. Think of it as machine-performed auto-tagging with subject keywords. This structured data is much more interesting than the mere presence of search terms in a full-text search.
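To illustrate the difference, here is a toy Python sketch. Everything in it is hypothetical: the URLs, the page text and the tags are hand-written stand-ins for what a semantic tagging service might return, and none of it reflects the actual Calais API or how Bit.ly stores its data.

```python
from collections import defaultdict

# Hypothetical pages. The "tags" are invented stand-ins for the concepts a
# semantic tagging service might return for each linked page.
PAGES = {
    "http://example.com/a": {
        "text": "Apple announced a new phone today at its developer event.",
        "tags": {"Apple Inc.", "smartphones"},
    },
    "http://example.com/b": {
        "text": "Grandma's apple pie recipe needs butter, sugar and cinnamon.",
        "tags": {"baking", "recipes"},
    },
    "http://example.com/c": {
        "text": "The iPhone maker reported record quarterly earnings.",
        "tags": {"Apple Inc.", "earnings"},
    },
}


def full_text_search(pages, term):
    """Return URLs whose raw text contains the term (case-insensitive)."""
    needle = term.lower()
    return sorted(url for url, page in pages.items() if needle in page["text"].lower())


def build_tag_index(pages):
    """Map each tag to the set of URLs that carry it."""
    index = defaultdict(set)
    for url, page in pages.items():
        for tag in page["tags"]:
            index[tag].add(url)
    return index


def tag_search(index, tag):
    """Return URLs tagged with exactly this concept."""
    return sorted(index.get(tag, set()))


TAG_INDEX = build_tag_index(PAGES)
print("full text 'apple':", full_text_search(PAGES, "apple"))
print("tag 'Apple Inc.': ", tag_search(TAG_INDEX, "Apple Inc."))
```

In this made-up data the full-text query for "apple" pulls in the pie recipe and misses the earnings story, which never uses the word, while the tag lookup returns the two pages that are actually about the company.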
Bit.ly has had semantics in its sights since day one. It’s also very strong in real-time statistical analysis. It’s reminiscent of the Twitter-acquired search engine Summize, which was grounded in a background of sentiment analysis and brought real-time search to the game as well. We wouldn’t be surprised to see sentiment analysis and semantics, both of which are very hard to do, become a part of the API that Twitter offers outside developers in the future.
3. The Business Connections Are There
Several people have mentioned in the last 24 hours that Bit.ly and Twitter have common investors. Bit.ly came from a small New York incubator called Betaworks, which is also an investor in the most popular Twitter client, TweetDeck. Betaworks was also an investor in Summize, the search engine that Twitter acquired. That means Betaworks owns some stock in Twitter as well. That stake is probably relatively small, not enough to make a deal happen but more than enough to facilitate introductions between friends.
The actual story behind the scenes is no doubt much more complicated than this. Bit.ly’s John Borthwick told us this morning that Bit.ly is working on part of this development but Twitter is too. Several other companies are testing some kind of API program already, so it may not be Bit.ly, or just Bit.ly, that becomes the center of this story long term. We’ve heard in the last week from more than one company working on something like this with Twitter. Given the relatively small venture financing it has taken, Bit.ly would be far and away the cheapest of the candidates for Twitter to pick up.
OneRiot is one of those other companies. They will unveil a related but broader technology early next week (watch this space) and they too have an investor in common with Twitter (Spark Capital).
Competition over the deep real-time search space has got to be heating up for all the players, though. “Four search companies have approached us over the past eight weeks,” Bit.ly’s Borthwick tells us.
Disclosure: Reuters Calais, the company doing semantic analysis for bit.ly and other companies, is an RWW sponsor.