When we talk to our less technologically-inclined friends about Twitter, we often run across the objection that they really don't care what so-and-so ate for lunch today or what movie they are seeing tonight. And every time, we try to extol all the other benefits of the world's most popular microblogging service. But could we be wrong? Is Twitter mostly people talking about themselves and what they ate for lunch?
Parlez vous Twitterspeak?
The SemanticHacker blog used Twitter's streaming API to gather nearly 9 million tweets from over 2 million individual users. Before looking at the data for meaning, the company first took a look at the language distribution of its sample.
While the SemanticHacker team expressed their surprise at the language distribution, particularly the strong showing of Portuguese, we at ReadWriteWeb couldn't help but wonder about the 10% labeled as "Unknown/Misclassified." Are these tweets simply so horribly misspelled that the language-guessing program they used on the data could not venture a guess? Or could it be that 10% of the Twitter populace is now writing in a form of text-message Twitter-speak so contracted that it can no longer be classified as a recognizable language? (If you're looking for a good example, find a 12-year-old and exchange text messages or just give Sarah Palin's Twitter a look.)
What We're A-Twittering About
The folks at SemanticHacker then took a random sample of 1,000 English-language tweets and broke them down into eight categories. According to their findings, it seems that Twitter really is full of people talking about themselves. A full 57% of the sample falls into tweets about what a person is doing, or private conversations between individuals.
That leaves just 43% for other purposes, but when we take a look at that, the findings seem to become even more dismal. If we take away another 8% for "Other Messages" and "Unknown," and another 8% for "Spam" and "Advertising," we're left with a mere 27% of the information on Twitter having some sort of value.
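For anyone who wants to check our math, here is a minimal Python sketch of the subtraction above. It uses only the percentages quoted in this post; the variable names are our own and are not taken from SemanticHacker's analysis.

```python
# Category shares quoted in the post, in percent of the 1,000-tweet sample.
# The variable names are illustrative assumptions, not SemanticHacker's labels.
personal = 57          # what a person is doing, plus private conversations
other_or_unknown = 8   # "Other Messages" and "Unknown"
spam_or_ads = 8        # "Spam" and "Advertising"

# What is left once the personal chatter is set aside.
remaining = 100 - personal

# What is left after also removing the unclassifiable and promotional tweets.
useful = remaining - other_or_unknown - spam_or_ads

print(f"Left after personal chatter: {remaining}%")
print(f"Left with some sort of value: {useful}%")
```

In a sample of 1,000 tweets, each percentage point stands for 10 tweets, so the final figure is easy to translate into a raw count.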
Maybe it isn't as bad as it looks, though. We're willing to bet that if we wrote down everything we said in a day, the meaningful parts might not even reach the 27% mark.
Oh, did I tell you about the tasty lentils I had for lunch today?