New blogs launch all the time, but The Daily Dot launched today with a well-known team of backers, $600,000 in the bank and a focus on using data analysis to unearth the stories of people online. The Dot says it wants to be the “home town newspaper” for all the different social networks and communities on the web – and it uses math to find its way.
Dot CEO Nick White, the team member with the most traditional newspaper experience, tells a story about Jimmy Carter’s old home town paper. The 70-year-old woman who led that paper’s local coverage for years would go through the phone book and call everyone, asking “have you got any news? Do you need to buy any classified ads?” White says the Daily Dot team does something like that now, but by using big data software and services. “The data is telling us about a lot of really interesting people we should be calling on the phone,” he says, “and then when we do – we find out we’re the first people to ever interview them.”
White leads a Daily Dot staff of 25 at launch, co-founding the organization with data geek and investor Nova Spivack and Josh Jones-Dilworth, CEO of Jones-Dilworth, the PR agency to the data stars (Wolfram Alpha, Siri, Infochimps). The publication is edited by veteran social web journalist Owen Thomas.
It’s a high-minded publication behind the scenes and humorously down-to-earth on the surface. As skeptic Mathew Ingram pointed out today on Gigaom, shortly before launch the site featured no fewer than four simultaneous front-page stories about kitten pictures. That might seem silly, but I’ve really enjoyed the Dot’s coverage of social news site Reddit so far, for example. That’s a fascinating community that could really use some media coverage if you ask me. Can the Dot find the balance between goofy and meaningful? Between niche-focused and interesting to a large enough audience? That remains to be seen.
The company’s methodology is quite interesting, though.
Hunting Down Digital Stories
“We’re faced with a real challenge of covering an entirely new coverage area,” Dot CEO White says.
“A lot of the tried and true methods don’t work any more. I remember being a cub reporter and going in at 5am to write up the police blotter. There are no media rooms in what we’re trying to cover. No one is faxing us things. There are so much less formal systems; everything’s out there but it’s an enormous mess. When someone walks down the street it doesn’t leave a path of 1s and 0s but when someone walks down the street on Twitter, it does.”
Lots of media organizations say they want to practice data journalism, and people who have skills in both data and journalism are among the most sought-after writers and artists in the world. The Daily Dot is also working with third-party service providers.
Right: Editor Owen Thomas, as depicted by Flickr user Nic*Rad.
To capture and analyze that data from sites like Twitter, YouTube, Reddit, Etsy and more (the team says it’s indexing a new community about every 6 weeks), the Dot has partnered with the mathematicians at Ravel Data. Ravel uses 80Legs for unblockable crawling, then Hadoop and its own open-source framework called GoldenOrb, and then an eigenvector centrality algorithm (similar to PageRank) to index, analyze, rank and discover connections between millions of users across these social networks.
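For readers curious what that ranking step looks like, here is a minimal sketch of eigenvector centrality computed by power iteration. It is not Ravel's code and does not use GoldenOrb; the function name, the toy interaction list and the choice to treat interactions as undirected are all assumptions made for illustration. The idea is that a user scores highly when they interact with other users who score highly.

```python
def eigenvector_centrality(interactions, iterations=200, tol=1e-9):
    """Rank users by eigenvector centrality using power iteration.

    interactions: list of (user_a, user_b) pairs, treated as undirected
    (an assumption; a real pipeline might weight or direct these edges).
    """
    users = sorted({u for pair in interactions for u in pair})
    neighbors = {u: set() for u in users}
    for a, b in interactions:
        neighbors[a].add(b)
        neighbors[b].add(a)

    score = {u: 1.0 / len(users) for u in users}
    for _ in range(iterations):
        # Each user's new score is their own score plus their neighbors' scores.
        # Keeping the user's own score leaves the ranking unchanged but lets
        # the iteration settle on graphs where it would otherwise oscillate.
        new = {u: score[u] + sum(score[n] for n in neighbors[u]) for u in users}
        norm = sum(v * v for v in new.values()) ** 0.5
        new = {u: v / norm for u, v in new.items()}
        converged = max(abs(new[u] - score[u]) for u in users) < tol
        score = new
        if converged:
            break
    return score


# Hypothetical toy community: "ann" talks to three people, "dev" to two.
toy = [("ann", "bo"), ("ann", "cy"), ("ann", "dev"), ("dev", "eli")]
scores = eigenvector_centrality(toy)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph "ann" ranks first and "dev" second. "bo" outranks "eli" even though each has a single connection, because bo's one connection is to the best-connected user.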
“It’s one thing to crawl, it’s another to understand the community,” says CEO White. “What we really offer is thinking about how the community ticks. The gestures and modalities on Reddit are very different from YouTube; it’s sociological, not just math.”
“We write algorithms to match the scope and tenor of each community,” Jones-Dilworth says. “I thought it would be obvious, but in practice it is harder to understand how all these users interact.”
For example, the company says there are two kinds of people on Reddit: the hunters, who don’t interact much, purely discovering links, curating, etc, and then there are the “gabberers” (like gatherers, in a dorky and questionable play on words), people who discuss links.
You’d think, the Dot team says, that the people who put in links the most would discuss them a lot too, but that’s not actually the case. There is a group of people who just discuss the links contributed by other users and there isn’t that much cross pollination between those two roles. “Once we realized that,” Owen Thomas explained, “then we talked to some of the top hunters and top gatherers and did the qualitative stuff.”
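As a rough illustration of that split, the sketch below labels a user by whichever activity dominates. The Dot has not published how it draws the line, so the function name, the 3-to-1 ratio and the sample counts are purely hypothetical.

```python
def classify_user(links_submitted, comments_posted, ratio=3.0):
    """Label a user as a hunter, a gabberer, or mixed.

    The 3x ratio is an arbitrary assumption, not the Dot's threshold.
    """
    if links_submitted >= ratio * max(comments_posted, 1):
        return "hunter"      # mostly finds and submits links
    if comments_posted >= ratio * max(links_submitted, 1):
        return "gabberer"    # mostly discusses links others submit
    return "mixed"           # no clearly dominant role


# Hypothetical activity counts: (links submitted, comments posted).
sample = {"user_a": (120, 4), "user_b": (2, 90), "user_c": (10, 12)}
labels = {name: classify_user(*counts) for name, counts in sample.items()}
```

If the team's observation holds, most heavy users would land in one of the first two buckets and few in "mixed".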
The team believes the same approach of science plus art will help make it a viable business, too. “If we can get to the top 100 users of all these networks,” White says, “then we can get something that’s valuable to thousands of other people, an audience who is highly engaged and influential. If we can be authentic in social media, tons of brands know their display advertising is broken. We’ll be building an audience that speaks to what the sophisticated brands are thinking about: influence, credibility, engagement.”
White says that once the publishing company masters the art of social web community data analysis and uses it to build a leading, profitable media organization, the next step could be to offer extracted and processed data directly as a service for other companies.