Yahoo!’s Smart Investment: The Hadoop Community

More than 250 people attended a Hadoop developer event at Yahoo! this week, demonstrating again the level of interest the company has in open-source big data initiatives.

Yahoo! says it is the world’s biggest Hadoop supporter. We say that’s undoubtedly correct. Yahoo! supports community developer events throughout the world. In February, it supported the first Hadoop event in India. In June, it will host the Hadoop Summit.

Yahoo! is not always recognized for its cloud computing efforts, but its deep commitment to Hadoop shows how the company views big data as a way to solve major technology problems such as spam.

Hadoop, according to Wikipedia, “is a Java software framework that supports data-intensive distributed applications under a free license. It enables applications to work with thousands of nodes and petabytes of data.”

The developer conference featured discussions from the Hadoop community, including a presentation about using it to fight spam and a discussion led by a lead engineer from Facebook.

Vishwanath Ramarao is director of anti-spam engineering for Yahoo! Mail. According to the Yahoo! developer blog, Vish described the intricate cat-and-mouse games played with spammers, and how Yahoo! uses Hadoop to abstract away the complexity of large-scale data analysis and provide deep insight into spammer campaigns.

Yahoo! Mail antispam – Bay area Hadoop user group
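For readers unfamiliar with MapReduce, the programming model Hadoop implements, here is a minimal single-process sketch in Python. It is not Yahoo!'s code and does not reflect how its anti-spam pipeline works. The record format, the sample data and every function name are hypothetical, chosen only to show the map, shuffle and reduce steps on a toy question: how many messages arrived from each sending domain?

```python
from collections import defaultdict
from itertools import chain

def map_phase(record):
    """Map step: emit (sending_domain, 1) for one (sender, subject) record."""
    sender, _subject = record
    domain = sender.rsplit("@", 1)[-1].lower()
    yield domain, 1

def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a total."""
    return key, sum(values)

def run_job(records):
    """Run map, shuffle and reduce in one process (no cluster involved)."""
    pairs = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(pairs)
    return dict(reduce_phase(k, v) for k, v in grouped.items())

# Hypothetical sample records: (sender address, subject line).
SAMPLE = [
    ("a@bulk.example", "Win now"),
    ("b@bulk.example", "Win now"),
    ("carol@friends.example", "Lunch?"),
]

if __name__ == "__main__":
    print(run_job(SAMPLE))
```

In a real Hadoop job, the framework distributes the map and reduce steps across many machines and handles the shuffle itself. The sketch only illustrates the shape of the computation that makes this kind of distribution possible.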

John Sichi, lead engineer for Facebook’s data infrastructure team, provided an overview of Facebook’s work using Hadoop to manage data that is growing 8x annually. In March 2008, traffic volume hit 200 GB per day. By the end of last year, traffic had grown to 12 terabytes per day.

Hadoop, Hbase and Hive- Bay area Hadoop User Group

Companies like Yahoo! and Facebook use Hadoop to organize data and process it from multiple sources. For instance, Facebook might use it to organize how it deploys its ad network.

Yahoo! may be on to the most powerful use for cloud computing or at least the most interesting. And it shows how the company is thinking about cloud computing and the ways it applies to its overall strategy.
