Twitter Will Open-Source Storm, BackType’s “Hadoop of Real-Time Processing”

Last month Twitter acquired social media analytics company BackType. Much of BackType’s technology (such as ElephantDB and Cascalog) is already open source, and this week Twitter announced that BackType’s Storm will be open-sourced at the Strange Loop conference in September.

Storm is a Hadoop-like system, but instead of running MapReduce “jobs” that eventually end, Storm runs never-ending “topologies.” It can be used for continuous computation, for processing streams of data, and for similar always-on workloads.
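To make the job-versus-topology distinction concrete, here is a minimal sketch in plain Python. It does not use Storm's API; the function names and the sample word source are hypothetical, chosen only to contrast a computation that finishes with one that keeps emitting updates for as long as data arrives.

```python
import itertools
from collections import Counter
from typing import Iterable, Iterator, Tuple


def batch_word_count(words: Iterable[str]) -> Counter:
    """Batch style: consume a finite input, return one final answer, then exit."""
    return Counter(words)


def endless_words() -> Iterator[str]:
    """Hypothetical unbounded source: repeats a fixed sample forever."""
    return itertools.cycle(["storm", "hadoop", "storm"])


def streaming_word_count(words: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Streaming style: emit an updated count after every word.

    This generator never finishes on its own if the input never ends.
    """
    counts: Counter = Counter()
    for word in words:
        counts[word] += 1
        yield word, counts[word]


if __name__ == "__main__":
    # The batch version sees the whole input and produces a single result.
    print(batch_word_count(["storm", "hadoop", "storm"]))

    # The stream has no end, so the demo only looks at the first five updates.
    for update in itertools.islice(streaming_word_count(endless_words()), 5):
        print(update)
```

The batch function returns once; the streaming one has to be cut off from the outside, which is roughly the difference between a job that ends and a topology that runs until it is stopped.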

Here’s the rundown of the use cases from the Twitter Engineering blog:

  1. Stream processing: Storm can be used to process a stream of new data and update databases in realtime. Unlike the standard approach of doing stream processing with a network of queues and workers, Storm is fault-tolerant and scalable.
  2. Continuous computation: Storm can do a continuous query and stream the results to clients in realtime. An example is streaming trending topics on Twitter into browsers. The browsers will have a realtime view on what the trending topics are as they happen.
  3. Distributed RPC: Storm can be used to parallelize an intense query on the fly. The idea is that your Storm topology is a distributed function that waits for invocation messages. When it receives an invocation, it computes the query and sends back the results. Examples of Distributed RPC are parallelizing search queries or doing set operations on large numbers of large sets.
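The Distributed RPC idea in the third item can be sketched roughly as follows. This is an illustration in plain Python, not Storm's API. It assumes a large set has already been split into disjoint partitions, and it uses local threads as stand-ins for cluster workers. The names `count_overlap` and `distributed_overlap` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Set


def count_overlap(partition: Set[int], query: Set[int]) -> int:
    """Work done by one worker: intersect its partition with the query set."""
    return len(partition & query)


def distributed_overlap(partitions: List[Set[int]], query: Set[int]) -> int:
    """Fan one query out over all partitions in parallel and merge the results.

    Assumes the partitions are disjoint, so the partial counts can be summed.
    """
    if not partitions:
        return 0
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = pool.map(count_overlap, partitions, [query] * len(partitions))
        return sum(partials)


if __name__ == "__main__":
    partitions = [{1, 2, 3}, {4, 5, 6}, {7, 8, 9}]
    # One "invocation": how many members of the query set are in the big set?
    print(distributed_overlap(partitions, {2, 4, 6, 10}))  # prints 3
```

The caller sees an ordinary function call, while the work behind it is split across workers and merged before the result is returned. In a real deployment the workers would be separate machines rather than threads.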

Much more detail can be found in the blog post.

We first covered Storm in our profile of BackType earlier this year. Before the acquisition, BackType started talking up Storm, and its claims were received with much skepticism.

We’re still not sure how Twitter will be using BackType’s technology, but it’s good to see that at least this part of it will be released. I’m always happy to see tech startups open-sourcing tools. I’ve made the case before that as companies come and go, open source leaves a legacy.

Twitter has explained its use of Hadoop in the past, and it does seem that Storm is well-suited for certain elements of Twitter’s operation. The Storm announcement specifically mentioned streaming trending topics to the browser.
