Why Facebook Uses Apache Hadoop and HBase

Dhruba Borthakur, a Hadoop engineer at Facebook, has published part of a paper on Apache Hadoop that he co-authored with several of his engineering colleagues. The first part of the paper explains Facebook’s requirements and non-requirements for a data store for its revamped Facebook Messages application, and the reasons it chose Apache Hadoop and HBase to power it. The paper will be published at SIGMOD 2011.

The requirements:

  • Elasticity
  • High write throughput
  • Efficient and low-latency strong consistency semantics within a data center
  • Efficient random reads from disk
  • High Availability and Disaster Recovery
  • Fault Isolation
  • Atomic read-modify-write primitives
  • Range Scans
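
Two of these requirements, atomic read-modify-write primitives and range scans, are easier to picture with a small example. The sketch below is a toy in-memory store written for illustration only; it is not HBase or its API, and the class name `ToySortedStore`, its methods, and the row keys such as `user1:unread` are all hypothetical. It assumes a store that keeps row keys in sorted order, so that a counter can be updated atomically and all rows sharing a key prefix can be read in one pass.

```python
import bisect
import threading


class ToySortedStore:
    """Hypothetical in-memory stand-in for a sorted key-value store (not HBase)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._keys = []    # row keys, kept in sorted order
        self._values = {}  # row key -> integer counter

    def increment(self, key, delta=1):
        """Atomic read-modify-write: read, add, and write back under one lock."""
        with self._lock:
            if key not in self._values:
                bisect.insort(self._keys, key)
                self._values[key] = 0
            self._values[key] += delta
            return self._values[key]

    def scan(self, start, stop):
        """Range scan: return (key, value) pairs with start <= key < stop."""
        with self._lock:
            lo = bisect.bisect_left(self._keys, start)
            hi = bisect.bisect_left(self._keys, stop)
            return [(k, self._values[k]) for k in self._keys[lo:hi]]


def demo():
    store = ToySortedStore()

    def worker():
        # Each thread bumps the same counter 1000 times.
        for _ in range(1000):
            store.increment("user1:unread")

    threads = [threading.Thread(target=worker) for _ in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    store.increment("user1:total", 5)
    store.increment("user2:unread", 3)

    # ';' is the character right after ':' in ASCII, so this range
    # covers exactly the keys that start with "user1:".
    return store.scan("user1:", "user1;")


if __name__ == "__main__":
    print(demo())
```

Because every increment happens under a single lock, no update from the eight threads is lost, and because keys are sorted, the scan touches only the contiguous slice of rows for one user rather than the whole store.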



The non-requirements:

  • Tolerance of network partitions within a single data center
  • Zero Downtime in case of individual data center failure
  • Active-active serving capability across different data centers

You can find out much more by reading the paper. It was written by Dhruba Borthakur, Kannan Muthukkaruppan, Karthik Ranganathan, Samuel Rash, Joydeep Sen Sarma, Jonathan Gray, Nicolas Spiegelberg, Hairong Kuang, Dmytro Molkov, Aravind Menon, Rodrigo Schmidt and Amitanand Aiyer.

Image Credit: Massimo Barbieri
