Cloudera VP Customer Solutions, Omer Trajman, presented a talk on HBase Dos and Don’ts to the Los Angeles Hadoop Users Group earlier this month.

The full video is available after the jump.

LA-HUG HBASE DO’s and DON’Ts from Shopzilla on Vimeo.

At a glance, the dos and don’ts are:

Do:

  • Use a key prefix that distributes well

  • Keep the number of regions reasonable – about 100 per node.

  • Disable auto-compaction

  • Use compression

  • Explicitly put hbase-site.xml in your CLASSPATH

  • Monitor the health of your cluster

  • Store multiple copies for different access patterns
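
The first item, a key prefix that distributes well, can be illustrated with a small sketch. It is not from the talk: it assumes a hash-derived "salt" bucket prepended to the natural key, and `salted_key`, `NUM_BUCKETS`, and the `|` separator are hypothetical names chosen for illustration.

```python
import hashlib

NUM_BUCKETS = 16  # hypothetical bucket count, not a figure from the talk


def salted_key(natural_key: str, num_buckets: int = NUM_BUCKETS) -> bytes:
    """Prefix a row key with a bucket derived from a hash of the key.

    The bucket is deterministic, so a reader can recompute the full
    row key from the natural key alone.
    """
    digest = hashlib.md5(natural_key.encode("utf-8")).digest()
    bucket = digest[0] % num_buckets
    # Zero-padded bucket first, then the original key.
    return f"{bucket:02d}|{natural_key}".encode("utf-8")


if __name__ == "__main__":
    for i in range(1, 6):
        print(salted_key(f"user{i:06d}"))
```

Sequential keys such as user000001 and user000002 are likely to land under different prefixes, so writes spread across regions instead of piling onto one. The trade-off is that a range scan over the natural keys has to fan out across every bucket.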


Don’t:

  • Replace every RDBMS wholesale

  • Run huge MR Jobs directly off of HBase

  • Use timestamps as the first part of your key

  • Allocate all CPUs to your TaskTrackers

  • Run mixed workloads with SLAs

  • Bulk load from a single client, or bulk load with put

  • Let the RegionServer swap
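
For the timestamp item, one commonly used alternative is to lead the key with some other field and place the timestamp after it. The sketch below is an assumption for illustration only: the `source_id` field, the `event_key` helper, and the reversed millisecond timestamp are hypothetical and do not come from the talk.

```python
import struct

MAX_TS = 2**63 - 1  # largest signed 64-bit value


def event_key(source_id: str, ts_millis: int) -> bytes:
    """Build a row key as <source_id>|<reversed timestamp>.

    Leading with source_id keeps writes made at the same moment from
    all sorting next to each other. Reversing the timestamp makes the
    newest event for a given source sort first.
    """
    if not 0 <= ts_millis <= MAX_TS:
        raise ValueError("timestamp out of range")
    reversed_ts = MAX_TS - ts_millis
    # Big-endian packing so byte order matches numeric order.
    return source_id.encode("utf-8") + b"|" + struct.pack(">q", reversed_ts)


if __name__ == "__main__":
    older = event_key("sensor-a", 1000)
    newer = event_key("sensor-a", 2000)
    print(sorted([older, newer])[0] == newer)
```

With a timestamp-first key, every new row sorts after the previous one, so all current writes go to the last region. Moving the timestamp behind another field avoids that, provided the leading field itself is spread reasonably evenly.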

Klint Finley