Omer Trajman, Cloudera’s VP of Customer Solutions, gave a talk on HBase dos and don’ts to the Los Angeles Hadoop Users Group earlier this month.

The full video is available after the jump.
At a glance, the dos and don’ts are:

Do:
- Use a key prefix that distributes well
- Keep the number of regions reasonable (about 100 per node)
- Disable auto-compaction
- Use compression
- Explicitly put hbase-site.xml in your CLASSPATH
- Monitor the health of your cluster
- Store multiple copies for different access patterns

Don’t:
- Treat HBase as a wholesale replacement for every RDBMS
- Run huge MR jobs directly off of HBase
- Use timestamps as the first part of your key
- Allocate all CPUs to your TaskTrackers
- Run mixed workloads with SLAs
- Use a single client to bulk load, or bulk load with put
- Let the region server swap
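
Two of the key-design items above (use a prefix that distributes well, and keep timestamps out of the front of the key) can be illustrated with a small sketch. This is not from the talk; it is one common approach, often called salting, written in plain Python with no HBase client involved. The function name `salted_row_key`, the `NUM_BUCKETS` value, and the `bucket|id|timestamp` layout are all assumptions chosen for illustration.

```python
import hashlib

# Hypothetical bucket count; in practice it would be chosen with the
# number of region servers in mind.
NUM_BUCKETS = 16


def salted_row_key(entity_id: str, timestamp_ms: int,
                   num_buckets: int = NUM_BUCKETS) -> bytes:
    """Build a row key shaped like b"<bucket>|<entity_id>|<timestamp>".

    The bucket prefix is derived from a hash of the entity id, so writes
    arriving in time order are spread over many key ranges instead of all
    landing at the end of the table.
    """
    digest = hashlib.md5(entity_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % num_buckets
    # Zero-pad the bucket and timestamp so keys sort consistently as bytes.
    return f"{bucket:02d}|{entity_id}|{timestamp_ms:013d}".encode("utf-8")


if __name__ == "__main__":
    for sensor in ("sensor-1", "sensor-2", "sensor-3"):
        print(salted_row_key(sensor, 1300000000000))
```

The trade-off is that a time-range scan across all entities now needs one scan per bucket, so this layout suits workloads that mostly read by entity.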