Sometimes best practices and practical tips can obscure mistakes that you should avoid. Sometimes what you really need is a list of worst practices.
Iwona Bialynicka-Birula has written a post organizing things not to do in Apache Hadoop into three categories: efficiency, scalability and reliability.
The problem with Hadoop is that it is relatively easy to get started without an in-depth knowledge of what gives it its powers. Without that knowledge, you are more likely than not to design your solution in a way that takes all of those powers away. So let’s take a look at a few key features of Hadoop and what not to do if you want to keep them.