Access Hadoop HDFS Over HTTP with Hoop

Hoop is a new tool from Apache Hadoop contributor and enterprise support company Cloudera. Hoop provides access to the Hadoop Distributed File System (HDFS) over HTTP via a REST API. It can be used to exchange data between Hadoop clusters running different versions of the platform, or to access data behind a firewall.

Hoop is a complete rewrite of Hadoop HDFS Proxy. Cloudera claims it offers the following advantages over Proxy:

  • Support for all HDFS operations (read, write, status).
  • Cleaner HTTP REST API.
  • JSON format for status data (file status, operation status, error messages).
  • Kerberos HTTP SPNEGO client/server authentication and pseudo authentication out of the box (using Alfredo).
  • Hadoop proxy-user support.
  • Tools such as DistCp can run on either cluster.
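
To give a rough sense of what HTTP access with JSON status data might look like from a client, here is a minimal Python sketch. The host name, port, URL layout, the `op` and `user.name` query parameters, and the shape of the JSON payload are all assumptions made for illustration and are not taken from Hoop's documentation, so check the real API before relying on any of them.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Hypothetical server address: not taken from the Hoop documentation.
BASE_URL = "http://hoop.example.com:14000"


def build_url(hdfs_path, op, user):
    """Build a request URL for an HDFS path (assumed URL layout)."""
    # "op" and "user.name" are assumed parameter names for the
    # operation and for pseudo authentication.
    query = urlencode({"op": op, "user.name": user})
    return f"{BASE_URL}{quote(hdfs_path)}?{query}"


def parse_status(body):
    """Decode a JSON status document into a Python dict."""
    return json.loads(body)


def fetch_status(hdfs_path, user):
    """Request the status of a path; needs a reachable server."""
    with urlopen(build_url(hdfs_path, "status", user)) as resp:
        return parse_status(resp.read().decode("utf-8"))
```

Only `fetch_status` touches the network. `build_url` and `parse_status` can be tried offline, for example by passing `parse_status` a hand-written JSON string.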

You can find out more here.
