EMC World is taking place in Las Vegas today. In addition to the announcement of EMC's own Apache Hadoop appliance and distribution, several other companies have announced new products ranging from software integration tools to storage appliances.
We've covered the increasing competition and innovation in the Hadoop market, and those trends show no signs of slowing down.
DataStax released its open source product Brisk today. As we reported previously, Brisk is a distribution of Apache Hadoop that uses Apache Cassandra as its data store instead of the Hadoop file system and HBase. Although it's been harshly criticized by Cloudera, it's being met with anticipation in the community.
EMC's Apache Hadoop Appliance
EMC announced Greenplum HD, which is both an appliance and a distribution. The distribution will be available in both enterprise and community editions. The community edition is fully open source.
Greenplum HD combines the Hadoop analytics platform with Greenplum's database technology. Unlike Brisk, which replaces the Hadoop data store, Greenplum HD complements HBase and Hadoop's file system.
EMC is partnering with several companies for integrations, including Concurrent, CSC, Datameer, Informatica, Jaspersoft, Karmasphere, MicroStrategy, Pentaho, SAS, SnapLogic (see below), Talend and VMware.
EMC has been promising a Hadoop-related announcement today since the GigaOM Structure Big Data conference in March. The company formed an alliance with Cloudera in September to enable integration between Cloudera's Hadoop distribution and the Greenplum platform. It's not clear what will happen with that alliance, which was not mentioned in EMC's announcement.
There was also no mention of Yahoo, which may be spinning off its own Hadoop company. In March, GigaOM's Derrick Harris speculated that EMC and Yahoo might work together on a Hadoop product.
Data center connectivity company Mellanox today announced a new software product for accelerating Hadoop and Memcached. The Hadoop product, called Hadoop-Direct, runs on Mellanox's InfiniBand adapters and switches. Mellanox claims it can cut Hadoop job times in half. From the announcement:
"Network bandwidth and usage of compute capacity per node to process network-related functions are key factors that limit efficient scale of Hadoop clusters," said Dhruba Borthakur, distinguished member of the Hadoop Apache Development Team. "Hadoop-Direct with Mellanox networking solutions help minimize the latency of data access; the use of higher bandwidth enables overlapping communications and computation thus improving Hadoop cluster's performance."
NetApp Hadoop Storage Solution
Cloud integration company SnapLogic today announced SnapReduce, a tool that turns SnapLogic integration pipelines into MapReduce tasks. This will give SnapLogic users a more accessible way of integrating Hadoop with their existing applications. SnapLogic will demonstrate the product at EMC World today.