IBM Ends Hadoop Distribution, Hortonworks Expands Hybrid Open Source

IBM has followed Intel and EMC/Pivotal in abandoning efforts to make a business of Hadoop distributions, and followed Microsoft in making Hortonworks its supplying partner. At the former Hadoop Summit, now called Dataworks (itself a sign of the shift from Hadoop-centric positioning), IBM announced it will discontinue its IBM Open Platform/BigInsights offering, and will instead OEM Hortonworks’ HDP.

–more–

Hadoop Commercial Support Component Tracker – March 2017

Stack expansion has ground to a halt. The last time an Apache project was added to the list of those most supported by leading Hadoop distribution vendors was July 2016, when Kafka joined the other 14 then commonly included. Since then, no broad support for new projects has emerged.

–more–

Hadoop Project Commercial Support Tracker July 2016

There are now 15 projects supported by all 5 distributors I track, and several have had new releases since April. Kafka is the newest addition, and I believe the remaining 4-supporter offerings, Mahout and Hue, will remain unsupported by IBM, which has its own alternatives.

–more–

Hadoop Apache Project Commercial Support Tracker April 2016

There are now 19 commonly supported projects: Avro, Flume and Solr join the group supported by all 5 distributors and other changes appear as well.

For this version of the tracker (last updated in December), I’ve made one sizable change: Pivotal has been dropped as a “leading distributor,” reducing the number to five. Pivotal now relies on Hortonworks’ distro (as does Microsoft) as its commercial offering.

–more–

Strata Standards Stories: Different Stores For Different Chores

Has HDFS joined MapReduce in the emerging “legacy Hadoop project” category, continuing the swap-out of components that formerly answered the question “what is Hadoop?” Stores for data were certainly a focus at Strata/Hadoop World in NY, O’Reilly’s well-run, well-attended, and always impactful fall event. The limitations of HDFS, including its append-only nature, have become inconvenient enough to push the community to “invent” something DBMS vendors like Oracle did decades ago: a bypass. After some pre-event leaks about its arrival, Cloudera chose its Strata keynote to announce Kudu, a new columnstore written in C++, bypassing HDFS entirely. Kudu will use an Apache license and will be submitted to the Apache process at some undetermined future time.

–more–

Hadoop Projects Supported By Only One Distribution

The Apache Software Foundation has succeeded admirably in becoming a place where new software ideas are developed: today over 350 projects are underway. The challenges for the Hadoop user are twofold: trying to decide which projects might be useful in big data-related cases, and determining which are supported by commercial distributors. In Now, What is Hadoop? And What’s Supported? I list 10 supported by only one: Atlas, Calcite, Crunch, Drill, Falcon, Kite, LLAMA, Lucene, Phoenix and Presto. Let’s look at them a little more.

–more–

Now, What is Hadoop?

This perennial question resurfaced recently in a thoughtful blog post by Andreas Neumann, Chief Architect of Cask, called What is Hadoop, anyway?. Ultimately, after a careful deconstruction of the terms in the question, Andreas concludes with

“Does it really matter to agree on the answer to that question? In the end, everybody who builds an application or solution on Hadoop must pick the technologies that are right for the use case.”

We’ve agreed from the beginning – that is the only answer that really matters. Still, the question continues to come up for end users of the stack and for vendors like Cask (it helps them think about what to support in their application development offering, Cask Data App Platform (CDAP)).

Analysts too: I’ve discussed it several times, including a post a year ago called What Is Hadoop….Now? tracking the path from 6 commonly supported projects in 2012 to 15 in June 2014, across a set of distributors that included Cloudera, Hortonworks, MapR and IBM. “Support” here means you pay for a subscription that explicitly includes the named project.

This year, the expansion process has continued – and it does matter.

–more on Gartner blog–

Hortonworks IPO – Why Now?

Last week, many observers were surprised when Hortonworks filed its S-1 for an initial public offering (IPO). And there are good reasons to be surprised. Why now? CEO Rob Bearden told VentureWire not long ago that he expected to exit 2014 “at a strong $100 million run rate” in preparation for a 2015 IPO. What changed? Perhaps that question is best answered by asking another: for whom?

— for more, see my Gartner blog post

Strata Spark Tsunami – Hadoop World, Part One

New York’s Javits Center is a cavernous triumph of form over function. Giant empty spaces were everywhere at this year’s empty-though-sold-out Strata/Hadoop World, but the strangely-numbered, hard to find, typically inadequately-sized rooms were packed. Some redesign will be needed next year, because the event was huge in impact and demand will only grow. A few of those big tent pavilions you see at Oracle Open World or Dreamforce would drop into the giant halls without a trace – I’d expect to see some next year to make some usable space available.

So much happened, I’ll post a couple of pieces here. Last year’s news was all about promises: Hadoop 2.0 brought the promise of YARN enabling new kinds of processing, and there was promise in the multiple emerging SQL-on-HDFS plays. The Hadoop community was clearly ready to crown a new hype king for 2014.

This year, all that noise had jumped the Spark.

— This post is continued on my Gartner blog —

What Is Hadoop….Now?

In February 2012, Gartner published How to Choose The Right Apache Hadoop Distribution (available to clients). At the time, the leading distributors were Cloudera, EMC (now Pivotal), Hortonworks (pre-GA), IBM, and MapR. These players all supported six Apache projects: HDFS, MapReduce, Pig, Hive, HBase, and Zookeeper. Things have changed.

–more–