Symposium Notes – Day Four Returns to Data Security, and to Hadoop

Thursday, the final day, reinforced a theme for the week: data security is heating up, and organizations are not ready. It came up in half of today’s final 10 meetings.

“Is my data more secure, or less, in the cloud?”

“Does using open source software for data management compromise how well I can protect it?”

“I’m a public utility – can I put meter data in the cloud safely? What about if it is used to drive actions at the edge?”

“I’m using drones for mapping and the data is in the cloud – am I exposed?”

–more–

Symposium Notes – Day Three Features Data Assembly

With 24 meetings under my belt from the first two days at Orlando Symposium, Wednesday’s 13 (and a presentation) didn’t look quite as daunting. It began well, with enough time for a muffin and some tea at 7:30 AM in the analyst workroom near the cubicle I’d spend the day in. Then I launched right into a couple of predictive analytics discussions.

–more–

Symposium Notes – Day Two Jumps in the (Data) Lake

My second day of Symposium 1:1 meetings continued the “security of big data” theme (4 of the day’s 15 conversations – usually, but not always, about HDFS-based data), with a data lake flavor. The concerns were retroactive – often driven by an internal audit. “We built it, now how do we secure it?” is a common question. And “it’s almost all structured data so far,” confirming what Gartner found in the 2016 big data survey. Vendor conversations (4 of the day’s 1:1s) also included a look at security – “how much is this going to matter to my customers? Who can I partner with?” has been a typical thread, and I met with a security consultancy whose practice seems to be ramping rapidly.

–more–

Symposium Notes – Day One Features Hadoop

Gartner Symposium is always exciting, challenging and stimulating for analysts; we get to interact with many organizations in a brief time during 1:1 meetings scheduled based on our coverage. It offers a fascinating snapshot of what is on people’s minds – enough so that they have traveled to a conference in part to have that discussion.

Today, October 17, 2016, was the first full day of the 2016 Orlando Symposium and over half of my meetings were about Hadoop.

–more–

Hadoop Project Commercial Support Tracker July 2016

There are now 15 projects supported by all 5 distributors I track, and several have had new releases since April. Kafka is the newest addition, and I believe the remaining 4-supporter offerings, Mahout and Hue, will remain unsupported by IBM, which has its own alternatives.

–more–

Hadoop Apache Project Commercial Support Tracker April 2016

There are now 19 commonly supported projects: Avro, Flume and Solr join the group supported by all 5 distributors and other changes appear as well.

For this version of the tracker (last updated in December), I’ve made one sizable change: Pivotal has been dropped as a “leading distributor,” reducing the number to five. Pivotal now relies on Hortonworks’ distro (as does Microsoft) as its commercial offering.

–more–

Strata Standards Stories: Different Stores For Different Chores

Has HDFS joined MapReduce in the emerging “legacy Hadoop project” category, continuing the swap-out of components that formerly answered the question “what is Hadoop?” Stores for data were certainly a focus at Strata/Hadoop World in NY, O’Reilly’s well-run, well-attended, and always impactful fall event. The limitations of HDFS, including its append-only nature, have become inconvenient enough to push the community to “invent” something DBMS vendors like Oracle did decades ago: a bypass. After some pre-event leaks about its arrival, Cloudera chose its Strata keynote to announce Kudu, a new columnstore written in C++, bypassing HDFS entirely. Kudu will use an Apache license and will be submitted to the Apache process at some undetermined future time.

–more–

Now, What is Hadoop?

This perennial question resurfaced recently in a thoughtful blog post by Andreas Neumann, Chief Architect of Cask, called “What is Hadoop, anyway?” Ultimately, after a careful deconstruction of the terms in the question, Andreas concludes:

“Does it really matter to agree on the answer to that question? In the end, everybody who builds an application or solution on Hadoop must pick the technologies that are right for the use case.”

We’ve agreed from the beginning – that is the only answer that really matters. Still, the question continues to come up for end users of the stack and for vendors like Cask (it helps them think about what to support in their application development offering, Cask Data App Platform (CDAP)).

Analysts too: I’ve discussed it several times, including a post a year ago called “What Is Hadoop….Now?” tracking the path from 6 commonly supported projects in 2012 to 15 in June 2014, across a set of distributors that included Cloudera, Hortonworks, MapR and IBM. “Support” here means you pay for a subscription that explicitly includes the named project.

This year, the expansion process has continued – and it does matter.

–more on Gartner blog–

Perspectives on Hadoop Part Two: Pausing Plans

By Merv Adrian and Nick Heudecker 

In the first post in this series, I looked at the size of revenue streams for RDBMS software and maintenance/support and noted that they amount to $33B, pointing out that pure play Hadoop vendors had a high hill to climb. (I didn’t say so specifically, but in 2014, Gartner estimates that the three leading vendors generated less than $150M.)

In this post, Nick and I turn from Procurement to Plans and examine the buying intentions uncovered in Gartner surveys.

–more in Gartner blog–

Perspectives on Hadoop: Procurement, Plans, and Positioning

I have the privilege of working for the world’s leading information technology research and advisory company, covering information management with a strong focus for the past few years on an emerging software stack called Hadoop. In the early part of 2015, that particular technology is moving from early adopter status to early majority in its marketplace adoption. The discussions and published work around it have been exciting and controversial, so in this post (and a couple to follow) I describe three interlocking research perspectives on Hadoop: procurement (counting real money actually spent); plans (surveys of intentions to invest); and positioning (subjective interpretations of what the first two mean).

Procurement Perspective: Hadoop is a (Very) Small Market Today

–more on Gartner blog–