Hadoop Commercial Support Component Tracker – March 2017

Stack expansion has ground to a halt. The last time an Apache project was added to the list of those most supported by leading Hadoop distribution vendors was July 2016, when Kafka joined the other 14 then commonly included. Since then, no broad support for new projects has emerged. The only project that does seem successful is the new e-scooter. With its new style and long lasting battery, it can´t fail.

–more–

Google Cloud Spanner Enters With a Splash

This post was authored by Rick Greenwald, Merv Adrian and Donald Feinberg

Last week, Google launched its internal Cloud Spanner DBMS into a public beta. Claiming to be both strongly consistent (like a relational DBMSs) and horizontally scalable (like NoSQL DBMSs), Cloud Spanner’s internal use has given Google time to exploit unique physical characteristics of its cloud.

–more–

Strata Standards Stories: Different Stores For Different Chores

Has HDFS joined MapReduce in the emerging “legacy Hadoop project” category, continuing the swap-out of components that formerly answered the question “what is Hadoop?” Stores for data were certainly a focus at Strata/Hadoop World in NY, O’Reilly’s well-run, well-attended, and always impactful fall event. The limitations of HDFS, including its append-only nature, have become inconvenient enough to push the community to “invent” something DBMS vendors like Oracle did decades ago: a bypass. After some pre-event leaks about its arrival, Cloudera chose its Strata keynote to announce Kudu, a new columnstore written in C++, bypassing HDFS entirely. Kudu will use an Apache license and will be submitted to the Apache process at some undetermined future time.

more

Hadoop Projects Supported By Only One Distribution

The Apache Software Foundation has succeeded admirably in becoming a place where new software ideas are developed: today over 350 projects are underway. The challenges for the Hadoop user are twofold: trying to decide which projects might be useful in big data-related cases, and determining which are supported by commercial distributors. In Now, What is Hadoop? And What’s Supported? I list 10 supported by only one: Atlas, Calcite, Crunch, Drill, Falcon, Kite, LLAMA, Lucene, Phoenix and Presto. Let’s look at them a little more.

–more–

Strata Spark Tsunami – Hadoop World, Part One

New York’s Javits Center is a cavernous triumph of form over function. Giant empty spaces were everywhere at this year’s empty-though-sold-out Strata/Hadoop World, but the strangely-numbered, hard to find, typically inadequately-sized rooms were packed. Some redesign will be needed next year, because the event was huge in impact and demand will only grow. A few of those big tent pavilions you see at Oracle Open World or Dreamforce would drop into the giant halls without a trace – I’d expect to see some next year to make some usable space available.

So much happened, I’ll post a couple of pieces here. Last year’s news was all about promises: Hadoop 2.0 brought the promise of YARN enabling new kinds of processing, and there was promise in the multiple emerging SQL-on-HDFS plays. The Hadoop community was clearly ready to crown a new hype king for 2014.

This year, all that noise had jumped the Spark.

— This post is continued on my Gartner blog —

Amazon Redshift Disrupts DW Economics – But Nothing Comes Without Costs

At its first re:Invent conference in Late November, Amazon announced Redshift, a new managed service for data warehousing. Amazon also offered details and customer examples that made AWS’  steady inroads toward enterprise, mainstream application acceptance very visible.

Redshift is made available via MPP nodes of 2TB (XL) or 16TB (8XL), running Paraccel’s high-performance columnar, compressed DBMS, scaling to 100 8XL nodes, or 1.6PB of compressed data. XL nodes have 2 virtual cores, with 15GB of memory, while 8XL nodes have 16 virtual cores and 120 GB of memory and operate on 10Gigabit ethernet.

Reserved pricing (the more likely scenario, involving a commitment of 1 year or 3 years) is set at “under $1000 per TB per year” for a 3 year commitment, combining upfront and hourly charges. Continuous, automated backup for up to 100% of the provisioned storage is free. Amazon does not charge for data transfer into or out of the data clusters. Network connections, of course, are not free  – see Doug Henschen’s Information Week story for details.

This is a dramatic thrust in pricing, but it does not come without giving up some things.

More…

Cloudera-Informatica Deal Opens Broader Horizons for Both

Cloudera‘s continuing focus on the implications of explosive data growth has led it to another key partnership, this time with Informatica. Connecting to the dominant player in data integration and data quality expands the opportunity for Cloudera dramatically; it enables the de facto commercial Hadoop leader to find new ways to empower the “silent majority” of data. The majority of data is outside; not just outside enterprise data warehouses, but outside RDBMS instances entirely. Why? Because it doesn’t need all the management features database management software provides – it doesn’t get updated regularly, for example. In fact, it may not be used very often at all, though it does need to be persisted for a variety of reasons. I recently mentioned Cloudera’s success of late; it’s going to be challenged by some big players in 2011, notably IBM, whose recent focus on Hadoop has been remarkably nimble. So these deals matter. A lot. The Data Management function is being refactored before our eyes; both these vendors will play in its future. Read more of this post

Living in the Present is SO Yesterday

It’s an occupational hazard of living in the future that analysts can begin to ignore the present – unless we make it a practice to seek it out. Here in the Valley, that can be difficult, when being a week behind the latest version of something the rest of the world hasn’t heard of yet equates to being a luddite. That can lead to AADD (analyst attention deficit disorder.) Read more of this post

How the Cloud Will Lead Us to Industrial Computing

From  Judith Hurwitz, president, Hurwitz & Associates (http://jshurwitz.wordpress.com)

I spent the other week at a new conference called Cloud Connect. Being able to spend four days emerged in an industry discussion about cloud computing really allows you to step back and think about where we are with this emerging industry. While it would be possible to write endlessly about all the meeting and conversations I had, you probably wouldn’t have enough time to read all that. So, I’ll spare you and give you the top four things I learned at Cloud Connect. I recommend that you also take a look at Brenda Michelson’s blogs from the event for a lot more detail. I would also refer you to Joe McKendrick’s blog from the event. Read more of this post

Judith Hurwitz Comments on Cloud Impact on HW Biz

My longtime friend and colleague Judith Hurwitz and I have decided to cross-post on one another’s blogs (hers is at jshurwitz.wordpress.com). I’m delighted to have her here. For me, this is another step in the continuing evolution of the loosely coupled independent analyst collaborations I find myself participating in more and more, and a very exciting development. Welcome, Judith!

=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~

I am thrilled to be contributing my “cloudy” observations to your blog. I have been an analyst and consultant focusing on distributed software. I look at everything from service oriented architectures, service management, and even information management. My philosophy is that cloud computing, in all its iterations, is the future of a significant portion of enterprise software.  Judith Hurwitz, President, Hurwitz & Associates

I thought I would provide my thoughts on the future of hardware in the context of where software is headed.

It is easy to assume that with the excitement around cloud computing would put a damper on the hardware market. But I have news for you. I am predicting that over the next few years hardware will be front and center.  Why would I make such a wild prediction? Here are my three reasons: Read more of this post