Strata Spark Tsunami – Hadoop World, Part One

New York’s Javits Center is a cavernous triumph of form over function. Giant empty spaces were everywhere at this year’s empty-though-sold-out Strata/Hadoop World, but the strangely-numbered, hard to find, typically inadequately-sized rooms were packed. Some redesign will be needed next year, because the event was huge in impact and demand will only grow. A few of those big tent pavilions you see at Oracle Open World or Dreamforce would drop into the giant halls without a trace – I’d expect to see some next year to make some usable space available.

So much happened, I’ll post a couple of pieces here. Last year’s news was all about promises: Hadoop 2.0 brought the promise of YARN enabling new kinds of processing, and there was promise in the multiple emerging SQL-on-HDFS plays. The Hadoop community was clearly ready to crown a new hype king for 2014.

This year, all that noise had jumped the Spark.

– This post is continued on my Gartner blog –

Hadoop Investments Continue: Teradata, HP Jockey For Position

Interest from the leading players continues to drive investment in the Hadoop marketplace. This week Teradata made two acquisitions – Revelytix and Hadapt – that enrich its already sophisticated big data portfolio, while HP made a $50M investment in, and joined the board of, Hortonworks. These moves continue the ongoing effort by leading players. 4 of the top 5 DBMS players (Oracle, Microsoft, IBM, SAP and Teradata) and 3 of the top 7 IT companies (Samsung, Apple, Foxconn, HP, IBM, Hitachi, Microsoft) have now made direct moves into the Hadoop space. Oracle’s recent Big Data Appliance and Big Data SQL, and Microsoft’s HDInsight represent substantial moves to target Hadoop opportunities, and these Teradata and HP moves mean they don’t want to be left behind.

more

 

BYOH – Hadoop’s a Platform. Get Used To It.

When is a technology offering a platform? Arguably, when people build products assuming it will be there. Or extend their existing products to support it, or add versions designed to run on it. Hadoop is there. The age of Bring Your Own Hadoop (BYOH) is clearly upon us.  Specific support for components such as Pig and Hive vary, as do capabilities and levels of partnership in development, integration and co-marketing. Some vendors are in many categories – for example, Pentaho and IBM at opposite ends of the size spectrum interact with Hadoop in development tools, data integration, BI, and other ways. A few category examples, by no means exhaustive:

—more—

Hadoop Summit Recap Part Two – SELECT FROM hdfs WHERE bigdatavendor USING SQL

Probably the most widespread, and commercially imminent, theme at the Summit was “SQL on Hadoop.” Since last year, many offerings have been touted, debated, and some have even shipped. In this post, I offer a brief look at where things stood at the Summit and how we got there. To net it out: offerings today range from the not-even-submitted to GA – if you’re interested, a bit of familiarity will help. Even more useful: patience.

–more–

Hadoop 2013 – Part Three: Platforms

In the first two posts in this series, I talked about performance and projects as key themes in Hadoop’s watershed year. As it moves squarely into the mainstream, organizations making their first move to experiment will have to make a choice of platform. And – arguably for the first time in the early mainstreaming of an information technology wave – that choice is about more than who made the box where the software will run, and the spinning metal platters the bits will be stored on.There are three options, and choosing among them will have dramatically different implications on the budget, on the available capabilities, and on the fortunes of some vendors seeking to carve out a place in the IT landscape with their offerings.

– more –

Apache Hadoop 1.0 Doesn’t Clear Up Trunks and Branches Questions. Do Distributions?

In early January 2012, the world of big data was treated to an interesting series of product releases, press announcements, and blog posts about Hadoop versions.  To begin with, we had the announcement of Apache version 1.0 at long last, in a press release. Although there were grumblings here and there in the twittersphere that changes to release numbers are meaningless, my discussions with Gartner’s enterprise customers indicate otherwise. Products with release numbers like 0.20.2 make the hair on Procurement’s neck stand on end, and as Hadoop begins to get mainstream attention (Gartner’s clients, see Hype Cycle for Data Management 2011), IT architects and executives find such optics quite important. Hadoop is moving beyond pioneers like Amazon, Yahoo! and LinkedIn into shops like JP Morgan Chase, and they pay attention to such things.

…more…

Cloudera-Informatica Deal Opens Broader Horizons for Both

Cloudera‘s continuing focus on the implications of explosive data growth has led it to another key partnership, this time with Informatica. Connecting to the dominant player in data integration and data quality expands the opportunity for Cloudera dramatically; it enables the de facto commercial Hadoop leader to find new ways to empower the “silent majority” of data. The majority of data is outside; not just outside enterprise data warehouses, but outside RDBMS instances entirely. Why? Because it doesn’t need all the management features database management software provides – it doesn’t get updated regularly, for example. In fact, it may not be used very often at all, though it does need to be persisted for a variety of reasons. I recently mentioned Cloudera’s success of late; it’s going to be challenged by some big players in 2011, notably IBM, whose recent focus on Hadoop has been remarkably nimble. So these deals matter. A lot. The Data Management function is being refactored before our eyes; both these vendors will play in its future. Read more of this post

Microsoft Leaps Late, Lags with SQL Server PDW

Microsoft chose a user group meeting, Professional Association for SQL Server (PASS), for the rollout of its long-awaited, and late, SQL Server 2008 R2 Parallel Data Warehouse (note, yet again, how foolish it is for vendors to trap themselves with dates in product names.) PDW is late to market; there are other MPP DBMS players there already, and Microsoft is behind in functionality compared to some of them. Some of the most eagerly–awaited features are evidently not slated for the first release. It’s also far behind its originally planned ship date following the acquisition of DatAllegro in 2008. Read more of this post

Oracle’s Exadata Refresh Ups Ante on Technology and Selling Strategy

The Exadata marketing story is unrelenting, and Oracle backed it with plenty of happy customers for analysts to query at Open World this year. The stories were compelling; I’ll mention a few below. In the analyst pitch, we were shown a couple of dozen logos – good for a still relatively new high-end, long sales cycle, longer still production ramp up, product. The numbers are not Teradata rates yet, but CEO Larry Ellison claims a $1.5B pipeline.  Whether you believe it or don’t, he’s telling the world – and if he misses by much, Wall Street will spank the stock, so personally I doubt that he’s pushing too far past his real expectations. The big news, of course, was a refresh of the product itself, as Oracle gets deeper into the power of leveraging hardware and software design together. Read more of this post

Calpont’s InfiniDB – Another ADBMS Insurgent Arises

Calpont, rapidly emerging as yet another contender in the ADBMS sweepstakes, has announced version 2.0 of InfiniDB, its columnar MPP offering over shared storage. The value proposition hits now-familiar themes: high-performance query, fast data loading, data compression, and parallelized user defined functions (UDFs), all of which are becoming key checkoff capabilities. InfiniDB also hits hard on pricing, which it says dramatically undercuts that of its competitors. And a 30-day free trial of the enterprise edition sweetens the offer. For those comfortable with open source, the 2.0 release of the  community edition is available as well. Calpont says the community edition (which is limited to a single server but is otherwise database feature-complete) has had 15,000 downloads. But the company’s relationship with Oracle for its MySQL components must be considered a risk going forward.

InfiniDB, like Infobright, is built atop Oracle’s MySQL. (I posted about Infobright last year, and it also has made significant progress, drawing favorable comment in the open source community for its continuing maturation.)  Calpont’s relationship with Oracle must be seen as a risk factor..Oracle’s recent decisions about support raise questions about its interest in supporting anyone who is not an enterprise-class user of the Oracle-branded MySQL offering. Calpont has a deal through 2012 that includes an OEM license to integrate and use MySQL as the InfiniDB branded solution, and access to the MySQL channel. What will happen beyond that is clearly a concern. Read more of this post

Follow

Get every new post delivered to your Inbox.

Join 16,943 other followers