Hadoop 2013 – Part Three: Platforms

In the first two posts in this series, I talked about performance and projects as key themes in Hadoop’s watershed year. As it moves squarely into the mainstream, organizations making their first move to experiment will have to make a choice of platform. And – arguably for the first time in the early mainstreaming of an information technology wave – that choice is about more than who made the box where the software will run, and the spinning metal platters the bits will be stored on.There are three options, and choosing among them will have dramatically different implications on the budget, on the available capabilities, and on the fortunes of some vendors seeking to carve out a place in the IT landscape with their offerings.

– more –

Apache Hadoop 1.0 Doesn’t Clear Up Trunks and Branches Questions. Do Distributions?

In early January 2012, the world of big data was treated to an interesting series of product releases, press announcements, and blog posts about Hadoop versions.  To begin with, we had the announcement of Apache version 1.0 at long last, in a press release. Although there were grumblings here and there in the twittersphere that changes to release numbers are meaningless, my discussions with Gartner’s enterprise customers indicate otherwise. Products with release numbers like 0.20.2 make the hair on Procurement’s neck stand on end, and as Hadoop begins to get mainstream attention (Gartner’s clients, see Hype Cycle for Data Management 2011), IT architects and executives find such optics quite important. Hadoop is moving beyond pioneers like Amazon, Yahoo! and LinkedIn into shops like JP Morgan Chase, and they pay attention to such things.

…more…

Cloudera-Informatica Deal Opens Broader Horizons for Both

Cloudera‘s continuing focus on the implications of explosive data growth has led it to another key partnership, this time with Informatica. Connecting to the dominant player in data integration and data quality expands the opportunity for Cloudera dramatically; it enables the de facto commercial Hadoop leader to find new ways to empower the “silent majority” of data. The majority of data is outside; not just outside enterprise data warehouses, but outside RDBMS instances entirely. Why? Because it doesn’t need all the management features database management software provides – it doesn’t get updated regularly, for example. In fact, it may not be used very often at all, though it does need to be persisted for a variety of reasons. I recently mentioned Cloudera’s success of late; it’s going to be challenged by some big players in 2011, notably IBM, whose recent focus on Hadoop has been remarkably nimble. So these deals matter. A lot. The Data Management function is being refactored before our eyes; both these vendors will play in its future. Read more of this post

Microsoft Leaps Late, Lags with SQL Server PDW

Microsoft chose a user group meeting, Professional Association for SQL Server (PASS), for the rollout of its long-awaited, and late, SQL Server 2008 R2 Parallel Data Warehouse (note, yet again, how foolish it is for vendors to trap themselves with dates in product names.) PDW is late to market; there are other MPP DBMS players there already, and Microsoft is behind in functionality compared to some of them. Some of the most eagerly–awaited features are evidently not slated for the first release. It’s also far behind its originally planned ship date following the acquisition of DatAllegro in 2008. Read more of this post

Oracle’s Exadata Refresh Ups Ante on Technology and Selling Strategy

The Exadata marketing story is unrelenting, and Oracle backed it with plenty of happy customers for analysts to query at Open World this year. The stories were compelling; I’ll mention a few below. In the analyst pitch, we were shown a couple of dozen logos – good for a still relatively new high-end, long sales cycle, longer still production ramp up, product. The numbers are not Teradata rates yet, but CEO Larry Ellison claims a $1.5B pipeline.  Whether you believe it or don’t, he’s telling the world – and if he misses by much, Wall Street will spank the stock, so personally I doubt that he’s pushing too far past his real expectations. The big news, of course, was a refresh of the product itself, as Oracle gets deeper into the power of leveraging hardware and software design together. Read more of this post

Calpont’s InfiniDB – Another ADBMS Insurgent Arises

Calpont, rapidly emerging as yet another contender in the ADBMS sweepstakes, has announced version 2.0 of InfiniDB, its columnar MPP offering over shared storage. The value proposition hits now-familiar themes: high-performance query, fast data loading, data compression, and parallelized user defined functions (UDFs), all of which are becoming key checkoff capabilities. InfiniDB also hits hard on pricing, which it says dramatically undercuts that of its competitors. And a 30-day free trial of the enterprise edition sweetens the offer. For those comfortable with open source, the 2.0 release of the  community edition is available as well. Calpont says the community edition (which is limited to a single server but is otherwise database feature-complete) has had 15,000 downloads. But the company’s relationship with Oracle for its MySQL components must be considered a risk going forward.

InfiniDB, like Infobright, is built atop Oracle’s MySQL. (I posted about Infobright last year, and it also has made significant progress, drawing favorable comment in the open source community for its continuing maturation.)  Calpont’s relationship with Oracle must be seen as a risk factor..Oracle’s recent decisions about support raise questions about its interest in supporting anyone who is not an enterprise-class user of the Oracle-branded MySQL offering. Calpont has a deal through 2012 that includes an OEM license to integrate and use MySQL as the InfiniDB branded solution, and access to the MySQL channel. What will happen beyond that is clearly a concern. Read more of this post

EMC Jumps Into ADBMS Appliance Game

The Data Computing Appliance, first deliverable from EMC’s acquisition of Greenplum, was announced last month, only 75 days after the acquisition closed, and it doesn’t lack for ambition.  Pat Gelsinger, President and Chief Operating Officer, EMC Information Infrastructure, pointed to the high level opportunity: unlocking the “hidden value” of enormous and growing data assets every company is increasingly holding, and often failing to leverage. The appliance will reach many hitherto untapped resources in the data centers that EMC occupies. Adding EMC’s manufacturing, sales and marketing, and reference architectures to the Greenplum IP brings what Gelsinger calls Greenplum’s “first phase” to its completion. And begins what is likely to be a sizable battle with Oracle, Teradata and IBM, if EMC mounts campaigns and spending to match its ambitious vision. Read more of this post

IBM Acquires Netezza – ADBMS Consolidation Heats Up

IBM’s bid to acquire Netezza makes it official; the insurgents are at the gates. A pioneering and leading ADBMS player, Netezza is in play for approximately $1.7 billion or 6 times revenues [edited 9/30; previously said "earnings," which is incorrect.] When it entered the market in 2001, it catalyzed an economic and architectural shift with an appliance form factor at a dramatically different price point. Titans like Teradata and Oracle (and yes, IBM) found themselves outmaneuvered as Netezza mounted a steadily improving business, adding dozens of new names every quarter, continuing to validate its market positioning as a dedicated analytic appliance. It’s no longer alone there; some analytic appliance play is now in the portfolio of most sizable vendors serious about the market. Read more of this post

Kalido “Cascades” Continue Cadence on Designed DW Development

Kalido‘s ongoing evangelization of automation for governed, designed data warehouses has delivered fine results for the small, Massachusetts-based firm. In a recent conversation, the team shared recent results: a profitable fiscal year, with a Q4 that was up 35% and momentum that carried into the traditionally slow Q1 with 25% year over year growth. Since I last discussed Kalido at the time of its virtual conference a year ago, new name sales in the US and Europe as well as add-on business in existing accounts are a healthy sign . New partnerships, new data source support,  and a new release all are likely to sustain and even increase the momentum in the  autumn and winter selling seasons. Read more of this post

More TDWI Notes – ParAccel Rolling On, HP Stalled, Vertica Leading Insurgents

On my second day at TDWI, I was in meetings all day – events like this are a great opportunity for analysts to catch up with many of the companies they follow at one time, and this particular one was packed with sponsors. Congrats to the folks who sell sponsorships – they had a packed exhibit hall, and a lot of very interested attendees. I got a chance to chat at a few booths (all buzzing), ask a few attendees some real-world questions (and was asked some surprising ones myself), and get a sense of the workload in the trenches (heavy and growing.)

Read more of this post

Follow

Get every new post delivered to your Inbox.

Join 110 other followers