Hadoop Summit Recap Part Two – SELECT FROM hdfs WHERE bigdatavendor USING SQL

Probably the most widespread, and commercially imminent, theme at the Summit was “SQL on Hadoop.” Since last year, many offerings have been touted, debated, and some have even shipped. In this post, I offer a brief look at where things stood at the Summit and how we got there. To net it out: offerings today range from the not-even-submitted to GA – if you’re interested, a bit of familiarity will help. Even more useful: patience.


EMC Buys Greenplum – Big Data Realignment Continues

EMC’s acquisition of Greenplum, announced today as a cash transaction, reaffirms the obvious: the Big Data tsunami upends conventional wisdom. It has already reshaped the market, spawning the most ferment in the RDBMS (and non-R DBMS via the NoSQL players) space in years. When I first posted on Greenplum over a year ago, I said that

“Open source + capital has created an intriguing new model of rapid innovation in “mature” markets, and the database space – like BI – is not a done deal. It is indeed possible to escape the gravity well, if you execute. Greenplum is getting it done, and is among the new stars to watch.”

Why the open source reference? Greenplum uses a parallelization layer atop PostgreSQL (like Aster, another of the new breed of ADBMS).

Now EMC has written the next chapter in that story. In the process, it adds a new piece (after literally dozens of others in the past few years) to its own portfolio, which already includes unstructured data (via Documentum) and virtualization (via VMware), layered in among the industry-leading storage and information management pieces. Disruptive? You bet. Is EMC finished? I doubt it. Candidates? BI tools, ETL, MDM, data integration come to mind. Losers? At least one big one. Read on.

Does Informatica get a place at the head table?

From Judith Hurwitz, president, Hurwitz & Associates (http://jshurwitz.wordpress.com).

Informatica might be thought of as the last independent data management company standing. In fact, that used to be Informatica’s main positioning in the market. That has begun to change over the last few years as Informatica has continued to make strategic acquisitions. Over the past two years Informatica has purchased five companies; the most recent was Siperian, a significant player in Master Data Management solutions. These acquisitions have paid off. Today Informatica has passed the $500 million revenue mark with about 4,000 customers. It has deepened its strategic partnerships with HP, Accenture, salesforce.com, and MicroStrategy. In a nutshell, Informatica has made the transition from a focus on ETL (Extract, Transform, Load) tools to support data warehouses to a company focused broadly on managing information. Merv Adrian did a great job of providing context for Informatica’s strategy and acquisitions. To transition itself in the market, Informatica has set its sights on data service management: a culmination of data integration, master data management, data transformation, and predictive analytics, applied in a holistic manner across departments, divisions, and business partners.

Programmers: Pervasive’s Parallelization Provides Punch, Profit

After 27 years of steady growth, Austin, Texas-based Pervasive (PVSW) has become a $47M annual run rate software provider. Its portfolio includes a “zero admin, light footprint database” (the former Btrieve, now PervasiveSQL), data integration software (for SaaS and on-premises applications), and data synchronization products for such apps as salesforce.com, QuickBooks and Microsoft Dynamics CRM. In 2009, it began offering its DataRush processing engine as a product for companies that want to take advantage of multicore architectures to drive dramatically enhanced performance on much smaller footprints. DataRush targets programming of data services tasks such as aggregation, de-duplication, cleansing, integration, matching and sorting, as well as data mining and predictive analytics.

Kickfire Disrupts DW Economics, Targets Mainstream ADBMS Opportunities

In just 18 months, Kickfire has established itself as one of the most intriguing of the ADBMS insurgents. It espouses a radical go-to-market strategy: target the overwhelming majority of the market in the sub-5TB space, and let others battle over who’s doing best at the top end, fighting over a small group of prospects. Kickfire also takes a radically different architectural approach: it uses an “SQL chip” to run much of its work in hardware, to dramatic effect in performance.

In April 2008, the Kickfire data warehouse appliance was announced at a MySQL conference, and simultaneously the company released 100GB and 300GB TPC-H benchmarks that transformed price-performance expectations at the low end of the market. Six months later the appliance became generally available, and six months after that it had its first production reference. Since then, the company has had two encouraging quarters, and the product is now in the hands of some two dozen early adopters, a half dozen of whom are referenceable production sites. I spent some time recently with Kickfire CEO Bruce Armstrong to discuss the story so far, and Kickfire’s recent announcement of Kickfire 1.5 and the 3000 series appliance.


IBM’s Smart Analytics System: More Than An Appliance?

When is an appliance not an appliance? When it’s more. On July 28, IBM’s Software Group and Systems and Technology Group (i.e., the hardware folks) hosted an analyst event to introduce the Smart Analytics System. The discussion began with a series of conversations about the value of “workload optimization,” or the effective tuning of processors, storage, memory and network components with software used for information management. Not controversial, but hardly news. IBM claims to be raising the bar, though, with the promise of a system that is already tuned, and attuned to the needs of its purchaser, at a level far beyond appliances that other vendors have delivered: appliances, if you will, not only predesigned for specific use cases, but customized for specific instances of those use cases. It’s no accident that IBM never called the Smart Analytics System an “appliance.” Extending the Smart brand here is a powerful move, and IBM appears poised to make good on its promise.

GoldenGate Software Buy a Win for Oracle

Oracle today announced it is buying GoldenGate Software for an undisclosed sum, likely a couple of hundred million dollars. To revisit some facts from an earlier post, GoldenGate had been in business 15 years, with some 500 customers, 4000 solutions deployed, and strong partnerships with Oracle, Teradata and Ingres on the database side, and MicroStrategy and Amdocs in the app and BI space. Their message revolved around 3 key attributes of their changed-data-based replication technology: heterogeneity, real-time (log-based) performance, and high-volume transactional support.

Can GoldenGate Software Continue to Grow Transactional Replication?

GoldenGate Software may not be a well-known name, except in circles where transactional replication is a hot topic, but after 15 years in business, they have assembled a sizable base of some 500 customers, with 4000 solutions deployed, and partnerships with vendors as diverse as Teradata and Ingres on the database side, and MicroStrategy and Amdocs in the app and BI space. Their message revolves around 3 key attributes of their changed-data-based replication technology: heterogeneity, real-time (log-based) performance, and high-volume transactional support (committed only). And despite their notoriously closed-mouthed approach to their finances, it’s fair to say that they are generating tens of millions of dollars in revenue yearly (Hoover’s says $9.7M in 2007, but I believe that’s very low), so it’s evident the marketplace is interested. The big question is whether GoldenGate will invest to sustain and grow sales, or watch larger competitors take their market away, now that they’re on the radar.

IBM InfoSphere Now Supports Informix and z

IBM’s InfoSphere Data Warehouse has been a steady growth asset. As IBM has created and acquired pieces of the infrastructure and progressively created a more complete, end-to-end offering, it has continued to add new customers to (and from) one of the largest installed bases in the world. In reviewing 2008, IBM CFO Mark Loughridge asserted compound growth of 18% since 2006. For 2008, the claim is 100 more transactions and 50 InfoSphere customers new to DB2, while in Q4 “distributed (non-mainframe) DB2” grew 30% in constant currency terms.

Birst Hopes to Ride On-Demand BI Wave

Birst CEO Brad Peters checked in with IT Market Strategy to update us on the most recent developments in the on-demand BI market. They’ve been busy; when we last talked in January, Birst had just hired Randi DiPrima to head up a global partners program, and a significant new round of financing was freshly deposited.