IBM Acquires Netezza – ADBMS Consolidation Heats Up

IBM’s bid to acquire Netezza makes it official; the insurgents are at the gates. A pioneering and leading ADBMS player, Netezza is in play for approximately $1.7 billion or 6 times revenues. [Edited 9/30; previously said “earnings,” which is incorrect.] When it entered the market in 2001, it catalyzed an economic and architectural shift with an appliance form factor at a dramatically different price point. Titans like Teradata and Oracle (and yes, IBM) found themselves outmaneuvered as Netezza mounted a steadily improving business, adding dozens of new names every quarter, continuing to validate its market positioning as a dedicated analytic appliance. It’s no longer alone there; some analytic appliance play is now in the portfolio of most sizable vendors serious about the market.

Now with more than 350 customers, Netezza’s key markets include telco, financial services, government and retail, all places where IBM has a substantial presence. It’s strong in North America, and has made good inroads in Japan and lately Europe as well. All these will be dramatically enhanced and extended with the IBM marketing and sales engine, if Big Blue gets the positioning right. Some of Netezza’s business was coming through a relatively new distribution partnership with NEC in Asia; the status of that relationship is clearly going to be in play.

The positioning challenge? IBM’s already available Smart Analytics Systems (the mainframe System z-based 9600, the POWER-based 7600, and the x-based 5600) have been in-market for a while; their positioning was already somewhat muddled, and Netezza is another x-based product. Forrester’s Jim Kobielus blogged that Netezza’s “$20,000 per terabyte is the same starting price that Oracle asks for Exadata-based solutions,” and that “IBM has avoided the DW price war, playing a different, and equally valid, strategy of delivering complete solution appliances for both its professional services and its partner ecosystem to deliver and customize.” Jim’s right – though I would add that we have no way of telling whether in fact IBM has also avoided success, since they tell the analyst community nothing about the uptake of their offerings by comparison to the hundreds Netezza has sold in direct competition with them and others like Oracle – whose Exadata has evidently been making strong early inroads. Where does Netezza fit in the portfolio? For example, will its ambitions for EDW status be relegated to the “bigger” platforms (P and z)?

Today Netezza delivers two platforms: TwinFin for large data volumes, and Skimmer for smaller ones and for testing. In my recent conversations with him, VP of Product Management and Marketing Phil Francisco described its “sweet spot” today as being in the 10-20 TB range. Sampling and aggregation take too much detail away, the company says: to do analytics around the “long tail” of distribution in your data sets, you need more than just a sample. Collecting tens or hundreds of terabytes and running predictive analytics and optimization techniques provides more insight than conventional BI and dashboard reports. But those high-end volume numbers are where IBM positions the 7600 and 9600 – so even its other platforms are challenged here.

Netezza began with a proprietary hardware model; its hybrid architecture has an SMP system as a head node to perform SQL plan building and administration in front of an MPP array of worker nodes. But now in its fourth generation, it has moved to a commodity platform, built on “S-blades”: standard IBM blades plus a daughtercard sporting a field-programmable gate array (FPGA) processor that performs smart storage filtering operations, not unlike the ones now being emulated by Oracle, though the latter’s storage level functions are not as rich. Netezza’s storage innovations are based on “Zone Maps,” which keep track of key statistics such as the minimum and maximum value of columns in each storage extent. The Zone Map identifies which extents can hold data in the desired range, often avoiding general table scans and the associated enormous I/O overhead they create. FPGAs make further “smart” decisions: they PROJECT only the columns in the SELECT statement and RESTRICT to retrieve only the rows that satisfy the WHERE clause.
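To make the Zone Map idea concrete, here is a minimal Python sketch of the general technique. It is an illustration under stated assumptions, not Netezza’s implementation: the `Extent` class, the `build_extent` and `scan` functions, and the sample columns are all hypothetical names invented for this example. Each extent records a minimum and maximum per column, the scan skips any extent whose range cannot overlap the WHERE predicate, and the surviving rows are then restricted and projected.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, int]


@dataclass
class Extent:
    """A block of rows plus per-column (min, max) statistics (hypothetical)."""
    rows: List[Row]
    zone_map: Dict[str, Tuple[int, int]]


def build_extent(rows: Sequence[Row]) -> Extent:
    # Record the minimum and maximum of every column as the extent is written.
    columns = rows[0].keys()
    zone_map = {
        col: (min(r[col] for r in rows), max(r[col] for r in rows))
        for col in columns
    }
    return Extent(rows=list(rows), zone_map=zone_map)


def scan(extents: Sequence[Extent], select_cols: Sequence[str],
         where_col: str, low: int, high: int) -> Tuple[List[Row], int]:
    """Return matching rows and the number of extents actually read."""
    results: List[Row] = []
    extents_read = 0
    for extent in extents:
        col_min, col_max = extent.zone_map[where_col]
        # Zone map check: if the ranges cannot overlap, skip the extent entirely.
        if col_max < low or col_min > high:
            continue
        extents_read += 1
        for row in extent.rows:
            if low <= row[where_col] <= high:                     # RESTRICT
                results.append({c: row[c] for c in select_cols})  # PROJECT
    return results, extents_read


# Three extents holding order days 1-3, 4-6 and 7-9 (made-up sample data).
extents = [
    build_extent([
        {"order_day": d, "store": d % 3, "amount": d * 10}
        for d in range(start, start + 3)
    ])
    for start in (1, 4, 7)
]

# Roughly: SELECT order_day, amount WHERE order_day BETWEEN 5 AND 6
rows, extents_read = scan(extents, ["order_day", "amount"], "order_day", 5, 6)
print(rows, extents_read)
```

In this toy data only the middle extent overlaps the requested range, so the other two are never read. The payoff depends on how well the data is clustered on the filtered column: if matching values were scattered across every extent, the min/max check would rule nothing out.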

Multiple software “engines” sit atop this inside the database for matrix manipulation, Hadoop/MapReduce, and R. Netezza provides wrappers for Java, Python, Fortran, C, and C++, and wizards for creating UDFs that build the surrounding code automatically. It recently upped the analytic ante by adding i-Class, a library of functions that scale to use available memory, are maximally parallelized, and are callable from any language Netezza supports. Taking it a step further, the vendor offers wrappers around a set of functions from the GNU Scientific Library – 2000 of them – also callable from a SQL UDX (Netezza’s UDF). Another set of functions, from the R community’s CRAN repository (which contains 1900 packages and 4000-5000 functions), is available but has not been explicitly parallelized by Netezza – the functions may or may not have been written assuming a parallelized MPP platform. Still, they add flexibility and shorten time to delivery for developers. Will IBM port all these functions to DB2? Essentially they are simply algorithms implemented close to the data in various UDF-like forms – though “simple” does a disservice to the quality and power of the work involved. Or will the existing Smart Analytics Systems continue to be marketed primarily on the basis of their convenient pre-integrated setup and library of industry models? There are many decisions to be made.

Netezza has strong partnerships with MicroStrategy, Business Objects, SAS, and Tibco. Several plays were in progress that will require rationalization with IBM’s offerings: an agreement with Composite for the Netezza Data Virtualizer to provide federation across multiple Netezza appliances and a partnership with Cloudera to enable data movement and transformation between TwinFin and Cloudera’s Distribution for Hadoop (CDH) among them. These relationships are not unfamiliar territory for IBM, though Curt Monash, as always, offers some perceptive speculations in his post, including the likelihood that SPSS will displace Netezza’s SAS partnership. While I agree, SAS’s customer base dwarfs SPSS’ and I don’t imagine IBM’s BAO services team will leave all that opportunity on the table.

Branding is also a question worth considering, and was raised on the announcement call with analysts. IBM is likely to keep “Netezza”, whose brand has established value, as it did Informix and Cognos, as a secondary brand underneath the IBM label, at the lower end. But it will have much work to do, and of course none of it can start till the deal is done. Others may jump in but I doubt that IBM will be outbid unless someone goes off the deep end in valuation.

And the consolidation continues: Kickfire to Teradata, Greenplum to EMC, and now this are only the beginning. Aster, Kognitio, ParAccel and Vertica, among others, have had success with dozens to hundreds of customers. And the big jewel, Teradata, is hardly out of reach for the likes of HP. But that would require a vision we have not seen thus far from that quarter. What is clear is that the market – and the leading vendors – have accepted the key vector of change in the 21st-century database – the emergence of the specialty analytic platform. It’s hardly mainstream yet, notwithstanding the literally thousands of installations. There are many times that number of shops who still expect to go to the same old one-size-fits-all model, based on survey work Colin White and I have recently completed. They have yet to be convinced. If IBM can get its messaging in gear around Netezza, it can be a foil to Oracle’s Exadata as the tide rises. So far, Larry’s yacht seems to have caught the wind much more effectively.

Disclosures: IBM is a client of IT Market Strategy. Netezza is not, but sponsored a multi-vendor study we are conducting.

The portfolio of specialized engines also includes ones for Hadoop/MapReduce and R. Netezza has an SDK for Eclipse IDE plugin. It provides wrappers for Java, Python, Fortran, C, and C++, and wizards for creating UDFs that build the surrounding code automatically. For the extensive community using R, Netezza provides a GUI.

Published by Merv Adrian

Independent information technology market analyst and consultant, 40 years of industry experience, covering software in and around the data management space.

5 thoughts on “IBM Acquires Netezza – ADBMS Consolidation Heats Up”

  1. I may be old school but I believe we lose customer choice and innovation when we accept the compromise of tight integration. Real-time business intelligence requires innovation not integration. I think your comments above are all valid and especially when it comes down to embedding analytics and connecting them to hardware, I agree this gives users power today. But only for today, and maybe not even all the way to lunch time. I think it drives power in a paradigm where you accept other significant compromises. If the software engines you describe above were independent and open, worked with each other, and crucially with other vendors’ software, then I would declare victory, for we would have software. When we tie them in a proprietary format at the level appliances have to, I think we get a short term win and then ultimately run at the speed of the slowest component. The core issue is that as a customer you rely on your vendor to optimize every element, and to have picked all the right elements. The vendor gets one thing wrong and the next thing you know you have 70 lawnmowers strapped together with duct tape where one of the lawnmowers has a broken wheel. The other 69 just go slower.

    In Steve Lohr’s recent NY Times post, “I.B.M.’s Hybrid Strategy in Business Intelligence”, Lohr says:

    The real-time model of business intelligence, though, requires not just software, but a tight integration with hardware. I.B.M. has been working on this for years with tools tailored for high-speed processing of real-time data, like its System S technology for what is called “stream processing” — parsing data in streams in rather than after it is stored in data bases.

    If I may, as the teenagers in my house say, “no offense but…….” this statement is just plain inaccurate. “real-time model of business intelligence, though, requires not just software, but a tight integration with hardware”. This is wrong. What the real-time model for business intelligence needs is innovation not integration.

    I have worked in the software industry for over two decades and I love the business model. I fundamentally believe what Larry Ellison of Oracle used to opine (until he bought a hardware company): that software should be OPEN (anyone remember an Oracle Open World where he said that?)

    Software working on multiple platforms is good for customers as it forces the software industry to drive innovative solutions that work where and the way the customer wants them. I agree IBM has been working on tight integration between hardware and software. They make hardware.

    If you make software you should want it to work in as many environments as possible. It expands your market. You can sell to more customers. If you run on multiple environments you have to have the best software in every environment. Would customers want the third-best? Openness drives innovation. Openness drives customer choice.

    We saw technology vendor lock-in in the 1970s and 1980s. Do you have one of the following mainframes in your data centre? Burroughs, UNIVAC, NCR, Control Data, Honeywell, General Electric and RCA? I doubt it very much. Unix-based systems put paid to that with the portability of databases and applications. It opened up the data centre.

    I wrote earlier about trying to squeeze the last piece of performance out of old technology by tying hardware and software ever tighter together. Does that sound like a good idea? Tying things tighter together to make them faster? Does Chevron make cars? If they did, how much would a gallon of gas be?

    What if someone releases a new chip architecture that is better, faster, cheaper? A new operating system? New disk storage? Squeezing the last drop of performance out of old technology is not what is required. Openness drives innovation. It forces us to be creative and produce the best software we can. Abstracted from memory, CPU, disk, the operating system, we create the best software we can.

    Katherine Noyes of PC Week wrote a great article on this last week. She highlighted that Android-based tablets are better for business users than the Apple iPad precisely because they are based on Linux, and that their openness will drive innovation, reduce cost and increase flexibility over Apple’s closed system. The same is true here.

    I have long loved the software industry. We innovate, we create, we make something that would have cost $1M for a customer to write themselves and then sell it for $100K to 30 people. We all win. It is a model where everyone should win. It doesn’t always work, but the principle remains sound. Yes, there have been software vendors that over-promised and under-delivered; tighter integration in the stack does not stop this. There have also been many software companies that have created the tools to drive business to new heights. Let’s be the best we can be, not the second best.

    1. Mike, I imagine your role as the COO of Sand Technologies (a software competitor to this offering) has some bearing on your point of view. 😉

      But that said, we’re talking about a perennial tradeoff, and one that will never go away. There will always be attempts to create a tightly integrated, optimized stack to get the best performance available at the moment to customers who have neither the patience nor the resources to be their own systems integrator. At different times in the history of this industry that we both love, some offerings like that make sense. And of course, disruptive innovation happens in both dimensions – hardware and software – continuously, making it possible for the customers who DO have patience and technical knowhow to take advantage of those innovations. We’ve always had both, and we always will as long as the buyer population is diverse.

      You make an eloquent case for independent software companies, and I completely agree. But I also believe that packaging and integration have value. May both models continue to prosper.

  2. Thank you for a fair and balanced view. Fair to say I may be biased, but that doesn’t make me wrong 🙂

    I think we may be about to announce something that may give you the best of both worlds in one solution. Watch this space.

  3. Mike makes a great argument for why independent software – or for that matter hardware – development firms exist and why they will not go away anytime soon. Anyone who has been around the industry awhile surely has seen how lack of competition can stifle innovation and lead to vendor lock-in, but also how too many point solutions and lack of integration can add complexity, slow productivity and raise costs. I also think there are certain classes of software solutions that are better integrated than others and we can certainly argue whether that includes the analytics space or not. Regardless, IBM/Netezza makes a very plausible argument for how companies can benefit from implementing a single tool across multiple departments including Finance, Marketing, LOBs or IT. Ultimately, each organization has to decide what’s best for its unique situation.

    1. Thanks for jumping in, Gary. I agree that some combinations make more sense than others. These days the interface to the physical storage is getting lots of attention, as is interconnect technology in scale-out systems. And it only makes sense that several vendors are seeking to optimize at that boundary.
