IBM’s bid to acquire Netezza makes it official; the insurgents are at the gates. A pioneering and leading ADBMS player, Netezza is in play for approximately $1.7 billion or 6 times revenues [edited 9/30; previously said “earnings,” which is incorrect]. When it entered the market in 2001, it catalyzed an economic and architectural shift with an appliance form factor at a dramatically different price point. Titans like Teradata and Oracle (and yes, IBM) found themselves outmaneuvered as Netezza built a steadily improving business, adding dozens of new names every quarter and continuing to validate its market positioning as a dedicated analytic appliance. It’s no longer alone there; some analytic appliance play is now in the portfolio of most sizable vendors serious about the market.
Now with more than 350 customers, Netezza’s key markets include telco, financial services, government and retail, all places where IBM has a substantial presence. It’s strong in North America, and has made good inroads in Japan and lately Europe as well. All these will be dramatically enhanced and extended with the IBM marketing and sales engine, if Big Blue gets the positioning right. Some of Netezza’s business was coming through a relatively new distribution partnership with NEC in Asia; the status of that relationship is clearly going to be in play.
The positioning challenge? IBM’s existing Smart Analytics Systems (the mainframe System z-based 9600, the POWER-based 7600, and the x-based 5600) have been in-market for a while; their positioning was already somewhat muddled, and Netezza is another x-based product. Forrester’s Jim Kobielus blogged that Netezza’s “$20,000 per terabyte is the same starting price that Oracle asks for Exadata-based solutions,” and that “IBM has avoided the DW price war, playing a different, and equally valid, strategy of delivering complete solution appliances for both its professional services and its partner ecosystem to deliver and customize.” Jim’s right, though I would add that we have no way of telling whether IBM has also avoided success, since they tell the analyst community nothing about the uptake of their offerings, by comparison to the hundreds Netezza has sold in direct competition with them and others like Oracle, whose Exadata has evidently been making strong early inroads. Where does Netezza fit in the portfolio? For example, will its ambitions for EDW status be relegated to the “bigger” platforms (P and z)?
Today Netezza delivers two platforms: TwinFin for large data volumes, and Skimmer for smaller ones and for testing. In my recent conversations with VP of Product Management and Marketing Phil Francisco, he described its “sweet spot” today as being in the 10-20 TB range. Sampling and aggregation take too much detail away, the company says: to do analytics around the “long tail” of distribution in your data sets, you need more than just a sample. Collecting tens or hundreds of terabytes and running predictive analytics and optimization techniques provides more insight than conventional BI and dashboard reports. But those high-end volume numbers are where IBM positions the 7600 and 9600, so even its other platforms are challenged here.
Netezza began with a proprietary hardware model; its hybrid architecture has an SMP system as a head node to perform SQL plan building and administration in front of an MPP array of worker nodes. But now in its fourth generation, it has moved to a commodity platform, built on “S-blades”: standard IBM blades plus a daughtercard sporting a field programmable gate array (FPGA) processor that performs smart storage filtering operations, not unlike the ones now being emulated by Oracle, though the latter’s storage level functions are not as rich. Netezza’s storage innovations are based on “Zone Maps,” which keep track of key statistics such as the minimum and maximum value of columns in each storage extent. The Zone Map identifies which data falls in the desired data range, often avoiding general table scans and the associated enormous I/O overhead they create. FPGAs make further “smart” decisions: they PROJECT only the columns in the SELECT statement and RESTRICT to retrieve only the rows in the WHERE clause.
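To make the Zone Map idea concrete, here is a minimal Python sketch of the general technique described above: keep the minimum and maximum of a column per storage extent, skip extents that cannot match the range, then RESTRICT rows and PROJECT columns on the extents that are read. It is an illustration only, not Netezza's implementation; the names `Extent`, `build_zone_map` and `scan`, and the sample `order_day`/`amount`/`store` data, are hypothetical, and the filtering runs in ordinary software rather than on an FPGA.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    """A hypothetical storage extent holding a non-empty batch of rows (dicts)."""
    rows: list

    def zone(self, column):
        # Min and max of one column within this extent.
        values = [row[column] for row in self.rows]
        return min(values), max(values)


def build_zone_map(extents, column):
    # One (min, max) pair per extent, in extent order.
    return [extent.zone(column) for extent in extents]


def scan(extents, zone_map, column, lo, hi, select):
    """Return rows with lo <= row[column] <= hi, keeping only `select` columns."""
    extents_read = 0
    result = []
    for extent, (zmin, zmax) in zip(extents, zone_map):
        if zmax < lo or zmin > hi:
            continue  # the extent cannot contain a match, so it is never read
        extents_read += 1
        for row in extent.rows:
            if lo <= row[column] <= hi:                       # RESTRICT
                result.append({c: row[c] for c in select})    # PROJECT
    return result, extents_read


# Illustrative data: three extents covering days 1-3, 4-6 and 7-9.
extents = [
    Extent([{"order_day": d, "amount": d * 10, "store": "A"}
            for d in range(start, start + 3)])
    for start in (1, 4, 7)
]
zone_map = build_zone_map(extents, "order_day")
rows, extents_read = scan(extents, zone_map, "order_day", 4, 5,
                          ["order_day", "amount"])
print(rows, extents_read)
```

In this toy case only one of the three extents is touched. The payoff depends on data being stored in roughly the order of the filtered column; if every extent spans the full range of values, nothing can be skipped.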
Multiple software “engines” sit atop this inside the database for matrix manipulation, Hadoop/MapReduce, and R. Netezza provides wrappers for Java, Python, Fortran, C, and C++, and wizards for creating UDFs that build the surrounding code automatically. It recently upped the analytic ante by adding i-Class, a library of functions that scale to use available memory, will be maximally parallelized, and are callable from any language Netezza supports. Taking it a step further, the vendor offers wrappers around a set of functions from the GNU Scientific Library – 2000 of them – also callable from a SQL UDX (Netezza’s UDF). Another set of functions in the R community’s CRAN repository (which contains 1900 packages, and 4000-5000 functions) is available, but has not been explicitly parallelized by Netezza; these functions may or may not have been written assuming a parallelized MPP platform. Still, they add flexibility and shorten time to delivery for developers.
Will IBM port all these functions to DB2? Essentially they are simply algorithms implemented close to the data in various UDF-like forms, though “simple” does a disservice to the quality and power of the work involved. Or will the existing Smart Analytics Systems continue to be marketed primarily on the basis of their convenient pre-integrated setup and library of industry models? There are many decisions to be made.
Netezza has strong partnerships with MicroStrategy, Business Objects, SAS, and Tibco. Several plays were in progress that will require rationalization with IBM’s offerings, among them an agreement with Composite for the Netezza Data Virtualizer to provide federation across multiple Netezza appliances, and a partnership with Cloudera to enable data movement and transformation between TwinFin and Cloudera’s Distribution for Hadoop (CDH). These relationships are not unfamiliar territory for IBM, though Curt Monash, as always, offers some perceptive speculations in his post, including the likelihood that SPSS will displace Netezza’s SAS partnership. While I agree, SAS’s customer base dwarfs SPSS’s, and I don’t imagine IBM’s BAO services team will leave all that opportunity on the table.
Branding is also a question worth considering, and was raised on the announcement call with analysts. IBM is likely to keep “Netezza”, whose brand has established value, as it did Informix and Cognos, as a secondary brand underneath the IBM label, at the lower end. But it will have much work to do, and of course none of it can start till the deal is done. Others may jump in but I doubt that IBM will be outbid unless someone goes off the deep end in valuation.
And the consolidation continues: Kickfire to Teradata, Greenplum to EMC, and now this deal are only the beginning. Aster, Kognitio, ParAccel and Vertica, among others, have had success with dozens to hundreds of customers. And the big jewel, Teradata, is hardly out of reach for the likes of HP. But that would require a vision we have not seen thus far from that quarter. What is clear is that the market, and the leading vendors, have accepted the key vector of change in the 21st-century database: the emergence of the specialty analytic platform. It’s hardly mainstream yet, notwithstanding the literally thousands of installations. There are many times that number of shops that still expect to go to the same old one-size-fits-all model, based on survey work Colin White and I have recently completed. They have yet to be convinced. If IBM can get its messaging in gear around Netezza, it can be a foil to Oracle’s Exadata as the tide rises. So far, Larry’s yacht seems to have caught the wind much more effectively.
Disclosures: IBM is a client of IT Market Strategy. Netezza is not, but sponsored a multi-vendor study we are conducting.