Vertica Projects Leadership, Embraces MapReduce (Sorta)
August 11, 2009
With the August announcement of Vertica Analytic Database 3.5, Vertica is laying claim to leadership of the new ADBMS vendors. With its most recent numbers – several dozen customers are now in production and the company expects to pass 100 this year – the assertion bears thinking about. Driving forward with an aggressive release strategy, Vertica is showing its maturity and increasing ability to challenge old school leaders like Teradata and Netezza – but with a software-only strategy. This agility allowed it to offer early support for release 3.5 in quick succession after its last release, with GA scheduled for later this year.
It was only a few months ago that Vertica announced version 3.0, with a large set of significant advances including SQL-99 functions, faster data load speeds, improved security (SSL client encryption and LDAP/Kerberos/Active Directory integration), and a variety of performance boosts. Keeping the pace up is a powerful way to trumpet leadership, and key announcements made at the TDWI event in San Diego included:
- Flexstore – a refinement to the column store model that adds column group storage, which can improve performance for frequently used pairs like bid/ask or columns with low numbers of unique values that can fit into Vertica’s storage block size. Vertica also now claims to optimize the placement of “hot” (most used) data into the fastest physical locations (best I/O), like Netezza and Teradata. As this matures, Vertica expects to get into a rich ILM model with a hierarchy of storage speed from SSD to the slowest disks.
- MapReduce support, but with a difference. Unlike Greenplum and Aster, who are bringing it into the database itself, Vertica is providing a streaming connection to Hadoop instances (the open source implementation of MapReduce; Vertica is contributing the adapter to the community). This architecture mirrors usage patterns we’ve seen, and which Vertica asserts its customers have said they want. One scenario: use your ADBMS to retrieve stored data, pass it to Hadoop for analysis by staff with different skill sets from the typical ADBMS users, and then bring result sets back. Separate hardware for the Hadoop sandbox is fairly typical among early adopters today, and via a Cloudera partnership, Vertica can offer a deployment architecture that doesn’t break the bank. Curt Monash does the usual excellent summary of Hadoop issues in his blog.
- IPVS (IP Virtual Server)-based load balancing. For those of us not attuned to Linux kernel stuff: IPVS implements transport-layer load balancing inside the Linux kernel, directing requests for TCP/UDP-based services to the real servers, and makes the services of the real servers appear as a virtual service on a single IP address. Vertica uses round robin switching here; it must be turned on, but then it’s invisible.
- Perl and Python support over ODBC. If you know what that means, you’ll obviously be happy about it; if not, your programmers will be.
- New verticals (no pun intended). Vertica is starting to get some traction in retail, which is new (and fertile) ground. As the company hits a $15M run rate or better, continuing growth will require getting into new markets. More broadly, marketing execution is visibly excellent – 40 referenceable customers are called out on Vertica’s website already, an impressive total for a company claiming not much more than twice that many. Frequent, content-rich briefings for industry analysts. Presence at shows like TDWI – although it was not the best attended show, Vertica’s Dave Menninger told me he was pleased with the leads they did get – and some competitors, notably Greenplum, were not present, leaving a less crowded field as Netezza, ParAccel and Sybase made bids of their own for attention, including announcements. A little applied revenue can make a big difference. [corrected error – Aster was a sponsor and had a booth.]
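For readers wondering what the round robin switching in the IPVS item above amounts to, here is a minimal sketch in Python. It is only an illustration of the scheduling idea, not Vertica's or IPVS's actual code: IPVS does this inside the Linux kernel at the transport layer, while this toy runs in user space. The class name `RoundRobinBalancer` and the node addresses are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy round-robin scheduler: hands out backends in a fixed rotation."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        # cycle() repeats the list forever, in the same order each pass.
        self._rotation = cycle(self._backends)

    def next_backend(self):
        """Return the backend that should receive the next connection."""
        return next(self._rotation)


# Hypothetical cluster nodes; the addresses are made up for illustration.
nodes = ["10.0.0.1:5433", "10.0.0.2:5433", "10.0.0.3:5433"]
balancer = RoundRobinBalancer(nodes)

# Seven incoming connections are spread evenly, wrapping around the node list.
assignments = [balancer.next_backend() for _ in range(7)]
for number, node in enumerate(assignments, start=1):
    print(f"connection {number} -> {node}")
```

Clients see one address and never learn which node served them, which is the "invisible" part. Round robin ignores how busy each node is, so it works best when connections carry roughly similar workloads.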
Vertica continues to talk about pricing in terms of cost per terabyte. The market is still sorting various models out, but this is a useful one, because it allows one price: development, test and production all come in the same number. Costs are more predictable because change occurs in synch with data growth, not by jumps when newer hardware is needed – there are no new license fees for new nodes. And if you’ve ever suddenly found yourself in a different price category in processor-based pricing with other products, you’ll appreciate this approach. In these times of explosive data growth, it certainly won’t hurt Vertica’s revenue stream; at the same time, it has an intuitive feel that makes sense to buyers.
I’ve said before that the next 18 months or so will be an exciting battle among an array of new ADBMS players. Vertica is claiming pole position, and they have excellent prospects in the battle ahead. Stay tuned.