You can set performance records in a virtualized environment – that’s the message of the new 1 TB TPC-H benchmark record (scroll down to see the 1 TB results) just released by ParAccel and VMware. Running on VMware’s vSphere 4, the ParAccel Analytic Database (PADB) delivered a one-two punch: not only the top performance number for a 1 terabyte (TB) benchmark, but the top price/performance number as well. The results in a nutshell: 1,316,882 Composite Queries per Hour (QphH), price/performance of 70 cents per QphH, and a data load rate of over 3.5 TB per hour. ParAccel moved quickly to promote the result; oddly, VMware seems to have been asleep at the switch, with no promotion on its site as the release hit the wires and only a bland quote from a partner exec in the release itself.
PADB version 2.5 ran as an 80-node cluster on 40 physical servers (40% fewer servers than the prior record holder used), all running VMware. Barry Zane, CTO of ParAccel, told me: “We set up each server with 2 VMs, and configured each one so it used half the memory of the physical server, 4 of its 8 cores, and half of the disk I/O.” The resulting configuration behaved better than running on physical Linux, he says.
“We measured it a couple of times; the reason appears to be that VMware does a better job of mapping memory to CPU than Linux does. VMware is more NUMA-aware than Linux is; the virtual CPUs bound themselves to the physical ones and to the memory connected to them.”
For non-hardware-oriented readers: NUMA stands for non-uniform memory access, and the Wikipedia article at the link does a good job of explaining it.
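As a rough illustration of the resource split Zane describes (hypothetical numbers and a Linux-process analogy, not ParAccel’s or VMware’s actual configuration), the idea is simply to slice each physical server evenly and keep each slice’s CPUs close to its memory:

```python
import os

# Hypothetical split of one 8-core server into two VM-like halves, mirroring
# the configuration Zane describes: half the memory, 4 of 8 cores, half the
# disk I/O per VM. Memory size is assumed; the release doesn't state it.
TOTAL_CORES = 8
TOTAL_MEMORY_GB = 64
VMS_PER_SERVER = 2

def vm_share(total, vms=VMS_PER_SERVER):
    """Each VM gets an equal slice of a physical resource."""
    return total // vms

cores_per_vm = vm_share(TOTAL_CORES)        # 4 cores per VM
memory_per_vm = vm_share(TOTAL_MEMORY_GB)   # 32 GB per VM

# A NUMA-aware hypervisor keeps each VM's virtual CPUs bound to one socket so
# memory accesses stay local. The closest Linux-process analogy is pinning a
# process to specific CPUs (CPU 0 always exists, so this call is safe):
os.sched_setaffinity(0, {0})
print(cores_per_vm, memory_per_vm, os.sched_getaffinity(0))
```

The pinning call is only an analogy for what the hypervisor does automatically; binding memory as well would take a tool like `numactl` rather than the standard library.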
The value of being able to achieve such performance in a virtualized environment can hardly be overstated. The history of the database benchmark game is built on processor and storage infrastructures tuned for and dedicated to running the tests, with no contention to get in the way. As workloads become ever more mixed, the dedicated nature of the test environment has been one reason to question benchmark credibility. VMware and ParAccel have raised the bar. While these TPC benchmarks didn’t run under mixed workload conditions (they never do), this result implies predictable performance given known available system resources – a very powerful tool in the hands of operators who must plan to meet the variable needs of their users.
For example, if there were 4 database instances running in the cloud and you told VMware to give half your computing resources to one and divide the rest equally among the others, you’d want to be able to see the impact on them in a predictable, linear manner. “That will be our next step to prove,” says Zane. “A lot of virtualization today is done in a Wild West manner. You want to orchestrate the database, the hardware and virtualization to play together. You don’t want one piece sharing a core and none of the others doing so.” Having VMware own this responsibility means that in a shop with multiple database and other servers, one set of skills and one product can become the focus of that effort. This can be a big factor in private and public multi-tenant scenarios. It adds a new dimension to the question of workload management – a database vendor-independent one.
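Zane’s example reduces to simple arithmetic; a minimal sketch of such an allocation policy (a hypothetical helper, not a VMware API) looks like this:

```python
def allocate_shares(n_instances, favored_share, favored=0):
    """Give one instance a fixed fraction of total resources and split the
    remainder equally among the others. Returns fractional shares per instance."""
    if n_instances < 2 or not 0 < favored_share < 1:
        raise ValueError("need at least 2 instances and 0 < favored_share < 1")
    rest = (1 - favored_share) / (n_instances - 1)
    return [favored_share if i == favored else rest for i in range(n_instances)]

# Four database instances: half the resources to the first,
# the remaining half split equally among the other three.
shares = allocate_shares(4, 0.5)
print(shares)  # [0.5, ~0.167, ~0.167, ~0.167]
```

The “predictable, linear” behavior Zane wants to prove is exactly that each instance’s throughput tracks its fractional share.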
Ten 1 TB TPC-H benchmarks (none clustered) have been published since last July, using Microsoft and Sybase databases. Microsoft’s benchmarks ran on HP, Unisys, and Sun equipment. Other database vendors have avoided the fray, but I expect 2010 will see a flurry of new tests. The ADBMS upstarts, for example, continue to compete on their ability to demonstrate both performance and price/performance on this admittedly artificial playing field. Microsoft has a new SQL Server release; IBM just announced new Smart Analytics Systems designed for data warehousing, and DB2 has not been seen in the TPC-H game. Oracle has regularly delivered benchmarks, and as it begins to ramp up the Exadata message and its hardware marketing strategy, expect more from Redwood Shores. Should we expect to see Netezza, Teradata appliances, and others? They need to show up if the big players make enough noise about their appliance plays. Certainly Kickfire has made hay, especially at the low end, with its outstanding price/performance numbers.
Hardware vendors invest significantly in these tests by providing platforms and engineering support. Years ago, the competition was as much about platform power as software capabilities. Going forward, Oracle is somewhat less likely to partner with its database competitors than, say, IBM, whose hardware was used for one of the Sybase benchmarks. The other Sybase benchmark was on Sun equipment, but that’s not likely to happen again. The OS used is also an issue – the last ten benchmarks are dominated by Windows, but Solaris, AIX and Linux are in there too. The ability of an OS to support high performance is worth reaffirming at times, and VMware now joins that conversation with a bang. While not exactly an OS or hardware vendor, VMware is very interested in showing that it can be a player in achieving great performance, and this is a way to do so. In a solid post here, Jos van Dongen looks at the resources still used to execute what must be considered a relatively small benchmark. He makes an effective point about the seeming overkill of throwing this much hardware at the problem, but of course the price/performance numbers, reflecting continually declining costs, speak volumes about what that means to customers, and the ability to reuse the resources adds a new dimension to the question.
From a historical timing perspective, the top ten 1 TB TPC-H benchmarks by performance go all the way back to 2007 (number 10 on the list dated from 2005 until this result appeared). And while Oracle is not on the “most recent” list, it appears in no fewer than 4 of these top results, mostly in 2009 around the 11g releases. Having made its mark, Oracle turned its attention elsewhere. ParAccel, Microsoft, Sybase and the rarely seen Exasol round out the top 10 list. Clustered results from Oracle, Exasol and ParAccel top the performance list, and in general clustered results tend to deliver much better price/performance than unclustered ones, having dramatically changed that game in the past few years.
Load time, while not as obvious as the headline performance and price/performance numbers, is also important for data center staff. One of the advantages the emerging ADBMS vendors have brought to the table is the elimination of manual partitioning (and its concomitant maintenance requirement) and elaborate indexing schemes. ParAccel and others refer to this as “load-and-go,” and customers verify it – it’s one of the features that tend to help ADBMSs win POCs over the incumbent general-purpose DBMS products. As van Dongen points out, there was simply more hardware to load onto; still, as the cost of that hardware continues to decline and its performance improves, it’s unsurprising that vendors would exploit that fact. The good news for users (with money) is that this is a problem you can “throw hardware at” to improve time-to-usage.
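The published load rate translates directly into time-to-usage; at over 3.5 TB per hour, the 1 TB test database loads in well under 20 minutes:

```python
def load_minutes(data_tb, rate_tb_per_hour):
    """Minutes to load a data set at a sustained load rate."""
    return data_tb / rate_tb_per_hour * 60

# 1 TB at the reported 3.5 TB/hour load rate
print(round(load_minutes(1, 3.5), 1))  # about 17.1 minutes
```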
Price/performance is key for buyers, and the objective is that tests show how much you would actually spend; TPC rules dictate that hardware and software pricing must reflect real costs. If you have a large data center, only part of which is used for the actual test, do you calculate price/performance on the basis of the part of the system that was actually used, or on the cost of the whole system? This has been a topic for the council in the past. Similarly, the cost of the whole rack counts – even if you’ve only filled half of it with blades. Now, as virtualized systems compete, one wonders how price/performance thinking will evolve; it certainly will need to.
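The metric itself is simply total priced configuration divided by composite throughput, so the two published numbers let you estimate the priced system cost (an illustration working backward from rounded figures, not the official pricing sheet):

```python
def price_performance(total_price_usd, qphh):
    """TPC-H price/performance: total priced system cost divided by QphH."""
    return total_price_usd / qphh

qphh = 1_316_882      # published composite queries per hour
usd_per_qphh = 0.70   # published price/performance (70 cents per QphH)

# Implied total priced configuration, roughly:
implied_price = usd_per_qphh * qphh
print(f"${implied_price:,.0f}")  # roughly $921,817
```

Because the published 70 cents is itself rounded, the implied total is approximate; the point is that the virtualization question above is really a question about which costs go in the numerator.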
This is not the only outstanding issue for the TPC to consider; another is the dearth of TPC-H benchmarks at the higher – and now much more typical – data volume levels. Companies have not been allowed to publish at high scale factors for several years. This TPC meeting report from 2004 mentions issues with the reliability of the TPC-H data generation software, and the problem has still not been resolved. One benchmark was published at the 30 TB level in 2007, and it remains alone at that level; until the problem is solved, it is likely to stay that way.
ParAccel and VMware have taken a substantial leap. Though this post is mostly about the role virtualization played, it’s key to note that PADB continues to demonstrate its ability to improve and deliver high performance without extensive requirements for careful index and partitioning design and maintenance. In this post I didn’t get into the value of its compression, or its avoidance of the results caching apparent in some benchmark runs by other vendors; those are topics worth exploring if you engage with ParAccel, which certainly belongs on your short list. I hope that VMware sees the value of this in its own messaging and exploits it; it has a new weapon in the messaging war, but it won’t last forever.
Disclosures: ParAccel and VMware are not clients of IT Market Strategy.