ParAccel Rocks the TPC-H – Will See Added Momentum

ParAccel, another of the analytic database upstarts, has weighed in on Sun hardware with a record-shattering benchmark that its competitors have thus far avoided – the 30 TB TPC-H. It’s been two years since anyone published a 30 TB TPC-H, and only 10 of any size (all smaller) have been published in the past year. One can scoff (many do) at this venerable institution, but TPC benchmarks are a rite of passage, and a badge of engineering prowess. The ParAccel Analytic Database (PADB) has set new records, raising its profile dramatically in one fell swoop. PADB came in at 16x the price/performance of Oracle, the prior leader (and the only other vendor willing to tackle the 30 TB benchmark to date). PADB, running on Sun Opteron 2356 servers, Sun Fire™ X4540 storage servers and OpenSolaris™, was 7x faster on queries and 4.6x faster loading the data than the two-year-old Oracle result. And because of its architecture, the construction and tuning of indexes and partitioning strategies were not needed. TPC rules are specific about having the product generally available (GA) within 90 days, so one can expect to see PADB version 2.0, on which the benchmark was based, out in Q3.

ParAccel has seen some skepticism in the analyst community because of its relatively small published number of customers. It claims a dozen, and half are listed on its web site. Other vendors, like Vertica and Greenplum, have been very forthcoming in promoting theirs, but both have had more time in the market. PADB was released in Q4 2007 and really began its arc in 2008; Vertica has a year’s head start, and Greenplum even more. Rumors have also floated about whether CTO and founder Barry Zane was leaving. I had a conversation with Barry in late June to discuss the business and the benchmarks. He was clearly excited about the benchmarks, in which he was very involved, even working on the full disclosure report personally – “It got to be like a hobby for me,” he said – and he was quite clear that he is not going anywhere.

There is suddenly a sizable number of new entrants in the analytic database space, and (disclosure) some are clients. (Not ParAccel at this writing.) I’ve posted about Greenplum, Vertica and Aster. Like several of the others, ParAccel has some roots in Postgres and is a massively parallel, shared-nothing architecture; like Vertica, it uses column-based storage. Like the whole group of new players, it is routinely winning proof of concept engagements (POCs) against traditional DBMS players.

ParAccel’s products come in two forms: a software-only offering for commodity hardware, and an appliance built on proprietary storage components. ParAccel describes PADB as “standard-server-based”; the software may be purchased and run on a variety of commodity platforms. The Scalable Analytic Appliance (SAA) offering uses “enterprise-class midrange SAN components from EMC.” SAA uses a gigabit Ethernet interconnect and 4 processors at each node with dedicated storage. In either case, a leader node (the Postgres-derived code is found here) coordinates the activities of the compute nodes. A hot standby node is always part of the installation, and can step in for any failing node, including the leader node.

The EMC partnership allows ParAccel to rely on a Fibre Channel-connected SAN (in a modular, midrange form factor designed to scale along with servers) for its enterprise-class features. Availability, backup and rapid recovery, and data replication thus become easier for ParAccel to deliver – as long as EMC’s CLARiiON CX4 is in the picture. PADB is able to concurrently scan the server-based direct attached storage (DAS) and the SAN. This “blended scan,” ParAccel argues, gives it the best of both worlds.

ParAccel’s technical papers do a nice job describing how “Continuous Sequential Scan Rate” (CSSR), measured in megabytes of I/O per second, describes throughput from platter to server and helps demonstrate the power of the new architectures. PADB boasts a patent-pending query optimizer, notable for its ability to handle correlated subqueries (CSQs), which feature in several of the TPC-H benchmark queries and are often a performance stumbling block. Without belaboring the techie talk here, removing unneeded columns from CSQs can have a large impact, just as it does for table scans. Retrieving only the relevant columns also improves effective CSSR, because fewer bytes have to come off disk to answer the same query. Columnar storage aids data compression as well, and in combination with not having to devote space to indexes, the growth rate of installed storage relative to raw data is improved considerably.
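To make the column-pruning point concrete, here is a minimal sketch of how many bytes a scan has to read when every column is stored together versus when only the referenced columns are read. The table layout, column names, row count and scan rate are all made up for illustration; none of them come from the TPC-H schema or from ParAccel’s papers.

```python
# Hypothetical table layout: column name -> bytes per value.
# These widths are illustrative assumptions, not real TPC-H or PADB figures.
COLUMN_WIDTHS = {
    "order_id": 8,
    "customer_id": 8,
    "order_date": 4,
    "ship_mode": 10,
    "comment": 70,
}


def row_store_bytes(num_rows, widths):
    """A row-oriented scan reads every column of every row it touches."""
    return num_rows * sum(widths.values())


def column_store_bytes(num_rows, widths, needed):
    """A column-oriented scan reads only the columns the query references."""
    return num_rows * sum(widths[name] for name in needed)


def scan_seconds(total_bytes, scan_rate_mb_per_sec):
    """Time to stream total_bytes at a sustained scan rate given in MB/s."""
    return total_bytes / (scan_rate_mb_per_sec * 1_000_000)


rows = 1_000_000_000                      # assumed row count
needed = ["customer_id", "order_date"]    # columns a hypothetical query uses

row_bytes = row_store_bytes(rows, COLUMN_WIDTHS)
col_bytes = column_store_bytes(rows, COLUMN_WIDTHS, needed)

print(f"row store reads    {row_bytes / 1e9:.0f} GB")
print(f"column store reads {col_bytes / 1e9:.0f} GB")
print(f"reduction factor   {row_bytes / col_bytes:.1f}x")
```

The same disk hardware delivers the same raw megabytes per second in both cases; the gain comes entirely from reading fewer bytes, and compression (not modeled here) would shrink the column-store figure further.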

ParAccel claims wins over competitors such as Sybase IQ, Netezza, and Vertica as well as Oracle, and touts real-world performance numbers far better than the benchmark. Of the new architecture competitors it faces, only Sybase IQ has stepped up to the TPC-H bar to date, and no doubt there will be many win and loss claims from all the vendors over the year ahead. But PADB has vaulted into contention with this announcement, and will no doubt be on more short lists – as it should. ParAccel will also begin to see more attention from its partners, including hardware players beyond EMC: Dell, Fujitsu-Siemens, Intel, AMD and others. Sun, who made a substantial contribution to the benchmark by making much of the hardware available, may be somewhat less aggressive in the wake of its acquisition by Oracle. But ParAccel says Sun is not its most installed platform; customers are running on HP, Dell and IBM hardware already. Software partners are also likely to be friendlier as the temperature rises.

ParAccel is fortunate to have completed this work with Sun, whose acquisition by Oracle will no doubt create some reallocation of resources and priorities. This is a coup for ParAccel, whose timing turns out to be impeccable. As always, prospects should insist on a POC. And as Mark Madsen of Third Nature (his blog is here) says – “always hold back some queries;” you want to see how any database performs without heroic tuning, unless you plan to keep an army of specialists around.

Published by Merv Adrian

Independent information technology market analyst and consultant, 40 years of industry experience, covering software in and around the data management space.

27 thoughts on “ParAccel Rocks the TPC-H – Will See Added Momentum”

  1. Merv,

    I’d be highly skeptical about a lot of that.

    Managing 30 TB quickly with 961 TB of disk and 2 1/2 TB of RAM isn’t much of an accomplishment, especially if your compression is good enough to fit a whole lot of data into RAM. ParAccel’s prior TPC-H benchmarks ran “entirely” in RAM, and until there’s proof otherwise, I’d conjecture the same about this one.

    Based on the lack of interest in Kickfire’s TPC-H result, ParAccel’s prior TPC-H results, and so on, I think customers are for the most part wise enough to recognize the irrelevance of TPC-Hs. I wouldn’t assume these will be a significant market momentum booster.

    Last I heard, ParAccel bitterly resisted on-site POCs, suggesting “heroic tuning” is at the center of all its good results. I’d categorically advise against including ParAccel on a short list unless the company confirms it is willing to do a POC at the prospect’s location, or otherwise under circumstances that preclude “heroic tuning” by Barry Zane personally.

    Based on past experience, I’d be very skeptical of ParAccel’s competitive claims, even more than I would be of most other vendors’.

    1. Fair enough – I respect the data you bring to the table, and you have talked to some customers or prospects I likely have not met. I agree with your guidance – and it’s not that different from what I suggested in the last sentence or two of the post. POCs are de rigueur as far as I’m concerned, and the withholding of some queries is specifically a measure to see the impact of that.
      I’m not as sure that the absence of TPC-Hs is due to the skepticism of buyers. Vendors have to stand up a lot of hardware (which you note) and spend engineering time to do them, so they are costly from several perspectives. And to do so if you don’t expect to do well is arrant foolishness. Some of the queries in the benchmark, as I pointed out, are problematic for some of the competing products – who are equally capable of using disk and memory in the same way ParAccel did. It will be interesting to see if they choose to do so!

  2. Hi Merv,

    Thanks for covering our TPC-H results. We were very pleased with how they came out.

    Thanks also for noting that we have a strong percentage of our customers referenced on our website. While it can be tempting for a start-up to inflate its customer count, we maintain a rigid discipline about how we represent our accomplishments – with our customer list, we include only paying, reference-able, commercial customers. While this may lead to the appearance that we are at a much earlier stage than we would like, for us, this is simply an integrity issue.

    You are right to remind prospective buyers that they must engage in their own POCs and, importantly, to reserve ad hoc queries (the tougher the better) for their own testing. It is with these tough, unexpected queries that we shine and have the opportunity to demonstrate the ease with which we provide breakthrough performance.

    Accordingly, all but one of our customer wins were achieved in competitive bakeoffs that included on-site POCs, and our overall competitive track record remains a source of great pride. Our performance in customer bake-offs has never been beaten, though we were tied once.

    We performed the TPC-H benchmark in a true ‘load and go’ manner, without indices or tuning, as shown in the full disclosure report. One can also see from the full disclosure report that this was not an in-memory run though we do use memory extremely efficiently, which comes through in the numbers.

    Finally, while we don’t mean to over-represent the significance of TPC-H, it is an industry-standard, multi-vendor sponsored, independently-audited (with certified auditors) DSS workload that is widely known and well documented. It is the most credible general benchmark to-date, and, at the very least, measures both loading and querying of the same tables. This is more than can be said about some of the metrics touted by others.

    1. Thanks for the comments, Kim. I can only add that anyone who doubts that you can handle their workloads should invite you to prove it. And may the best solution win.

  3. @Kim: “It is the most credible general benchmark to-date”

    Am curious which other benchmarks don’t make the cut in your opinion. Besides the OSS ones, what else is there out there?


  4. The following was posted on Curt Monash’s site earlier today. I’m reposting it here because it is a detailed reply to an opinion expressed by Monash in Response 1 above.


    Perusing your website, I detect a certain hostility towards ParAccel.

    I have no quarrels with that; after all you may have personal reasons which I’m not aware of.

    However when you make outrageous statements like

    “Managing 30 TB quickly with 961 TB of disk and 2 1/2 TB of RAM isn’t much of an accomplishment, especially if your compression is good enough to fit a whole lot of data into RAM”

    you display a profound ignorance about TPC-H in general and ParAccel in particular, that just makes you look foolish.

    Indeed you do more to harm your own credibility than raise doubts about ParAccel.

    Again from perusing your website, I see you’re not a great fan of TPC-H. But, for all its flaws, TPC-H is the only industry standard, objective, benchmark that attempts to measure the performance, and price-performance, of combined hardware and software solutions for data warehousing.

    You may dismiss it, but more than a dozen hardware and software vendors take it very seriously. And contrary to your assertion, many customers do factor TPC-H results into short list decisions.

    It’s true that the rate of submissions over the past 2 years has slowed down from that of previous years, but I think that is a consequence of the much higher level of performance that the newer results are achieving. This is primarily due to the innovations that systems like ParAccel are introducing.

    As for your speculations about the difficulty of ParAccel’s accomplishment, the following are some details that would be useful for you to understand, so you don’t say silly things in the future.

    By way of full disclosure, I was deeply involved in all aspects of producing ParAccel’s result. So obviously I’m somewhat biased. However, I should also point out that over the past 7 years, I have personally run hundreds of TPC-H tests on numerous servers, using several different DBMS products. More than a dozen of the currently listed TPC-H results are due to either my sole effort, or to joint efforts with colleagues. Hence it’s safe to assume that I know something about TPC-H in general and about ParAccel in particular.

    First let’s look at your fantasies about compression. You want your readers to believe that aggressive compression enabled ParAccel to store the entire 30TB (raw ASCII data) in 2.5 TB of main memory. That simply did not, and could not, have happened!

    In order to accomplish what you’re suggesting, a compression factor of more than 100 to 1 would be required.

    To see why, consider the following.

    Of the 64 GB of memory per node (there were 43 nodes, for a total of about 2.7 TB), about 8 GB was used for the OS and the ParAccel text and (non-shared) data segments. About 50 GB of shared memory was used for (1) storing query processing intermediate results, e.g., the workspaces required for sorting and processing aggregations and the hash tables used for hash joins, and (2) the memory required for versioning (ParAccel uses a multi-versioning concurrency control protocol). So at best, you could argue that there could be 6 GB of memory per node (or about .3 TB in total) that could be made available for caching data. The other 58 GB per node, or about 2.4 TB in total, was used for other purposes.

    Now if you want to compress 30TB down to .3 TB you’re going to need a compression factor of 100 to 1. But such a compression factor, for TPC-H data, is more than two orders of magnitude beyond anything possible.

    Yes, you can cook up highly artificial data, with sufficient redundancy, so as to achieve a 100 to 1 compression factor. But TPCH data, which is also artificial, does not possess anywhere near the amount of redundancy required for a 100 to 1 compression factor. Indeed the best compression I’ve ever seen for TPC-H data is on the order of 5 to 1. This was from a database vendor (which shall remain nameless) who, to the best of my knowledge, has done more with compression than any other vendor. Interestingly, 5 to 1 is very close to the theoretical limit for TPCH data, based on my own Information-Theoretic computations using Shannon’s entropy. So to cache the entire 30TB with the most aggressively compressed product would require 6TB on top of the already accounted for 2.4 TB.

    Since ParAccel’s TPCH compression is more on the order of 3 to 1, ParAccel would require more than 12 TB for a fully cached database — a luxury the benchmark configuration did not possess.

    Moreover, assuming a best-case compression factor of 5 to 1, no vendor could possibly fully cache a 30TB database with less than 8.4TB. So let’s dispense with the idea that ParAccel achieved its magic by caching the entire database, or even a significant portion of it, in memory. No, ParAccel used virtually all of the 2.7 TB of available memory for necessary overheads and not to cache TPCH data.
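The memory arithmetic in the paragraphs above can be checked with a short sketch. It uses only the figures quoted in this comment (43 nodes, 64 GB per node, 8 GB and 50 GB of overhead, 30 TB of raw data, compression ratios of 5 to 1 and 3 to 1) and assumes decimal units (1 TB = 1000 GB); the function and variable names are made up for illustration. Because it does not round intermediate values, its results land slightly off the rounded numbers in the text.

```python
# Figures quoted in the comment above; nothing here is independently measured.
NODES = 43
RAM_PER_NODE_GB = 64
OS_AND_PROCESS_GB = 8   # OS plus ParAccel text and non-shared data segments
WORKSPACE_GB = 50       # sort/aggregation workspaces, hash tables, versioning
RAW_DATA_TB = 30


def gb_to_tb(gb):
    """Convert using decimal units (assumption: 1 TB = 1000 GB)."""
    return gb / 1000


total_ram_tb = gb_to_tb(NODES * RAM_PER_NODE_GB)
cache_per_node_gb = RAM_PER_NODE_GB - OS_AND_PROCESS_GB - WORKSPACE_GB
cache_tb = gb_to_tb(NODES * cache_per_node_gb)
overhead_tb = total_ram_tb - cache_tb


def required_compression(raw_tb, available_cache_tb):
    """Compression ratio needed to fit the raw data into the cache."""
    return raw_tb / available_cache_tb


def ram_needed_to_cache(raw_tb, compression_ratio, fixed_overhead_tb):
    """Total RAM needed to hold the compressed data plus fixed overheads."""
    return raw_tb / compression_ratio + fixed_overhead_tb


print(f"total RAM:          {total_ram_tb:.2f} TB")
print(f"cache available:    {cache_tb:.2f} TB")
print(f"ratio needed:       {required_compression(RAW_DATA_TB, cache_tb):.0f} to 1")
print(f"RAM needed at 5:1:  {ram_needed_to_cache(RAW_DATA_TB, 5, overhead_tb):.1f} TB")
print(f"RAM needed at 3:1:  {ram_needed_to_cache(RAW_DATA_TB, 3, overhead_tb):.1f} TB")
```

With the unrounded cache figure the required ratio comes out somewhat above the 100 to 1 quoted in the text, which only strengthens the conclusion that the database could not have been held in memory.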

    This brings us to your next conjecture:

    Why did ParAccel require 961 TB of disk space?

    The answer is they didn’t.

    They really only needed about 20TB (10TB compressed times 2 for mirroring).

    But the servers they used, which allowed for the superb price-performance they achieved, came standard with 961 TB of storage; there simply was no way to configure less storage.

    These Sun servers, SunFire X4540s, are like no other servers on the market. They combine (1) reasonable processing power (two Quad Core 2.3 GHz AMD Opteron processors), (2) large memory (64 GB), (3) very high storage capacity (24 or 48 TB based on 48 x 500GB or 1TB SATA disks) and (4) exceptional I/O bandwidth (2 – 3 GB/sec depending upon whether you’re working near the outer or inner cylinders), all in a small (4 RU), low cost (~$40K), power efficient (less than 1000 watts) package.

    What any DBMS needs to run a disk-based TPC-H benchmark is high I/O throughput. The only economical way to achieve high disk throughput today is with a large number of spindles. But the spindles that ship with today’s storage are much larger than what they were, say, 5 years ago. So any benchmark, or application, requiring high disk throughput is going to waste a lot of capacity. This will change over the next few years as solid-state disks become larger, cheaper and more widely used. But for now, wasted storage is the reality, and should not be viewed in a negative light.
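The capacity-versus-throughput trade-off described above can be put into numbers with a small sketch. It uses only figures quoted in this comment (961 TB installed, about 10 TB compressed and mirrored twice, 48 disks and 2 to 3 GB/sec per server). The per-disk rate is a derived average that assumes throughput is spread evenly across all 48 disks, which is a simplification; the names are hypothetical.

```python
# Figures quoted in the comment above.
TOTAL_DISK_TB = 961
COMPRESSED_DATA_TB = 10
MIRROR_COPIES = 2
DISKS_PER_SERVER = 48
SERVER_GB_PER_SEC = (2, 3)   # inner vs. outer cylinders, per the comment


def used_capacity_tb(compressed_tb, copies):
    """Space actually occupied: compressed data times the number of mirrors."""
    return compressed_tb * copies


def utilization(used_tb, total_tb):
    """Fraction of installed capacity that holds data."""
    return used_tb / total_tb


def per_disk_mb_per_sec(server_gb_per_sec, disks):
    """Average per-disk rate, assuming load is spread evenly (simplification)."""
    return server_gb_per_sec * 1000 / disks


used_tb = used_capacity_tb(COMPRESSED_DATA_TB, MIRROR_COPIES)
used_fraction = utilization(used_tb, TOTAL_DISK_TB)

print(f"capacity in use: {used_tb} TB of {TOTAL_DISK_TB} TB ({used_fraction:.1%})")
for rate in SERVER_GB_PER_SEC:
    avg = per_disk_mb_per_sec(rate, DISKS_PER_SERVER)
    print(f"{rate} GB/sec per server is about {avg:.1f} MB/sec per disk")
```

Each disk contributes only a modest share of the bandwidth, so reaching multi-gigabyte scan rates takes dozens of spindles per server regardless of how little of their capacity the data fills.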

    Finally, I need to point out that your allegations of “heroic tuning” by Barry Zane are complete fabrications — at least for TPC-H. There were of course a few ParAccel configuration parameters that needed to be set properly for the given hardware configuration, but outside of that, there was virtually no tuning at the ParAccel level.

    However, since this was the first time a major ParAccel benchmark was attempted on hardware running Solaris (actually OpenSolaris), there were a number of Solaris-based optimizations that were inserted into the code, as well as some Solaris tunings that required experimentation. These additions to the product significantly improved the TPC-H performance. But these were one-time improvements that are now built into the Solaris-ParAccel product, so that anyone who runs ParAccel in the Solaris environment will automatically take advantage of them.

    So no, ParAccel’s outstanding result was not due to running in memory, using 961 TB of storage capacity or Barry Zane heroics.

    The real reasons are quite simple:

    (1) ParAccel has implemented algorithms, within the context of a highly scalable architecture, that make it the highest performing data warehousing product on the market;
    (2) the low cost, high I/O throughput SunFire X4540 is the perfect companion to ParAccel’s software;
    (3) Sun’s collaboration with ParAccel, introducing Solaris-based optimizations into the product, made a great product even better.

    So Curt, pray tell, if ParAccel’s 30 TB result wasn’t “much of an accomplishment”, how is it that no other vendor has published anything even remotely close?

    1. Thanks for the longest comment I’ve ever had on my blog, and I hope you cut down on the caffeine. But seriously, thank you – really – for some deep detail about the benchmark from the hardware side.
      Tuning DBMS code lines for OS and HW capabilities is an honorable pursuit – of course, it helps benchmark performance, but it’s also generally useful to people who want to run that DBMS on that hardware and that OS. Chasing benchmark success did a lot to develop DBMS performance through the early years of RDBMS. And I believe we are at the beginning of a cycle of similar leapfrogging as the new analytic DBMSs compete. If the TPC-H is not yet the best benchmark it could be, hopefully it will evolve – as the TPC-A did – as a continuing contest between testers and contestants.

  5. 100:1 is “more than two orders of magnitude” better than the best compression possible? You obviously made a typo there.

    Anyhow, thanks for the clarification on RAM use.

    I’m highly suspicious of benchmark-specific R&D. But as Merv points out, that kind of thing is not ALWAYS valueless.

    As for the disk — clearly, there are also some commercial implementations where, for better MPP performance, enterprises keep their disks pretty empty. But 96% empty disks? That’s pretty unrealistic even for those cases. If an enterprise has a 30 TB data warehouse, I doubt ParAccel or many other vendors would recommend a 43 node SATA system.

  6. Merv;

    Your comment:

    Chasing benchmark success did a lot to develop DBMS performance through the early years of RDBMS. And I believe we are at the beginning of a cycle of similar leapfrogging as the new analytic DBMSs compete.

    could not be more correct.

    High performance benchmarking is somewhat analogous to Formula One racing. Car companies spend millions of dollars supporting their Formula One teams driving cars that the public will never buy. But many of the lessons learned from the race track filter back into the cars we eventually wind up buying.

    Competitive benchmarks, like competitive auto racing, do indeed drive innovation.

    1. I want as many visitors as I can get – that’s one of the measures of success for me. But for the reader who wants to follow a linear thread in a coherent fashion, it certainly makes sense. I’ll see you there.

  7. @Richard – “But many of the lessons learned from the race track filter back into the cars we eventually wind up buying.”

    This I’d love to have examples of! The testing that _does_ end up in the retail models is typically done on internal tracks for secrecy purposes (I was at the Benz one in Stuttgart, and I can assure you getting in there is harder than getting into the WH) – F1 is all marketing budgets.

  8. So what is this all about, guys? Why did Curt go on the offensive against ParAccel? Is it just about money? Is ParAccel not paying him now while others do? Or is this something else?
    It has to be personal, seeing that Curt presents mostly the negative side. By the way, it damages his credibility as an independent analyst.
    I’m not asking this question on Curt’s site because I’m interested in others’ opinions as well.

    1. I wouldn’t presume to speak for him. But Curt is passionate about the TPC in general, and not just ParAccel. For more, you should read his blog on the subject (as it appears you have) and ask him.

    1. Thanks, Amrith. Interesting discussion and I recommend it to anyone who gets to this comment. First time I’ve had 25 comments on one post.
