Aster Data has announced its new version, nCluster 4.6, which now includes a column data store, staking a claim as the first ADBMS to combine SQL and MapReduce on a hybrid row and column MPP system. While its R&D has hitherto been focused on enabling advanced in-database analytic processing in its flagship “Data-Analytics Server,” Aster has clearly had other irons in the fire. CTO Tasso Argyros tells me that the column store was written from scratch to ensure that Aster’s SQL-MR is a universal programming layer atop storage, and that its 1000+ MapReduce-ready analytic functions (and UDFs) will run on both row- and column-based data.
Storage can be chosen per table or per partition in this release, which lets customers tune performance to the nature of their expected analytical workloads. Aster provides a new Data Model Express tool that assesses sample queries against the data and recommends which store to use. The rules-based recommendation engine can be re-run as the query mix changes.
Aster’s dynamic workload management can be used to set policies against both stores as well, allowing appropriate flexibility in governing how they are used. Aster’s other features – fault tolerance, compression, indexing, automatic partitioning, SQL-MapReduce, and its prepackaged frameworks for statistical, graph, time series and partner-built functions – will all work across both stores, transparent to the users. The release also adds functions for Symbolic Aggregation approXimation (SAX), decision trees, and histograms. SAX is useful for reducing the dimensionality of time series and…well, follow the link. If you’re into that kind of work and don’t know of it, you’ll be interested.
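For readers who have not met SAX, here is a minimal sketch of the general technique in plain Python. It is not Aster's implementation, and the function name `sax`, its parameters, and the sample series are my own assumptions for illustration. The idea: z-normalize the series, average it over equal-width segments (piecewise aggregate approximation), then map each segment mean to a letter using breakpoints that split the standard normal distribution into equally likely regions.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Return a SAX word for `series` (hypothetical helper, illustration only).

    Assumes len(series) is an exact multiple of n_segments.
    """
    if n_segments <= 0 or len(series) % n_segments != 0:
        raise ValueError("series length must be a multiple of n_segments")

    # Step 1: z-normalize so the values are comparable to a standard normal.
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        z = [0.0] * len(series)  # a flat series has no shape to encode
    else:
        z = [(x - mu) / sigma for x in series]

    # Step 2: piecewise aggregate approximation, one mean per segment.
    width = len(z) // n_segments
    paa = [mean(z[i:i + width]) for i in range(0, len(z), width)]

    # Step 3: breakpoints cut N(0, 1) into len(alphabet) equiprobable bins.
    a = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / a) for i in range(1, a)]

    # Step 4: each segment mean becomes the letter of the bin it falls in.
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


word = sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4)
print(word)  # abcd
```

Eight readings become a four-letter word, which is the dimensionality reduction in question; a steadily rising series maps to letters in ascending order.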
Aster sees demand for a single platform for different types of applications; it believes customers want to put all their data in one place. If you’re following developments in analytical processing, the argument is familiar: in-database processing eliminates the latency inherent in shipping data across the network to a processing tier.
Aster describes a customer scenario in which a behavioral analysis is run against purchase data in a row store to determine purchase indicators, and the derived information is stored in a column store for ad hoc analysis. This notion of subsuming multiple stores “within” the database is a response to the fact that, as Aster sees it, 70% of customer data does not live in the EDW; Aster Data says its customers are looking for a place to aggregate it and perform complex analytics.
Time will tell whether hybrid multistores prove more effective than keeping data in several places, including “outside” the ADBMS in file systems, and running programs against it there, as many shops are doing today. Aster is placing its bet and testing the waters now. Early experiences are few so far, but as deployments roll out they will begin to show whether one model is preferable to the other.
This version of nCluster, with such a major structural addition, arguably should have been labeled as a major release and called 5.0. It’s refreshing to see a vendor err on the side of too little hype rather than too much.
Disclosures: Aster Data is not a client of IT Market Strategy, but is a sponsor of a multi-vendor research study we are co-authoring.