In a market suddenly awash with new analytic DBMS entrants, Aster Data Systems differentiates itself with an aggressive posture: in-database computations, MapReduce integration and commodity hardware. Like several other firms I’ve talked to recently, the San Carlos-based vendor has a Big Customer (MySpace), a Recent Launch (May 2008) and a Core Team of Hotshots with industry experience. They have been quick out of the gate, and boast 15 customers who are tackling “frontline data warehousing” for problems they could not solve any other way.
It’s been fascinating to watch new players garner early wins by asserting that general-purpose databases are not optimal for analytics. Since Sybase IQ entered the market in the early 1990s, other specialists like Essbase and Red Brick have disappeared into larger firms, and their messages have largely vanished from the marketing of companies like Oracle, IBM and Microsoft. Those messages can still be found if you hunt for them, but the fundamental point is not made: analytic problems are better served with specialty products – like OLAP, columnar databases and other architectural innovations.
Aster’s 3 founders – CEO Mayank Bawa, CTO Tasso Argyros and Chief Scientist George Candea – were all involved in Ph.D. projects at Stanford. They got together to found a company after tackling some large-scale problems for an emerging Valley startup or two. Their conclusion: commodity hardware, in-database analytics, and a carefully interconnected multi-tier architecture would allow them to bring a cost-effective, differentiated product to market.
Some of the architectural pieces are unsurprising. Managing state at scale is the big problem, and the flagship nCluster 3.0’s Ethernet interconnect (gigabit or 10-gig) has made that fairly efficient over a large number of commodity (and increasingly cheap) servers. Internally, nCluster uses 3 tiers: dedicated loading nodes that bring data in and partition it; (redundant) working nodes that perform reads and updates at the same time; and a “Queen” server group that plans queries. The tiers can be scaled independently and configured differently – for example, loaders can be almost diskless. A lot of power can be brought to bear – 800 cores over a 100-node cluster makes a big impression on most problems.
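To make the division of labor among the three tiers concrete, here is a toy sketch in Python. It is not Aster’s code or API: the class names (`Loader`, `Worker`, `Queen`), the hash-partitioning scheme and the sample rows are all assumptions made for illustration, and the redundancy of the working nodes is not modeled.

```python
from zlib import crc32


class Worker:
    """Hypothetical working node: holds one partition and scans it locally."""

    def __init__(self):
        self.rows = []

    def scan_count(self, predicate):
        # Each worker answers only for the rows it owns.
        return sum(1 for row in self.rows if predicate(row))


class Loader:
    """Hypothetical loading node: partitions incoming rows across workers."""

    def __init__(self, workers):
        self.workers = workers

    def load(self, rows, key):
        for row in rows:
            # Assumption: hash partitioning on one column. crc32 is used
            # because it is stable across runs, unlike hash() on strings.
            idx = crc32(str(row[key]).encode()) % len(self.workers)
            self.workers[idx].rows.append(row)


class Queen:
    """Hypothetical coordinator: fans a query out and merges partial results."""

    def __init__(self, workers):
        self.workers = workers

    def count_where(self, predicate):
        return sum(w.scan_count(predicate) for w in self.workers)


# The loader keeps no rows itself, which is why it can be nearly diskless.
workers = [Worker() for _ in range(4)]
rows = [{"user_id": i, "clicks": i % 5} for i in range(1000)]
Loader(workers).load(rows, key="user_id")

queen = Queen(workers)
total = queen.count_where(lambda r: r["clicks"] >= 3)
print(total)
```

Because each role lives in its own class, adding workers or loaders changes only the list passed in, which is the sense in which the tiers can be scaled independently.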
Software architecture inside the engine is a leading message. nCluster supports ANSI SQL, and it also supports developers who want to write code that runs in-database. The idea is to wrap their work in SQL; Aster is extending the language with SQL MapReduce. The alternative, says Ajeet Singh, Director of Product Management, is to require a single script written by one person who needs both sets of skills. Aster’s model promotes reusability: one programmer codes in Java and the SQL programmers can all use the result. A separate execution environment also provides process isolation, so problems don’t affect other users.
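The reuse argument can be illustrated by analogy with Python’s standard `sqlite3` module, which lets a programmer register a function once and expose it to anyone writing SQL. This is only a stand-in: it is not Aster’s SQL MapReduce syntax, the function `domain_of` and the `clicks` table are invented for the example, and SQLite runs the function in-process, so the process isolation described above is not shown.

```python
import sqlite3


def domain_of(url):
    """Per-row 'map' step, written once by the procedural programmer."""
    # Drop the scheme if present, then keep everything before the first slash.
    return url.split("://", 1)[-1].split("/", 1)[0]


conn = sqlite3.connect(":memory:")

# Register the function under a SQL-visible name; it takes one argument.
conn.create_function("domain_of", 1, domain_of)

conn.execute("CREATE TABLE clicks (url TEXT)")
conn.executemany(
    "INSERT INTO clicks VALUES (?)",
    [
        ("http://a.example/x",),
        ("http://a.example/y",),
        ("http://b.example/",),
    ],
)

# The SQL author needs no knowledge of how domain_of is implemented.
# GROUP BY / COUNT plays the role of the 'reduce' step here.
result = conn.execute(
    "SELECT domain_of(url) AS d, COUNT(*) FROM clicks GROUP BY d ORDER BY d"
).fetchall()
print(result)
```

The point of the design is the separation of skills: the function body can change without any of the SQL that calls it being rewritten.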
In February, Aster announced the completion of additional funding that brought its total Series B financing to $17 million, led by JAFCO Ventures and including return backers Sequoia Capital, Cambrian Ventures and First Round Capital. Aster is thus armed, and has a few good references for its half-dozen or so sales teams to leverage, so we can expect to hear much more about nCluster – and some new announcements will up the ante further in the months just ahead.