Hadoop Distributions And Kids’ Soccer

The big players are moving in for a piece of the big data action. IBM, EMC, and NetApp have stepped up their messaging, in part to prevent startup upstarts like Cloudera from cornering the Apache Hadoop distribution market. They are all elbowing one another to get closest to “pure Apache” while still “adding value.” Numerous other startups have emerged, with greater or lesser reliance on the core Apache distribution and with their own extensions to it or substitutions for it. Yahoo! has found a funding partner and spun its team out, forming a new firm called Hortonworks, whose claim to fame begins with an impressive roster responsible for most of the code in the core Hadoop projects. Think of the Dr. Seuss children’s book featuring that famous elephant, and you’ll understand the name.

While we’re talking about kids – ever watch young kids play soccer? Everyone surrounds the ball. It takes them years to learn their positions on the field and play accordingly. There are emerging alphas, a few stragglers on the sidelines hoping for a chance to play, community participants – and a clear need for governance. Tech markets can be like that, and with 1,600 attendees packing late June’s Hadoop Summit event, all of those scenarios were playing out: leaders, new entrants, and the big silents, like the absent Oracle and Microsoft.


More From The Low End: DynamoDB is the New Lucid

LucidDB (aka “the best database for BI you don’t know about”) has a commercial version on the way at last. Nick Goodman, a longtime user active in the Eigenbase and other related open source communities, has stepped in. Nick has a consulting practice that builds BI implementations (many using Lucid and Pentaho), and he has now spun out a firm called Dynamo Business Intelligence to issue and support a product to be called DynamoDB. He often found his BI clients asking what to use for a database – the default was MySQL, but he loves Lucid’s features and performance, so the time seemed right. Nick’s blog can be found here.


What’s An Eigenbase?

The open source community is remarkable in many ways. For me, one of the most significant aspects of it is exactly that: it IS a community. It’s composed of people who communicate and share in deep and productive ways. One of the most interesting manifestations of that spirit I’ve run across is the Eigenbase project, an extensible platform being used by some very creative folks for the creation and continuing development of databases for data warehousing (the LucidDB DBMS) and stream processing (the SQLstream continuous query engine). I haven’t posted about either of those yet but will, and I’m watching their continuing evolution with great interest.
