Hadoop Summit Recap Part Two – SELECT FROM hdfs WHERE bigdatavendor USING SQL

Probably the most widespread, and commercially imminent, theme at the Summit was “SQL on Hadoop.” Since last year, many offerings have been touted, debated, and some have even shipped. In this post, I offer a brief look at where things stood at the Summit and how we got there. To net it out: offerings today range from the not-even-submitted to GA – if you’re interested, a bit of familiarity will help. Even more useful: patience.

Calpont’s InfiniDB – Another ADBMS Insurgent Arises

Calpont, rapidly emerging as yet another contender in the ADBMS sweepstakes, has announced version 2.0 of InfiniDB, its columnar MPP offering over shared storage. The value proposition hits now-familiar themes: high-performance query, fast data loading, data compression, and parallelized user-defined functions (UDFs), all of which are becoming key checkoff capabilities. InfiniDB also hits hard on pricing, which it says dramatically undercuts that of its competitors. And a 30-day free trial of the enterprise edition sweetens the offer. For those comfortable with open source, the 2.0 release of the community edition is available as well. Calpont says the community edition (which is limited to a single server but is otherwise database feature-complete) has had 15,000 downloads. But the company’s relationship with Oracle for its MySQL components must be considered a risk going forward.

InfiniDB, like Infobright, is built atop Oracle’s MySQL. (I posted about Infobright last year, and it also has made significant progress, drawing favorable comment in the open source community for its continuing maturation.) Calpont’s relationship with Oracle must be seen as a risk factor. Oracle’s recent decisions about support raise questions about its interest in supporting anyone who is not an enterprise-class user of the Oracle-branded MySQL offering. Calpont has a deal through 2012 that includes an OEM license to integrate and use MySQL as the InfiniDB-branded solution, and access to the MySQL channel. What will happen beyond that is clearly a concern.

Aster Data Adds Columnar Storage, Puts Stake in Ground for Hybrid Multistores

Aster Data has announced its new version, nCluster 4.6, which now includes a column data store, staking a claim as the first ADBMS to combine SQL and MapReduce on a hybrid row and column MPP system. While its R&D has hitherto been focused on enabling advanced in-database analytic processing in its flagship “Data-Analytics Server,” Aster has clearly had other irons in the fire. CTO Tasso Argyros tells me that the new column store is entirely new, written from scratch to ensure that Aster’s SQL-MR is a universal programming layer atop storage, and that its 1000+ MapReduce-ready analytic functions (and UDFs) will run on both row- and column-based data.

RainStor Adds Funding, Investors, Readies Nearline Archive Rampup

RainStor, a firm I discussed as Clearpace in a June 2009 post, had some very good news this week. $7.5 million in Series B funding came in from Informatica, Storm Ventures and its previous investors Doughty Hanson Technology Ventures and The Dow Chemical Company. RainStor plans to “use the funding to expand into new markets, grow its partner base, and invest in product development and R&D,” says the press release.

Programmers: Pervasive’s Parallelization Provides Punch, Profit

After 27 years of steady growth, Austin, Texas-based Pervasive (PVSW) has become a $47M annual run rate software provider. Its portfolio includes a “zero admin, light footprint database” (the former Btrieve, now PervasiveSQL), data integration software (for SaaS and on-premises applications), and data synchronization products for such apps as salesforce.com, QuickBooks and Microsoft Dynamics CRM. In 2009, it began leveraging its DataRush processing engine as a product, providing a solution for companies that want to take advantage of multicore architectures to drive dramatically enhanced performance on much smaller footprints, for programming data services tasks such as aggregation, de-duplication, cleansing, integration, matching and sorting, as well as data mining and predictive analytics.

SAND Technology Starts 2010 Well After Flat 2009

ADBMS vendor SAND Technology’s report on its 2009 fiscal year seemed to offer little reason to change my earlier skeptical position on the firm. Its 2009 revenue was essentially flat at $7 million (Canadian dollars throughout). Cost of sales, R&D, and SG&A – and the firm’s net loss – were also nearly unchanged. And yet, there are changes going on, and they are positive signs, especially for a year in which the IT market will rebound. Net income for SAND’s fiscal 2010 first quarter was $553,253 on revenues of $2,485,464 – a substantial turnaround from a net loss of $989,850 on revenues of $1,223,928 for fiscal Q1 2009. One quarter is not a trend, but it is a good sign.

Xkoto’s Database Virtualization Expands Cloud Opportunities

Xkoto, the database virtualization pioneer, has generated substantial interest since its first deployments in 2006. Still privately held and in investment mode, Xkoto sees profitability on the horizon, but offers no target date, and appears in no hurry. Its progress has been steady: in early 2008, a B round of financing led by GrandBanks Capital allowed a step up to 50 employees as the company crossed the 50-customer mark. 2008 also saw Xkoto adding support for Microsoft SQL Server to its IBM DB2 base. Charlie Ungashick, VP of marketing for Xkoto, says that 2009 has been going well, and the third quarter was quite strong. And at the end of September 2009, Xkoto announced GRIDSCALE version 5.1, which adds new cluster management capabilities to its active-active configuration model, as well as Amazon EC2 availability.

Oracle on Database: It’s On. And They’re Not Kidding.

Oracle is the company that led the industry into making RDBMS the data persistence vehicle of choice, and though its flagship is still Number One, many other topics floated around as 35,000 people attended Oracle Open World (OOW) in San Francisco recently. The spotlight stayed firmly planted: “What will Larry say about clouds/IBM/Fusion apps?”; Marc Benioff and Larry; Arnold and Larry. But if there’s anything Larry Ellison is passionate about, even as he sets his sights on IBM (hardware) and SAP (apps) – his two most important competitors, he said at the Churchill Club recently – it’s database, and he’s energized by the appliance opportunity. Andy Mendelsohn, SVP of Database Server Technologies put it simply in a conversation: “the only product Larry has spoken of in the last 3 earnings calls is Exadata.” He is more involved than in recent years, and that means one thing: everyone else had better watch out. What analysts learned about the new release makes that very clear: Oracle has been busy, and there is a lot of exciting new technology coming.

Illuminate May be Gaining Traction

I’ve talked about ADBMS vendor Illuminate in two posts already this year, and for a small firm with little North American footprint, they continue to drive a surprising number of questions I receive. I had a quick chat with Andrew Fletcher, the VP responsible for building out the partner network, and he’s upbeat about the firm’s progress.

Kickfire Disrupts DW Economics, Targets Mainstream ADBMS Opportunities

In just 18 months, Kickfire has established itself as one of the most intriguing of the ADBMS insurgents. It espouses a radical go-to-market strategy: target the overwhelming majority of the market in the sub-5TB space, and let others battle over who’s doing best at the top end, fighting over a small group of prospects. Kickfire also takes a radically different architectural approach: it uses an “SQL chip” to run much of its work in hardware, to dramatic effect in performance.

In April 2008, the Kickfire data warehouse appliance was announced at a MySQL conference, and simultaneously the company released 100GB and 300GB TPC-H benchmarks that transformed price-performance expectations at the low end of the market. Six months later the appliance became generally available, and six months after that it had its first production reference. Since then, the company has had two encouraging quarters, and the product is now in the hands of some two dozen early adopters, a half dozen of whom are referenceable production sites. I spent some time recently with Kickfire CEO Bruce Armstrong to discuss the story so far, and Kickfire’s recent announcement of Kickfire 1.5 and the 3000 series appliance.
