Infobright Bids to Anchor An Open Source DW Ecosystem
July 7, 2009
I recently sat down for a talk with Miriam Tuerk, CEO of Infobright – an open source, commodity hardware-based analytic database (ADBMS) vendor focused on the data warehousing market. Infobright is another of the leaders in the open source information management wave IT Market Strategy has been tracking. Founded in 2006, Infobright has assembled a remarkable team now committed to exploiting this economic model to reduce the startup costs of data warehousing. Like other open source players, MySQL-based Infobright has two versions: a Community Edition (ICE, whose community gathers at www.infobright.org) and an Enterprise Edition (IEE). This bifurcation allows it to distribute starter software broadly at minimal direct cost, then upsell; along the way, it gets to tap into the vibrant innovation provided by the user community that forms. As the product matures, such vendors fund the more hardened features large firms require by charging them for those added capabilities that they need. And now (July 7), Infobright has partnered with Jaspersoft for tighter integration with a report server and OLAP analysis.
Infobright is privately held, with roughly 50 employees in Canada, the US and Europe, where a development team in Poland does much of the heavy lifting. Investors including Sun and Flybridge Capital Partners injected a reported $10M into Infobright in 2008; the company doesn’t discuss revenue, but considers funding “adequate through the end of 2010.” I expect that it will seek additional funding rounds as its infrastructure buildout continues.
Infobright moved into general release under a GPL license in September 2008. With three product releases under its belt, the company reports nearly 10,000 ICE downloads and claims that 2,000 of those users are active community participants. There are now over 60 paying customers in 7 countries. The new “integrated virtual machine download” announced today includes: ICE; the JasperServer Community Edition for report creation, delivery and scheduling; and JasperAnalysis, an OLAP server.
Infobright has implemented a column-oriented data store, deployed atop MySQL, in which each column is divided into groups of 65,536 row elements known as Data Packs. The packs are compressed as they are stored – 10:1 compression was an early claim, but the company says it frequently does far better. Statistics about the data (min/max, cardinality, etc.) are stored in a “Knowledge Grid” – essentially an indexing scheme, not unlike what vendors such as Illuminate use, that limits retrieval to the data needed to resolve the query in question. Query tests in customer use cases routinely deliver sizable improvements in query times, as we have seen with other players in the new analytic DBMS space.
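To make the Data Pack and Knowledge Grid idea concrete, here is a minimal Python sketch of the general technique: split a column into fixed-size packs, compress each one, and keep min/max statistics so a range query can skip packs that cannot match. This is an illustration only, not Infobright's implementation. The names `DataPack`, `build_packs` and `count_in_range` are hypothetical, zlib stands in for whatever compression the product actually uses, and only the 65,536-row pack size comes from the description above.

```python
import struct
import zlib
from dataclasses import dataclass
from typing import List, Sequence, Tuple

PACK_ROWS = 65_536  # pack size cited in the article


@dataclass
class DataPack:
    """One compressed slice of a column plus its summary statistics (hypothetical)."""
    lo: int       # minimum value in the pack
    hi: int       # maximum value in the pack
    count: int    # number of rows in the pack
    blob: bytes   # compressed values (zlib is an assumption, not the product's codec)


def build_packs(values: Sequence[int], pack_rows: int = PACK_ROWS) -> List[DataPack]:
    """Split an integer column into packs, compressing each and recording min/max."""
    packs = []
    for start in range(0, len(values), pack_rows):
        chunk = values[start:start + pack_rows]
        raw = struct.pack(f"<{len(chunk)}q", *chunk)  # 8-byte signed integers
        packs.append(DataPack(min(chunk), max(chunk), len(chunk), zlib.compress(raw)))
    return packs


def count_in_range(packs: List[DataPack], lo: int, hi: int) -> Tuple[int, int]:
    """Count rows with lo <= value <= hi.

    Returns (matching_rows, packs_decompressed). The statistics decide each
    pack's fate before any data is read.
    """
    total = 0
    opened = 0
    for p in packs:
        if p.hi < lo or p.lo > hi:
            continue                 # cannot match: skip without decompressing
        if lo <= p.lo and p.hi <= hi:
            total += p.count         # entirely inside the range: stats suffice
            continue
        opened += 1                  # partial overlap: must look at the values
        vals = struct.unpack(f"<{p.count}q", zlib.decompress(p.blob))
        total += sum(1 for v in vals if lo <= v <= hi)
    return total, opened


if __name__ == "__main__":
    column = list(range(200_000))    # synthetic, sorted column for illustration
    packs = build_packs(column)
    print(len(packs), count_in_range(packs, 70_000, 70_999))
```

On this synthetic sorted column, a narrow range query decompresses only one of the four packs. Real data is rarely this well ordered, so how much gets skipped depends on how values are distributed across packs.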
Infobright loads data quite rapidly on commodity hardware, and asserts that, as a result of the architecture, load speed will remain constant regardless of raw data size. The MySQL loader can be replaced with the Infobright loader in IEE to ensure high-speed loads at scale. Infobright makes familiar assertions about “load and go”; certainly, with no careful designing of models, partitions and indexes, time to usage is significantly reduced. “Hardware setup and configuration can be done in a day,” company marketing asserts. Infobright offers several different claims to scalability, including “to 50 Tb and more.” It inherits management tools and partnerships from its MySQL heritage, and thus also benefits from MySQL’s ability to run on Linux, Solaris and Windows, and to work with Ruby on Rails, Perl, Python, etc.
The firm’s new CTO Bob Zurek, who joined in Q2 2009, is an example of the seriousness Infobright brings to its commitment to enterprise-class offerings. Bob was most recently CTO and VP of Products at EnterpriseDB, after a distinguished career that includes stints at IBM, Ascential, Sybase and Powersoft. Partnerships are playing a key role, and Zurek’s industry experience will no doubt have a big impact in working cooperatively with other OSS vendors and commercial ones. At the MySQL conference, the company announced an “open source project for End-to-End business intelligence” combining Jaspersoft’s BI tools, ETL (Talend-based) and Infobright’s DW. Shortly thereafter, it unveiled a hardware and software system for the deployment of BI with Pentaho, based on the Sun Fire X4275 storage server or the Sun Storage 7310 unified storage system. And today’s announcement adds the JasperServer report server and the JasperAnalysis OLAP server for yet another configuration.
But announcements are not enough. True integration needs to be shown if the company wants to move into mainstream shops that don’t want to do all the work themselves, and the degree of pre-integration is not clear just yet. To date, Infobright has signed up some 30 partners, and turning all of those technology deals into meaningful deliverables will take focus, experience, and some legwork to communicate successes. But the funding is there, the experience is in place, and Infobright joins the battle with some strong assets. Download ICE and check it out – it’s worth a look.