Talend, a California-based open source data integration vendor with a development center in China, first shipped product in late 2006, and two and a half years later has established a strong, growing business as more and more firms attempt to build a relatively complete stack of open source data management software. With a recent $12M round of financing, Talend continues to build out its commercial infrastructure, and can be expected to raise its profile and continue its growth in a conservative market that nonetheless is aggressively pursuing information management technologies. Open source, tight economics, prohibitively expensive licensing models based on data volume, and huge maintenance costs are transforming buyers’ thinking about these products and opening the door for Talend and others.
Talend claims 900,000 “core product” downloads have yielded 250,000 active (i.e. registered) users. Going from there to over 500 paying customers in less than two years makes a good story – especially when Talend asserts that a third of them come from the Fortune 1000. An Eclipse-based product upgrade mechanism makes routine registration well worth it, and no doubt helps account for the relatively high download-to-registered-user ratio. As users move up to a full, priced relationship, they get enterprise capabilities such as multi-user support and load balancing, tech support, etc. The products offer graphical, business-oriented data modeling, data profiling, metadata discovery, connectivity to most widely-used systems and data sources (including SAS and SAP), cleansing capabilities, scheduling and more.
A surprise for me was Talend’s assertion that half of its go-live projects – and a big piece of its differentiation – are in operational data integration (ODI) used for application upgrades, data migration and replication; [added: the other 50% is considered BI by Talend.] Based on research from IDC, TDWI and others, Talend is convinced that ODI represents a great market opportunity, and there is good reason to believe they’re right. Its price advantage – no “data tax” based on volume, but rather a “number of active developers” pricing scheme – benefits greatly from this profile.
A paradox of ODI activities is that since they are less glamorous than BI-related projects, they are rarely staffed with visible, continuously employed specialists. Individual projects may be small and tactical, and often assigned to staff that don’t remain specialized in data migration, data quality, or other related disciplines. Skills and reusable methods and code are not as easy to find inside the organization. Enter open source, and a community model for collecting connectors and translation practices. Talend asserts that fully a third of its 400 connectors originated in the community – and are freely shared.
Partnerships are crucial in an integration-focused play, and Talend boasts a marquee list that includes open source stalwarts like Jaspersoft (who OEM the product), big brand partners such as Microsoft and Teradata, and system integrators such as Capgemini and Unisys.
Talend has formidable competitors – IBM and Oracle top the list. Fortunately, both have formidable prices too, and complex, massive offerings. Talend has been getting in under the radar a lot, and is likely to continue to until and unless the big firms start to rethink pricing. Nothing new here – the same model is being seen across the software industry as open source gathers credibility and momentum. The timing is perfect: spending constraints are tough, software asset management efforts are showing how much software is unused, and old-style licensing models are forcing companies to pay massive amounts of “maintenance” money for rarely used, rarely updated software. Talend is one of the leaders of the new wave, but they are not alone, and they will continue to benefit from the industry transformation they are helping to drive.
[Quick add: I just came across a nice piece published by James Governor in January, which includes some added nuances about the use of the community for localization, about the investors helping to fund Talend, and some European distribution and potential market prospects. Recommended reading.]
37 thoughts on “Talend Uses Open Source and Community to Transform Data Integration”
Since you are covering Talend, maybe you can answer a question. What does Talend really do about data replication, or, as they also call it, “change data capture”? As far as I know, there are a few players in the market who read database logs and propagate the transactions. Did Talend implement something similar? What are they talking about?
Good question, and I have not dug deeply into this. Talend announced CDC about a year ago here: http://www.talend.com/press/Talend-Announces-First-Open-Source-Product-with-Change-Data-Capture-for-Real-Time-Data-Integration.php
But CDC is notoriously tricky if you want good performance without degrading the source’s performance in the process. GoldenGate has done well competing in this space; Sybase Replication Server, and offerings from IBM, Oracle and others all have their advocates. As always, the best way to get something that meets your needs is to test it on your site, with your data, under your typical environmental conditions.
True. I heard their webinar, where a Talend sales rep said they have real-time change data capture. Unfortunately, I did not find database log scraping in Talend when I evaluated it. The generated Java code seems convenient at first sight and might be affordable. However, the performance sucks when you add some load and try to replicate from Oracle to MySQL. Talend probably uses it as a marketing gimmick. My team has looked into a couple of high-end solutions that are supposed to work well, and now it all comes down to cost. IBM’s DataMirror and GoldenGate are way too expensive. I don’t know who can afford them these days. The Sybase product is priced more reasonably, as is the WisdomForce product.
If you looked at Sybase a while ago, check back in – they have done some work on their Oracle replication technology; it’s the first real major upgrade in some time and they claim it has considerably improved as well as adding support for more recent Oracle releases. I suspect Talend’s will improve over time; it is the most widely installed database in the world, after all, and I would imagine they will work on improving that capability.
Sorry for the sidetrack; I’ve heard Oracle is rumored to be looking at GoldenGate — anyone have any insight into this?
@Merv: thanks for the post. If I may, just one minor correction: it’s 50% of our projects that are related to BI/analytics, not 20%. The other half is indeed Operational Data Integration.
@Ryan: Changed Data Capture is one of these features usually associated with complex, enterprise projects, and it’s not part of our GPL product – only the enterprise version. So no surprise you did not find it. If you’d like a custom presentation of how it works, please PM me (yvesm at talend dot com) telling me where you are based and I’ll put you in touch with the right person.
I’ll adjust the doc, Yves, and apologies for misunderstanding. My notes say that 20% was BI, 30% was considered “migration” and 50% ODI. Did I misinterpret that they were exclusive, or that migration is analytics?
I probably threw too many numbers at you… data migration is definitely part of the OpDI market, and it’s about a 50/50 split between OpDI and BI for us (I don’t have more granular details on which part data migration represents, sorry).
On the other hand, we do a lot of conversion from proprietary vendors to open source and I may have used the term “migration” for that during our discussion. Apologies for confusing you!
Hi Merv – am I reading this wrong, or is ODI a close fit to EAI? If so, it’s not surprising that Talend would have a big price advantage, since I’ve been hearing complaints about high EAI costs in the hundreds of thousands since the early 2000s. Otoh, if it’s closer to EII, the price advantages vs. what’s out there (from IBM WebSphere Federation Server or whatever it is these days, to Sybase/Avaki, to Attunity, to Red Hat/Metamatrix) compared to the added functionality in EII, don’t seem to present as strong a case.
Wayne, here’s a response from Yves:
EAI is a pretty generic term that encompasses many disciplines, and OpDI is one of them – it’s one of the ways to get applications to exchange data. EII is another subset of EAI. OpDI and EII cover very different needs, since EII does not actually move data but provides federated views (lots of performance issues, no persistency).
This said, Wayne is right when he says that EAI deployments are traditionally very expensive – but so are traditional ETL deployments. In all cases, open source decreases the bill and the risks of the projects.
My take – I haven’t looked to see what the latest definitions of the categories are; I’m less interested in labels than use cases these days. Talend focuses on very specific ones involving the movement of data from one place to another, and not ongoing interapplication activity, which seems to me to be what EAI often is. I can hardly disagree on the cost question, and for me as an outside observer the issue of risk is for the moment still a bit up in the air, though that is more about whether I have done much due diligence on the issue. (I have not.)
In regard to CDC: from what I read on their user forums, CDC is based on DBMS triggers and doesn’t use log scraping. I’d expect that this could seriously impact the DBMS performance of a source system. In any case, I’ve downloaded it and am having a deeper look. We’re in the DW business and resell one of those ‘expensive’ ETL solutions so it’s nice to see what the open source version looks like.
BTW: Does anyone know if Talend is available as a preconfigured Linux VMware image?
@Tim: if you listen to DBAs, both CDC approaches (triggers and logs) have an impact on performance, depending on how they are configured. Question is, how much. Users who have done performance testing of Talend’s trigger-based CDC found that the performance impact was minimal (a few percent). In many cases, logging is not activated on a production database and enabling it can create a big performance hit.
This said, if you are interested in log-scraping CDC, Talend has a partnership with Attunity, the leader in that field: http://www.talend.com/press/Attunity-and-Talend-Form-Partnership-to-Provide-an-Enhanced-Integration-Solution-for-Customers.php
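For readers who have not seen the trigger-based approach debated above, here is a minimal sketch of the general technique, using Python's built-in sqlite3 module. It is not Talend's implementation; the table names (`customers`, `customers_changes`) and helper functions are hypothetical and chosen only for illustration. Triggers on the source table copy every insert, update and delete into a change table, which an ETL job can then poll.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory source table plus a trigger-fed change table.

    Assumption: table and trigger names are illustrative, not taken from any product.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);

        -- Change table: one row per captured change, in commit order.
        CREATE TABLE customers_changes (
            change_id INTEGER PRIMARY KEY AUTOINCREMENT,
            op        TEXT,     -- 'I', 'U' or 'D'
            id        INTEGER,
            name      TEXT
        );

        CREATE TRIGGER customers_ins AFTER INSERT ON customers BEGIN
            INSERT INTO customers_changes (op, id, name)
            VALUES ('I', NEW.id, NEW.name);
        END;

        CREATE TRIGGER customers_upd AFTER UPDATE ON customers BEGIN
            INSERT INTO customers_changes (op, id, name)
            VALUES ('U', NEW.id, NEW.name);
        END;

        CREATE TRIGGER customers_del AFTER DELETE ON customers BEGIN
            INSERT INTO customers_changes (op, id, name)
            VALUES ('D', OLD.id, OLD.name);
        END;
    """)
    return conn


def read_changes(conn, after_change_id=0):
    """Return captured changes newer than the last change id already processed."""
    return conn.execute(
        "SELECT change_id, op, id, name FROM customers_changes "
        "WHERE change_id > ? ORDER BY change_id",
        (after_change_id,),
    ).fetchall()
```

The sketch also shows where the overhead in this debate comes from: every write to the source table triggers a second write in the same transaction. A log-based tool avoids that extra write by reading the database's own transaction log instead.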
Interesting discussion. If you ask an experienced DBA who manages a transactional database about triggers, the feedback will be very unpleasant. Triggers simply do not scale and cause many problems, especially in a production environment. I guess this is why Attunity, GoldenGate, WisdomForce and Sybase are able to sell their log-scraping products. Although it looks like Attunity is having financial troubles despite its OEM deal with SSIS.
Interesting discussion indeed. For fair presentation, please note that I am with Attunity. And to clarify what Todd mentioned, we are in no financial trouble whatsoever. The opposite is true, as we are profitable and growing.
We have been dealing with CDC since 2003 and have quite a lot of experience working with different databases. As Todd highlights, indeed most DBAs perceive trigger-based CDC as intrusive and rightfully so. Beyond the resource overhead, customers have also reported other issues including impact on database transactions (time, congestion, and potential exposure to errors) as well as increased maintenance overhead.
Log-based CDC is a mature technology. And integrating it with ETL tools like Talend, as Yves mentioned, makes it possible to cut down ETL load times as well as run jobs on a more frequent basis, even near real-time.
Itamar, thanks for jumping in here – and apologies for not having come to you to ask about the assertion Todd made about your finances. I lost track of the comment thread on this post for a couple of weeks while I worked on some other projects. Hopefully we can check in sometime soon and catch up on Attunity’s status.
Hello, I wanted to comment on my previous note about Attunity’s difficulties. It was based on rumors about financial difficulties and massive lay-offs. Attunity should disclose its financial details so that everyone can review them on the Attunity website. You could probably see that Attunity has scaled down significantly in the last few years. Their stock has been in the low pennies for a long time and got delisted from everywhere. Itamar might confirm or deny those facts (not the rumors) with more details.
Todd, Attunity does release financials, (unaudited) and discusses on a GAAP and non-GAAP basis where they are. Clearly there have been challenges. But they have revenue and partnerships that suggest the possibility of recovery as the economy turns up, although challenges certainly exist. I’ll be looking at them later in the year, with any luck.
I am convinced that the traditional vendor pricing of ETL / DI software is completely outdated and unsustainable. And it’s becoming clearer every day that Informatica and others are becoming really concerned about the new wave of data integration software vendors and products. And they should!
At the same time, I don’t think that Talend’s approach of not disclosing their full product price list to the public is a good practice either. Go to their website and try, for example, to buy their team edition online and you’ll be surprised to see what it says: “For information about subscription services in your region, inquiry now”. So much for “free” open source software!
Our software pricing is completely transparent and posted on our site at http://www.expressor-software.com/list_prices.htm.
Similar to Talend, we recognize that we wouldn’t be able to build a viable company without charging customers for enterprise-class DI functionality. But that doesn’t mean that we can’t be fully transparent on what we charge for our products and services.
For further amusement on ETL/DI software pricing, check out my August blog at http://blog.expressor-software.com/etl/data-integration-software-pricing-gone-crazy/.
I’m sure we’ll hear from Talend about this, and transparent pricing is indeed a topic many people care about – although it’s hardly broadly spread across the software vendor community.
Michael, here is some criticism. You might want to revisit your marketing campaign. As they teach in business school, it is a bad idea to use price comparison as the main differentiator in enterprise software unless you are an open source vendor like Talend. Open source vendors can get away with it because it is natural for them: open source means free or very cheap, and the business model is to make money from services. Besides, how many customers do you have to back a claim of superiority over Talend or Informatica? Informatica has been a leading DI vendor for a long time and has a good product overall. People are ready to pay top dollar for it. Besides, the difference from your license cost is not that significant when you take into account that your product is unproven. Lastly, posting links to the Expressor website in this blog post about Talend might improve your Google rank, but at the same time it damages your company’s image and makes it harder to take seriously afterwards.
At Talend, we are all for being open – after all, we make our source code available, our development process is transparent, our product roadmap also. Users can see how it works and what we are working on.
When it comes to pricing information, we are also transparent, but only with potential buyers. When someone calls with a need, we are not withholding price information until we have done a POC and locked them in (yes, it happens, believe me!). There is no surprise, the price is clear, and presented up front. The buyer knows exactly what (s)he is getting into.
Sorry to break it to you Michael, but this is a tough world out there. We fight for business against big, established players. And we are not going into the ring with our hands tied behind our back by telling them what they are up against – not unless they play by the same rules, which they don’t.
Choosing data integration software is complex. We have 3 editions, the data quality options, the massively parallel edition, the real-time edition… unlike Expressor, we don’t believe in one-size-fits-all software.
I am glad for Expressor that their software is so simple that their price fits on one line, and that their customers don’t have options to consider. It may help them in low end deals against Pervasive, and if it can contribute to broaden adoption of data integration, I am all for it.
But when you are serious about doing business in the enterprise, publishing your price doesn’t help anyone – except your competitors.
Yves, most of us are quite familiar with how to buy high-end automobiles these days. Their sticker prices are pretty clear — 90% of the options are included. Dealers don’t withhold the price until you have test-driven the car.
Choosing data integration software shouldn’t have to be any different. Your claim that you have to play the same game that the big players do seems as disingenuous as your claim that you are all for being open.
You are no more open than Informatica and IBM. And by the way, your open source model is equally disingenuous and not pure since you charge for “value-added” features.
Being a student of our industry, you know we don’t compete against Pervasive. We compete in the enterprise and our closest competitors are Ab Initio and Informatica.
Our customers like the fact that they can learn about our pricing even before they start talking to us or start a POC. And we think that publishing our price list definitely helps them and us and not our competition.
Michael, I realize you’re making a competitive point here, but pricing visibility is a nuanced issue, with many differing implementations. I think many of us would like to see all vendors publish all prices, but Yves makes a good point about not unnecessarily handcuffing yourself when the others don’t do so. And the variability of versions, open and closed (most often seen in “open core” models) is not unusual either. I hope we can have this conversation without attacking one another.
Michael, I am glad for you that you can sell DI software like cars. I just hope that your pricing is more “real” than the one of car vendors. Who pays sticker price for a car nowadays?
As they say, don’t feed the trolls… so I won’t. If you want to take on the commercial open source model, that has proven highly successful for companies such as MySQL, SugarCRM, Jaspersoft, Alfresco, Pentaho, Talend, and many others – be my guest. Users know where the value is.
Such arrogance could be a bad thing for a young company that is just starting to discover itself.
I’m not intimate enough with OSS pricing models to jump in here but it does occur to me that while OSS and proprietary vendors beat each other up about whose pricing is more open and/or less complicated to figure out, the big winners on that front are SaaS on-demand players with a monthly per seat costing structure – At least those are straightforward and easy to calculate.
A lot of vendors play a game where they publish a list price but when it comes to talking to customers they do some haggling, some feature bundling, some training and consulting add ons and come up with a custom price. Marquee customers can often get very deep discounts. So my question to Talend – why don’t you just show a high list price and then haggle with customers to make it more attractive? My question to Expressor – do you stick to your list price on all your sales? Are you as transparent as you make out to be?
Vincent, in regards to your question above, we absolutely stick to our list price on our sales.
We haven’t encountered situations yet where a customer felt that our software was overpriced for the value we provide. Usually, the opposite is the case. But I am also open to admit that we have discounted our software in a few situations to close a deal within a given quarter.
On your second point, we do not add product features or bundle services to “sweeten” a deal. Given our usage-based pricing approach, we have chosen to provide all product features including GUI tools, our semantic metadata repository, our parallel data processing engine, and all our connectors as part of a sale. There are no restrictions on the number of connectors or GUI tools a customer can use.
Vincent, if you have links to the price lists of Informatica, Ab Initio, SAP, IBM and Oracle I am very interested… Even if that’s the pre-discount prices!
Showing a higher-than-reality list price and “haggling” with each client would be exactly what we want to avoid – price opacity, customer-specific pricing, etc.
Vincent: good point, although the majority of customers are wise these days. They are listening to guys like Ray Wang.
By the way, did anyone pay attention to what Michael and Yves just did? 🙂
Oracle has a global price book and has the one price for its products across every country. This means Oracle products have become very cheap in Australia with the rising dollar! I was so surprised that Oracle offered ODI and Informatica for the same price that I blogged about it: http://it.toolbox.com/blogs/infosphere/oracle-bi-gambit-cutting-the-price-of-informatica-by-over-70-30160
IBM has a button on most product pages “view prices and buy”, when I click on the button on the DataStage page it takes me to a list of prices of about 24 DataStage products and addons. I used IBM price books to create the post on the cheapest and most expensive places to buy DataStage: http://it.toolbox.com/blogs/infosphere/25184.
You’ve still got to take that DataStage per value unit price and multiply it by your processor/core type, which can be complex. Plus there are different rates for prod and non-prod (and other vendors who don’t charge for non-prod). And some vendors charge for database sources/targets. Some charge for ERP addons. It gets complex. You would never pay list price for IBM software; always haggle.
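To make that arithmetic concrete, here is a rough sketch of the multiplication described above. Every figure in it is hypothetical: the unit price, the value units per core, and the non-production rate are invented for illustration and are not taken from any vendor's price book. The `license_cost` helper and its `rate` multiplier are likewise assumptions about how such a calculation might be modelled.

```python
def license_cost(price_per_value_unit, value_units_per_core, cores, rate=1.0):
    """List cost = unit price x value units per core x number of cores x rate.

    `rate` is an assumed multiplier for environments priced differently
    (for example, a reduced non-production rate).
    """
    return price_per_value_unit * value_units_per_core * cores * rate


# Hypothetical inputs only: 100 per value unit, 70 value units per core.
prod_cost = license_cost(100, 70, cores=8)               # production server
nonprod_cost = license_cost(100, 70, cores=4, rate=0.5)  # assumed half-rate test server
total_list_cost = prod_cost + nonprod_cost
```

Even this toy version needs four inputs per server before any charges for sources, targets or add-ons, and before any haggling.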
I don’t think Informatica publishes any prices other than the cloud services, in fact if you google “informatica price” my blog post on the Oracle pricing of Informatica comes up first.
As for Ab Initio – you cannot really compare them to any other vendor. They live in their own world.
Nice discussion, Vincent. The pricing approaches of software vendors have helped keep procurement specialists, lawyers, accountants and consultants employed for years. One interesting shortcut when available is to find out the GSA listing for US government pricing and insist on doing at least that well. For DBMSs that have done TPC benchmarks, prices used in the price/performance computations are required to be available to customers as well, or the benchmark must be withdrawn. There are clues everywhere…