2013 Data Resolution: Avoid Architectural Cul-de-Sacs

I had an inquiry today from a client using packaged software for a business system that is built on a proprietary, non-relational datastore (in this case an object-oriented DBMS). They have an older version of the product – having “failed” with a recent upgrade attempt.

The client contacted me to ask about ways to integrate this OODBMS-based system with others in their environment. They said the vendor-provided utilities were not very good and hard to use, and the vendor has not given them any confidence it will improve. The few staff programmers who have learned enough internals have already built a number of one-off connections using multiple methods, and were looking for a more generalizable way to create a layer for other systems to use when they need data from the underlying database. They expect more such requests, and foresee chaos, challenges hiring and retaining people with the right skills, and cycles of increasing cost and operational complexity.
My reply: “you’re absolutely right.”

Amazon Redshift Disrupts DW Economics – But Nothing Comes Without Costs

At its first re:Invent conference in late November, Amazon announced Redshift, a new managed service for data warehousing. Amazon also offered details and customer examples that made AWS’ steady inroads toward enterprise, mainstream application acceptance very visible.

Redshift is made available via MPP nodes of 2 TB (XL) or 16 TB (8XL), running ParAccel’s high-performance columnar, compressed DBMS, scaling to 100 8XL nodes, or 1.6 PB of compressed data. XL nodes have 2 virtual cores with 15 GB of memory, while 8XL nodes have 16 virtual cores and 120 GB of memory and operate on 10 Gigabit Ethernet.

Reserved pricing (the more likely scenario, involving a commitment of 1 year or 3 years) is set at “under $1000 per TB per year” for a 3-year commitment, combining upfront and hourly charges. Continuous, automated backup for up to 100% of the provisioned storage is free. Amazon does not charge for data transfer into or out of the data clusters. Network connections, of course, are not free; see Doug Henschen’s InformationWeek story for details.
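For a rough back-of-the-envelope check on those figures, here is a minimal Python sketch. It uses only the numbers quoted above; the function names are my own illustration, the $1,000 value is treated purely as the stated ceiling (not an actual price), and decimal units (1 PB = 1,000 TB) are assumed.

```python
# Figures quoted in the post. The price is the stated ceiling for a
# 3-year reserved commitment, not an actual Amazon price point.
NODE_CAPACITY_TB = {"XL": 2, "8XL": 16}
MAX_8XL_NODES = 100
PRICE_CEILING_PER_TB_YEAR = 1000  # "under $1000 per TB per year"


def cluster_capacity_tb(node_type: str, node_count: int) -> int:
    """Compressed capacity in TB for a cluster of identical nodes."""
    return NODE_CAPACITY_TB[node_type] * node_count


def annual_cost_ceiling(capacity_tb: int) -> int:
    """Upper bound on yearly reserved cost in dollars (hypothetical helper)."""
    return capacity_tb * PRICE_CEILING_PER_TB_YEAR


max_tb = cluster_capacity_tb("8XL", MAX_8XL_NODES)
print(f"Largest cluster: {max_tb} TB ({max_tb / 1000} PB)")
print(f"Yearly cost stays under ${annual_cost_ceiling(max_tb):,}")
```

The largest configuration works out to 1,600 TB, which matches the 1.6 PB figure, and the quoted ceiling implies a fully populated cluster would cost less than $1.6 million per year under those assumptions.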

This is a dramatic thrust in pricing, but it does not come without giving up some things.


Diary of an Asian Swing: Day 3

This was a day of transition. No meetings in Hong Kong, so after a leisurely breakfast and a look at the news, I settled down for a rare session of uninterrupted writing. It was still Sunday back home, so the email was relatively caught up and I could focus. Finished first drafts of some Gartner Magic Quadrant DW DBMS content and sent them off to colleagues for review and assembly into our eventual document.

This MQ is my second, and I’m really enjoying the process this time now that I’m not trying to figure out what happens next. I’m especially pleased with the process of combining data from customer interviews and analysis of our inquiry traffic – hundreds for each of the four authors – with surveys we conducted specifically for the report.

Mark Beyer built a fantastic link for feeding survey criteria measured by numeric scores from customers directly into relevant cells on our underlying spreadsheet. We had already done some collective scoring of our own in those cells, and the new exercise showed us how customers read the same issues. And it moved some of the scores significantly, with some vendors doing better than we expected in some areas, and others getting hammered. When a sizable number of survey respondents highlight an issue like support as a serious weakness, one has to take notice.

Several hours of uninterrupted time, a luxury that made the work move quickly, gave way to a decision about what to do with a free afternoon. I decided to use it for more work, so instead of an excursion I headed to the airport hours ahead of schedule to work in the attractive Cathay Pacific lounge. But I was surprised by a helpful check-in agent who told me there was an earlier flight I could get onto. As a result, I arrived in spectacular Singapore late in the evening instead of well into the night, and was in my hotel for a good night’s rest before early morning meetings the next day.

And of course, working on the plane – even without wi-fi – was just as good as working in the lounge. So I had the chance to complete a new draft of a forthcoming Hadoop Pilot Best Practices piece and send it off to a collaborator. A good day indeed.

Guest Post: Leading the Logical Data Warehouse Charge Has its Challenges

From my colleague Mark Beyer, who speculates about how leadership in moving toward the logical data warehouse (LDW) will be received: 

The logical data warehouse is already creating a stir in the traditional data warehouse market space. Less than 5% of clients with implemented warehouses that we speak with are pursuing three or more of the six aspects of a logical warehouse: 

  • repositories
  • data virtualization
  • distributed processes
  • active auditing and optimization
  • service level negotiation
  • ontological and taxonomic metadata

That means we are in a very early stage regarding the adoption trend, and vendors who are aggressively moving toward it are ahead of their customers.


Decoding BI Market Share Numbers – Play Sudoku With Analysts

In a recent post I discussed Oracle’s market share in BI, based on a press-published chart taken from IDC data – showing Oracle coming in second. As often happens in such discussions, I got quite a few direct emails and Twitter messages – some in no uncertain terms – about why the particular metric I chose was not sufficiently nuanced or representative of the true picture. I freely admit: that’s true. In general, market observers know Oracle is not typically placed second overall – but the picture is more complex than a single ranking. My point was, and is, that it’s too easy to slip into a “who’s on top” mentality that obscures true market dynamics. In this post, I’ll dig a bit deeper, and describe what different approaches or categorizations show us – and what they don’t. Finally I’ll talk about how much this matters – and to whom.

For GoodData, SaaS Changes The Channel Model Too

Last time I mentioned GoodData, it was in passing, as I discussed YouCalc and other SaaS BI players. In the ensuing year, many other toes have been dipped into the water. I sat down with GoodData CEO and founder Roman Stanek and Marketing VP Sam Boonin this week to catch up on how it’s all going, and from where they sit, the news seems to look pretty good. With 40 employees, 25 customers since last November, and a funding round from the likes of Marc Andreessen and Tim O’Reilly, GoodData seems to be off to a GoodStart. And now it has a new initiative: free analytics for other SaaS players to expand its presence.

You Know You Have Big Data When…(Humor)

One of the more philosophical questions analysts like to ask is “What is Big Data?” It’s relative – it begs the question, “what’s big?” And that is a constantly moving number, and always assessed by comparison to the ridiculous amounts some companies work with. But Big Data as a concept in IT parlance today tends to mean something fairly specific, not just about size but also about composition and the nature of the processing. So I considered a serious attempt at a fairly rigorous discussion about the nature of the workload, structure of the data and the kinds of analytics that comprise what people think of as Big Data… and then I thought of Steve Martin, who would have considered this carefully and then looked into the camera and said “Naaaahh.” So I determined to emulate him and have a bit of fun instead, by crowdsourcing some help completing the sentence “You know you have Big Data when…” Here’s what some Twitter folks said. Some are funny, some more serious…

Oracle Sets Sights on BI Leadership. Has it Picked the Right Target?

Oracle is not first in BI, and wants to change that – that was the clear message of a well-executed, multi-site “real plus virtual” event with top executives showing off the result of a multi-year effort to rationalize and integrate a set of leading but overlapping components into a seamless suite. Oracle Business Intelligence Enterprise Edition 11g (OBIEE) deserves the accolades it has already received from analysts who welcomed its announcement – it makes bold and serious bets on effective centralized metadata administration, data integration/unification and optimized analytic architecture, collaboration, globalization, mobile device support, and a powerful link to action that will be most effective (unsurprisingly) with its own business applications. While it misses some pieces – fully integrated in-memory processing, SaaS and cloud support among them – these will be forthcoming, and Oracle is clearly committed to a quicker release cycle now that the thorny internal politics around legacy products seem to be resolved. But its competitive focus may be misdirected; while SAP is still ahead in market share, IBM is the bigger threat in the marketplace.


EMC Buys Greenplum – Big Data Realignment Continues

EMC’s acquisition of Greenplum, announced today as a cash transaction, reaffirms the obvious: the Big Data tsunami upends conventional wisdom. It has already reshaped the market, spawning the most ferment in the RDBMS (and non-R DBMS via the noSQL players) space in years. When I first posted on Greenplum over a year ago, I said that

“Open source + capital has created an intriguing new model of rapid innovation in “mature” markets, and the database space – like BI – is not a done deal. It is indeed possible to escape the gravity well, if you execute. Greenplum is getting it done, and is among the new stars to watch.”

Why the open source reference? Greenplum uses a parallelization layer atop PostgreSQL (like Aster, another of the new breed of ADBMS.)

Now EMC has written the next chapter in that story. In the process, it adds a new piece (after literally dozens of others in the past few years) to its own portfolio, which already includes unstructured data (via Documentum) and virtualization (via VMware), layered in among the industry-leading storage and information management pieces. Disruptive? You bet. Is EMC finished? I doubt it. Candidates? BI tools, ETL, MDM, data integration come to mind. Losers? At least one big one. Read on.

Microsoft’s Parallel DW – Still Waiting

Microsoft’s SQL Server Parallel Data Warehouse (PDW) has been eagerly awaited for a long time. It still is. Though much of the news at the BI Conference running in parallel with TechEd in New Orleans (discussed here) was generally quite good, the PDW story was much less so. It’s late, and it’s not all there.
