Aspirational Marketing and Enterprise Data Hubs

In the Hadoop community there has been a great deal of talk lately about Hadoop's positioning as an Enterprise Data Hub. I call this "aspirational marketing": it describes the ambition its advocates have for how Hadoop will be used once it realizes the vision of capabilities currently in early development. There's nothing wrong with that, but it needs to be kept in perspective. It's a long way off.


2013 Data Resolution: Avoid Architectural Cul-de-Sacs

I had an inquiry today from a client using packaged software for a business system built on a proprietary, non-relational datastore (in this case, an object-oriented DBMS). They are on an older version of the product, having "failed" with a recent upgrade attempt.

The client contacted me to ask about ways to integrate this OODBMS-based system with others in their environment. They said the vendor-provided utilities were weak and hard to use, and the vendor has given them no confidence that they will improve. The few staff programmers who have learned enough internals have already built a number of one-off connections using multiple methods, and the client was looking for a more generalizable way to create a layer for other systems to use when they need data from the underlying database. They expect more such requests, and foresee chaos, challenges hiring and retaining people with the right skills, and cycles of increasing cost and operational complexity.
My reply: "You're absolutely right."

Amazon Redshift Disrupts DW Economics – But Nothing Comes Without Costs

At its first re:Invent conference in late November, Amazon announced Redshift, a new managed service for data warehousing. Amazon also offered details and customer examples that made AWS's steady inroads toward mainstream enterprise application acceptance very visible.

Redshift is made available via MPP nodes of 2TB (XL) or 16TB (8XL), running ParAccel's high-performance columnar, compressed DBMS, and scales to 100 8XL nodes, or 1.6PB of compressed data. XL nodes have 2 virtual cores and 15GB of memory; 8XL nodes have 16 virtual cores and 120GB of memory, and operate on 10 Gigabit Ethernet.
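The capacity ceiling follows directly from the node sizes. A minimal back-of-envelope sketch (node specs as described above; the function name and structure are illustrative, not an AWS tool):

```python
# Redshift node sizes as announced: storage is compressed capacity per node.
NODE_SPECS = {
    "XL":  {"storage_tb": 2,  "vcores": 2,  "memory_gb": 15},
    "8XL": {"storage_tb": 16, "vcores": 16, "memory_gb": 120},
}

def cluster_capacity_tb(node_type: str, node_count: int) -> int:
    """Total compressed storage (TB) for a cluster of identical nodes."""
    return NODE_SPECS[node_type]["storage_tb"] * node_count

# The stated maximum: 100 8XL nodes -> 1,600 TB, i.e. 1.6 PB compressed.
print(cluster_capacity_tb("8XL", 100))  # 1600
```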

Reserved pricing (the more likely scenario, involving a commitment of 1 year or 3 years) is set at "under $1,000 per TB per year" for a 3-year commitment, combining upfront and hourly charges. Continuous, automated backup for up to 100% of the provisioned storage is free. Amazon does not charge for data transfer into or out of the data clusters. Network connections, of course, are not free – see Doug Henschen's Information Week story for details.
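To see why that headline number is disruptive, it helps to run the arithmetic. A rough illustration using the "under $1,000 per TB per year" figure quoted above (an upper bound from the announcement, not a current AWS price; the function is a hypothetical sketch):

```python
# Headline rate from the Redshift announcement: an upper bound, in USD.
PRICE_PER_TB_PER_YEAR = 1000

def reserved_cost_usd(storage_tb: float, years: int = 3) -> float:
    """Approximate ceiling on reserved-pricing cost for provisioned storage."""
    return storage_tb * PRICE_PER_TB_PER_YEAR * years

# A maximum-size 1.6 PB (1,600 TB) cluster over the 3-year term
# comes in under roughly $4.8M -- a fraction of traditional DW economics.
print(reserved_cost_usd(1600))  # 4800000
```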

This is a dramatic pricing thrust, but it comes with trade-offs.


Guest Post: Leading the Logical Data Warehouse Charge Has its Challenges

From my colleague Mark Beyer, who speculates about how leadership in moving toward the logical data warehouse (LDW) will be received: 

The logical data warehouse is already creating a stir in the traditional data warehouse market space. Fewer than 5% of the clients with implemented warehouses that we speak with are pursuing three or more of the six aspects of a logical data warehouse:

  • repositories
  • data virtualization
  • distributed processes
  • active auditing and optimization
  • service level negotiation
  • ontological and taxonomic metadata

That means we are in a very early stage regarding the adoption trend, and vendors who are aggressively moving toward it are ahead of their customers.


Mark Beyer, Father of the Logical Data Warehouse, Guest Post

Another guest post, this time from my colleague and friend Mark Beyer.

My name is Mark Beyer, and I am the "father of the logical data warehouse". So, what does that mean? First: like any father, if you are not willing to address your ancestry with full candor, you will lose your place in the universe and wither away without making a meaningful contribution. As an implementer in the field, I was a student and practitioner of both Inmon and Kimball. I learned as much or more from my clients and my colleagues during multiple implementations as I did from studying any methodology. My Gartner colleagues challenged my concepts and helped hammer them into a comprehensive and complete whole. Simply put, I was willing to consider DNA contributions from anyone and anywhere but, through a form of unnatural selection, persisted in choosing the good genes and actively removing the undesirable elements.


IBM Fills Out Netezza Lineup With High Capacity Appliance

In the months since IBM closed its Netezza acquisition, the data warehouse appliance pioneer has been busy, if the announcements at this week's Enzee conference are any indication. An enthusiastic crowd – 1,000 strong – heard CEO Jim Baum deliver the news: new hardware, software, and partnerships. The biggest news was The Appliance Formerly Known As Cruiser, now known as the Netezza High Capacity Appliance (HCA). (A wag made up some t-shirts bearing the acronym TAFKAC and did quite well.) IBM is aiming to push the size perception for Netezza higher. How high? Half a PB in a rack, scalable to 10PB.


IBM Acquires Netezza – ADBMS Consolidation Heats Up

IBM's bid to acquire Netezza makes it official; the insurgents are at the gates. A pioneering and leading ADBMS player, Netezza is in play for approximately $1.7 billion, or 6 times revenues [edited 9/30; previously said "earnings," which is incorrect]. When it entered the market in 2001, it catalyzed an economic and architectural shift with an appliance form factor at a dramatically different price point. Titans like Teradata and Oracle (and yes, IBM) found themselves outmaneuvered as Netezza built a steadily improving business, adding dozens of new names every quarter and continuing to validate its market positioning as a dedicated analytic appliance. It's no longer alone there; some analytic appliance play is now in the portfolio of most sizable vendors serious about the market.

Kalido “Cascades” Continue Cadence on Designed DW Development

Kalido's ongoing evangelization of automation for governed, designed data warehouses has delivered fine results for the small, Massachusetts-based firm. In a recent conversation, the team shared its latest results: a profitable fiscal year, with Q4 up 35% and momentum that carried into the traditionally slow Q1 with 25% year-over-year growth. Since I last discussed Kalido at the time of its virtual conference a year ago, new-name sales in the US and Europe, as well as add-on business in existing accounts, have been a healthy sign. New partnerships, new data source support, and a new release are all likely to sustain and even increase the momentum in the autumn and winter selling seasons.

More TDWI Notes – ParAccel Rolling On, HP Stalled, Vertica Leading Insurgents

On my second day at TDWI, I was in meetings all day – events like this are a great opportunity for analysts to catch up with many of the companies they follow at one time, and this particular one was packed with sponsors. Congrats to the folks who sell sponsorships – they had a packed exhibit hall and a lot of very interested attendees. I got a chance to chat at a few booths (all buzzing), ask a few attendees some real-world questions (and was asked some surprising ones myself), and get a sense of the workload in the trenches (heavy and growing).


Informatica Re-Factors the Value Chain for the Cloud

Informatica's cloud ambitions continue to deepen with each new release. Since its 2006 launch, Informatica Cloud, the strategic initiative that brings Informatica's data integration assets to the cloud, has won salesforce.com's Best of AppExchange award for 2008 and 2009, added other cloud-based applications as targets, and, most significantly, signed up 650 clients. Customers like Qualcomm and Toshiba are syncing their SaaS apps with on-premise data, enhancing compliance, and extending their BI capabilities. In a recent conversation, Darren Cunningham, Vice President of Cloud Marketing, told me that Informatica is processing over 30,000 jobs per day, involving over 6.5B rows of data per month.
