Has HDFS joined MapReduce in the emerging “legacy Hadoop project” category, continuing the swap-out of components that formerly answered the question “what is Hadoop?” Stores for data were certainly a focus at Strata/Hadoop World in NY, O’Reilly’s well-run, well-attended, and always impactful fall event. The limitations of HDFS, including its append-only nature, have become inconvenient enough to … Continue reading “Strata Standards Stories: Different Stores For Different Chores”
Tag Archives: Hive
Hadoop Projects Supported By Only One Distribution
The Apache Software Foundation has succeeded admirably in becoming a place where new software ideas are developed: today over 350 projects are underway. The challenges for the Hadoop user are twofold: trying to decide which projects might be useful in big data-related cases, and determining which are supported by commercial distributors. In “Now, What is Hadoop? And What’s Supported?” I list 10 supported … Continue reading “Hadoop Projects Supported By Only One Distribution”
Strata Spark Tsunami – Hadoop World, Part One
New York’s Javits Center is a cavernous triumph of form over function. Giant empty spaces were everywhere at this year’s empty-though-sold-out Strata/Hadoop World, but the strangely numbered, hard-to-find, typically inadequately sized rooms were packed. Some redesign will be needed next year, because the event was huge in impact and demand will only grow. A few of … Continue reading “Strata Spark Tsunami – Hadoop World, Part One”
Hadoop Summit Recap Part Two – SELECT FROM hdfs WHERE bigdatavendor USING SQL
Probably the most widespread, and commercially imminent, theme at the Summit was “SQL on Hadoop.” Since last year, many offerings have been touted, debated, and some have even shipped. In this post, I offer a brief look at where things stood at the Summit and how we got there. To net it out: offerings today … Continue reading “Hadoop Summit Recap Part Two – SELECT FROM hdfs WHERE bigdatavendor USING SQL”
Apache Hadoop 1.0 Doesn’t Clear Up Trunks and Branches Questions. Do Distributions?
In early January 2012, the world of big data was treated to an interesting series of product releases, press announcements, and blog posts about Hadoop versions. To begin with, we had the announcement of Apache version 1.0 at long last, in a press release. Although there were grumblings here and there in the twittersphere that … Continue reading “Apache Hadoop 1.0 Doesn’t Clear Up Trunks and Branches Questions. Do Distributions?”
Hadoop Distributions And Kids’ Soccer
The big players are moving in for a piece of the big data action. IBM, EMC, and NetApp have stepped up their messaging, in part to prevent startup upstarts like Cloudera from cornering the Apache Hadoop distribution market. They are all elbowing one another to get closest to “pure Apache” while still “adding value.” Numerous … Continue reading “Hadoop Distributions And Kids’ Soccer”
At Oracle, Closed May be the New Open. Whither MySQL?
I hope I can be forgiven the cute headline. It speaks to a series of events that were heard in Oracle OpenWorld messaging, where the word “open” appeared much less frequently than in years past. Oracle is fortifying its borders, opening new fronts in its market battles, and slowly closing itself off from some … Continue reading “At Oracle, Closed May be the New Open. Whither MySQL?”