One of the more philosophical questions analysts like to ask is “What is Big Data?” It’s relative – it raises the question, “what’s big?” And that is a constantly moving number, always assessed by comparison to the ridiculous amounts some companies work with. But Big Data as a concept in IT parlance today tends to mean something fairly specific, not just about size but also about composition and the nature of the processing. So I considered a serious attempt at a fairly rigorous discussion about the nature of the workload, the structure of the data and the kinds of analytics that comprise what people think of as Big Data… and then I thought of Steve Martin, who would have considered this carefully and then looked into the camera and said “Naaaahh.” So I determined to emulate him and have a bit of fun instead, by crowdsourcing some help completing the sentence “You know you have Big Data when…” Here’s what some Twitter folks said. Some are funny, some more serious …
You know you have Big Data when….
… you get a call from the utility company asking you not to run ‘that brownout query’ again. (@aristippus303 at Datawatch)
… your IT spends more time purchasing storage capacity than making sure the business has the data they need (@judyiko at Informatica)
… EMC names a new product after you (@aristippus303 at Datawatch)
… it piles up so high that it disappears into the clouds (@evertlammerts – I assume pun was intended?)
… the SAN undergoes gravitational collapse and you get cited by OSHA for an unlicensed singularity. (@datamartist)
… a query is long enough to require a couple of DBA generations to see it returning first data. (@Stray_Cat)
… your datacenter manager divides time between installing a new NAS in the kitchen and googling for vacant aircraft hangars. (@alanjharrison)
And a few of mine:
… you conduct an audit, including external files, and add more into the databases than you take out.
… you think Flomax is a new ETL product.
… the first item on your bucket list is “finish data model.”
… you’ve never gotten to the “Reduce” part.
… your Dad won’t let you have the keys to the table you want to join to because he’s still doing the schema update he started on your birthday. No, your BIRTH day.
OK – that’s way more than enough. Don’t you have a schema to update? Get back to work. If you get bored, send me some more.
4 thoughts on “You Know You Have Big Data When…(Humor)”
– You find that your ERP product generates 200 GB of LOG files… PER DAY!!! And your users do not even create 1% of that in REAL data….
– Your ERP Vendor cannot tell you what part of these LOG files is “critical” and what is “nice to have, just in case…”
– A simple audit finds as many copies of this year’s 9 MB “Strategy Presentation.PPT” as there are employees.
– People are allowed to put daily “versions” of their personal 2GB PST files on the file-server.
– When the daily extract of the Production Mainframe Database creates Distributed Databases that contain (when combined) 200 times the amount of data as the original production database.
All are real cases, not jokes……
Hardly humorous, I’m afraid, Marcel. But serious business indeed – and good pointers to some tips to keep volumes down. Do you care to share the name of the ERP product in question?
Hi Merv. You know you have Big Data when …
1. Your storage is inadequate before you’ve installed it.
2. The court requests all video from all security cameras in your main building for the last 3 years; and you can comply.
(this is pretty close to real-world stuff from my user interviews)
Thanks for these, Wayne. Not humorous perhaps, but very true.