Big Data: A definition • Big data is the realization of greater intelligence by storing, processing, and analyzing data that was previously ignored due to the limitations of traditional data management technologies • Big data is the merging of several data sets whose complexity becomes greater than the sum of the individual data sets.
Big Data means that technology makes it possible to analyze ALL available data on any phenomenon • Cost-effectively manage and analyze all available data in its native form: unstructured, streaming • Data sources: Social Media, Website, Billing, ERP, CRM, RFID, Network Switches
Lots of data • 2.5 quintillion bytes of data are generated every day! • A quintillion is 10^18 • Data come from many quarters: • Social media sites • Sensors • Digital photos • Business transactions • Location-based data Source: IBM http://www-01.ibm.com/software/data/bigdata/
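To put the daily figure in more familiar units, here is a minimal sketch of the arithmetic. It assumes decimal (SI) units, where one exabyte is 10^18 bytes; the constant and function names are illustrative and not from the slides.

```python
# Scale check for the "2.5 quintillion bytes per day" figure.
# Assumption: decimal (SI) units, so 1 exabyte = 10**18 bytes.
QUINTILLION = 10**18
BYTES_PER_DAY = 25 * QUINTILLION // 10  # 2.5 quintillion bytes, kept as an exact integer
EXABYTE = 10**18
TERABYTE = 10**12
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_summary(bytes_per_day=BYTES_PER_DAY):
    """Convert a daily byte count into exabytes per day and terabytes per second."""
    return {
        "exabytes_per_day": bytes_per_day / EXABYTE,
        "terabytes_per_second": bytes_per_day / SECONDS_PER_DAY / TERABYTE,
    }


summary = daily_volume_summary()
print(f"{summary['exabytes_per_day']} EB per day")
print(f"{summary['terabytes_per_second']:.1f} TB per second on average")
```

In these units, 2.5 quintillion bytes per day is 2.5 exabytes per day, or roughly 29 terabytes every second on average.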
The four dimensions of Big Data • Volume: Large volumes of data • Velocity: Quickly moving data • Variety: structured, unstructured, images, etc. • Veracity: Trust and integrity are both a challenge and a must, as important for big data as for traditional relational DBs Source: IBM http://www-01.ibm.com/software/data/bigdata/
Traditional Approach vs. Big Data Approach: Reversing the usual paradigm • Traditional Approach: Structured & Repeatable Analysis • Determine what questions to ask • IT structures the data to answer that question • Narrow use • Garbage In, Garbage Out • Big Data Approach: Iterative & Exploratory Analysis • IT delivers a platform to enable creative discovery • Business explores what questions could be asked • New Discoveries
Current Situation • Analyst: I need to evaluate the possible relationship between variables X, Y, Z. • IT: OK. We have to evaluate a lot of statistics, set the correct database indexes and partitioning. It will take us 5 days. Go away.
After 5 days... • Analyst: Okay, I went away and now I am back. • IT: Done. You can run your analytical query.
After 1 day... • Analyst: Great. I can see here some nice correlations. Now I need to look at it from a different perspective. • IT: Ohhh, welcome dear friend. Understand. So, it's … another 5 days of our work. • Analyst: You guys suck. I am outta here!
And now with Some Magic Compute Box
• Analyst: I need to evaluate the possible relationship between X, Y, Z. I will use the Magic Box.
After 10 minutes... • Analyst: Great. I can see here some nice correlations. Now I need to look at it from a different perspective. With the Magic Box I can run the query immediately. Go away, IT. • IT can do something else, much more useful, if that is even possible …
Built-In Expertise Makes This as Simple as an Appliance • Dedicated device • Optimized for purpose • Complete solution • Fast installation • Very easy operation • Standard interfaces • Low cost
Real Magic Box results using T-Mobile Czech Rep. • Response times, Original Platform vs. Netezza: 2 hours vs. 1 minute; 33 minutes vs. 17 seconds; 10 hours vs. 23 seconds; 50 minutes vs. 38 seconds • Reports: Payment discipline of current month invoices; Overdue Debt of Invoices – in Current Month; Average Monthly Invoice Figures; Workflow Reporting; Invoicing and Payments reporting • RESPONSE TIME MASSIVELY IMPROVED
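The size of the improvement is easier to judge as a speedup factor. The sketch below assumes each pair of times on the slide is listed as original platform first, Netezza second; it does not tie any pair to a specific report, and the helper names are illustrative only.

```python
def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert a duration given in hours, minutes, and seconds to seconds."""
    return hours * 3600 + minutes * 60 + seconds


# (original platform, Netezza) response times in the order they appear on the slide.
# Assumption: the first time of each pair belongs to the original platform.
TIMINGS = [
    (to_seconds(hours=2), to_seconds(minutes=1)),
    (to_seconds(minutes=33), to_seconds(seconds=17)),
    (to_seconds(hours=10), to_seconds(seconds=23)),
    (to_seconds(minutes=50), to_seconds(seconds=38)),
]


def speedups(timings):
    """Return original_time / new_time for each pair of timings."""
    return [original / new for original, new in timings]


for (original, new), factor in zip(TIMINGS, speedups(TIMINGS)):
    print(f"{original:>6} s -> {new:>3} s  (about {factor:.0f}x faster)")
```

Under that assumption, the speedups range from roughly 79x to roughly 1565x.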
Big Data Conundrum • Problems: • Although there is a massive spike in available data, the percentage of the data that an enterprise can understand is on the decline • The data that the enterprise is trying to understand is saturated with both useful signals and lots of noise Source: IBM http://www-01.ibm.com/software/data/bigdata/