Webinar: Big Data Predictive Analytics 101. Mike Gualtieri, Principal Analyst. September 7, 2012. Please call in at 10:55 a.m. Eastern time. Twitter: @mgualtieri
Outlook
Business intelligence is the top business app © 2012 Forrester Research, Inc. Reproduction Prohibited
Real-time analytics is rising fast
Data continues to grow
The past three years have shown an increased adoption of predictive analytics
“Knowledge is power.” Francis Bacon (1561–1626). Francis Bacon described a rational procedure to establish causation between phenomena based on induction.
“Big data can lead to an explosion of knowledge...” - Big Data Predictive Analytics, Forrester Research
1B: Forrester estimates there will be more than 1 billion people using smartphones and tablets by 2016.
25B: Cisco IBSG predicts there will be 25 billion devices connected to the Internet by 2015.
7B: More people using more technology means more big data.
What exactly is big data?
Volume, velocity, and variety are measures of big data
Volume
• The amount of data in bytes (e.g., kilobyte, megabyte, gigabyte, terabyte)
Velocity
• The rate at which data is collected (e.g., bytes per second, transactions per second)
Variety
• The formats of data (e.g., structured, unstructured, binary)
It’s relative. It all comes down to how well you can handle it.
Big data activities are as important as the three V’s measures
Store
• Can you collect and store all your data?
Process
• Can you cleanse, enrich, calculate, translate, and run algorithms against the data?
Access
• Can you retrieve, search, and visualize all your data?
What’s your big data score?
                 Activities
Measures    Store  Process  Access  Points
Volume      5      5        3       13
Velocity    3      3        0       3
Variety     3      3        0       6
Your big data score: 22
Rating scale:
5 = Not required or handled perfectly
3 = Handled but could be improved
1 = Handled poorly, with frequent negative business impact
0 = Current or future needs exist but are not handled
Source: Mike Gualtieri, “What’s Your Big Data Score?” Mike Gualtieri’s Blog For Application Development & Delivery Professionals, May 17, 2012
What does your big data score mean?
Big data score  Meaning     Description
0–15            Poor        Your organization’s inability to handle your big data requirements threatens the business.
16–30           Struggling  Your organization is probably handling one measure of big data, such as volume, well but ignoring others.
31–40           Winning     Your organization is winning more than it is losing. Handling big data can be a battle with a continuous stream of new requirements.
41–45           Perfect     Don’t rest on your laurels. New big data requirements could overwhelm your architecture, putting you down in the dumps.
Source: Mike Gualtieri, “What’s Your Big Data Score?” Mike Gualtieri’s Blog For Application Development & Delivery Professionals, May 17, 2012
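The scoring above is simple arithmetic: rate each of the nine measure/activity cells, add the ratings, and look up the band. A minimal sketch follows. The function names and the example ratings are hypothetical and are not taken from the blog post; only the rating scale (5, 3, 1, 0) and the score bands come from the slides.

```python
# Ratings allowed by the slide's scale.
VALID_RATINGS = {0, 1, 3, 5}
MEASURES = ("volume", "velocity", "variety")
ACTIVITIES = ("store", "process", "access")

# Upper bound of each band from the "score meaning" slide.
BANDS = [(15, "Poor"), (30, "Struggling"), (40, "Winning"), (45, "Perfect")]


def big_data_score(ratings):
    """Sum the nine ratings (3 measures x 3 activities); the maximum is 45."""
    total = 0
    for measure in MEASURES:
        for activity in ACTIVITIES:
            rating = ratings[measure][activity]
            if rating not in VALID_RATINGS:
                raise ValueError(f"invalid rating {rating!r} for {measure}/{activity}")
            total += rating
    return total


def score_meaning(score):
    """Map a total score to the band label used on the slide."""
    for upper, label in BANDS:
        if score <= upper:
            return label
    raise ValueError("score must be between 0 and 45")


# Hypothetical self-assessment, not the example shown on the slide.
example = {
    "volume": {"store": 5, "process": 5, "access": 3},
    "velocity": {"store": 3, "process": 1, "access": 0},
    "variety": {"store": 3, "process": 3, "access": 1},
}
score = big_data_score(example)
print(score, score_meaning(score))
```

With these hypothetical ratings the rows sum to 13, 4, and 7, so the total lands in the "Struggling" band.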
Big data: Do you have the tools and technologies to handle big data?
It is not surprising that transactional data is most popular for big data. “What types of data/records are you planning to analyze using big data technologies?” (Multiple responses accepted) Base: 60 IT professionals; Source: June 2011 Global Big Data Online Survey
Why do Forrester clients consider or implement big data? “What are the main business requirements or inadequacies of earlier-generation BI/DW/ET technologies, applications, and architecture that are causing you to consider or implement big data?” Base: 60 IT professionals; Source: September 20, 2011, “How Forrester Clients Are Using Big Data” Forrester report
Most Forrester clients are using commercial technology for big data. “What technology do you use for big data applications?” (Multiple responses accepted) Base: 60 IT professionals; Source: September 20, 2011, “How Forrester Clients Are Using Big Data” Forrester report
Predictive analytics can find meaning in big data.
Just a few examples of how organizations use predictive analytics today
Keep customers
• Mobile carriers prevent customers from switching to another carrier.
Dazzle customers
• Netflix recommendation engine
• Google search engine
Save lives
• Reduce hospital readmissions
• Human genome markers
Prevent breakdowns
• Replace parts before they shut down the production line
Sell more
• Retail product placement/groupings
Find opportunities
• Energy exploration
Predictive analytics finds a model that acts like a formula to find the answer
Classification
• Predict an item class
• This customer is likely to churn.
Clustering
• Find groups
• Some people love dubstep music.
Association
• Items that occur together
• People buy diapers and beer on Thursdays.
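To make the association case concrete, here is a small sketch of how an "items that occur together" rule is commonly measured with support and confidence. The baskets are invented for illustration, and the helper names are hypothetical.

```python
# Hypothetical shopping baskets; each set is one transaction.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) from the transactions.

    Assumes the antecedent appears in at least one transaction.
    """
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


rule_support = support({"diapers", "beer"}, baskets)
rule_confidence = confidence({"diapers"}, {"beer"}, baskets)
print(f"diapers -> beer: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

In this toy data, two of five baskets contain both items, and two of the three diaper baskets also contain beer.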
Predictive models can be represented in many different ways depending on the technique used
• Formula
• Decision trees
• Code
• Combination of any of the above
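As an illustration of these representations, the sketch below expresses a churn model two ways: as a formula (a logistic function with fixed weights) and as a decision tree written directly as code. The variable names, weights, and thresholds are all hypothetical; a real model would learn them from data.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
INTERCEPT = -2.0
W_DROPPED_CALLS = 0.4
W_MONTHS_LEFT = -0.1


def churn_probability(dropped_calls, months_left_on_contract):
    """Formula form: logistic function of a weighted sum of the inputs."""
    z = (INTERCEPT
         + W_DROPPED_CALLS * dropped_calls
         + W_MONTHS_LEFT * months_left_on_contract)
    return 1.0 / (1.0 + math.exp(-z))


def churn_tree(dropped_calls, months_left_on_contract):
    """Decision-tree form: nested threshold tests ending in a class label."""
    if dropped_calls > 5:
        if months_left_on_contract < 3:
            return "churn"
        return "stay"
    return "stay"


print(round(churn_probability(10, 0), 3))
print(churn_tree(10, 0))
```

The formula returns a probability that can be thresholded or ranked, while the tree returns a label and is easier to read aloud as business rules.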
Big data comes in many varieties
Structured text
• Data described by a schema
• Relational database, XML, delimited flat file, system events
Unstructured text
• Free-form text
• Email, documents, tweets, blog comments, Facebook status, genome
Binary
• Audio, images, video
• Surveillance cameras, geological survey maps, Siri voice
Predictive analysis is powered by statistical and machine learning algorithms
K-means clustering; association rules; random forests; MARS (multivariate adaptive regression splines); boosting trees; CHAID; cluster analysis; feature selection; independent components analysis; Kohonen networks (SOFM); linear and logistic regression; naïve Bayesian classifiers; optimal binning; partial least squares; response optimization; root cause analysis; neural networks; social network analysis (SNA); support vector machines; natural language processing
This is just a sample. There are hundreds of algorithms, variations, and combinations.
The predictive analytics process must be continuous to ensure effectiveness
Cycle around the business goal: Understand data • Prepare data • Model • Evaluate • Deploy • Monitor
The right data and right talent are the key to predictive analytics success. Source: August 8, 2012, “The State Of Customer Analytics 2012” Forrester report
Real-time predictive is the next big differentiation opportunity in customer engagement
Predictive analytics has its limits.
Predictive analytics has limits: There is a lot of stock price data, but causative data is elusive. Note: Red line is AAPL (Apple) stock price; blue line is RIMM (Research In Motion) stock price.
Predictive analytics has limits: Can a butterfly flapping its wings in Asia drastically alter the weather in the Gulf of Mexico?
Predictive analytics has limits: There have only been 56 presidential elections and 44 presidents.
What do great predictive analytics use cases have in common?
• Evidence-based methods don’t exist or are sub-optimal.
• Relevant data is available.
• The environment changes with moderate frequency.
• The business outcome is significant.
Predictive analytics is hard to do
Causative data
• The right data to establish a cause and effect
• Enough data to be significant
Data analysts
• Understand business outcome
• Create hypothesis about data mining algorithms that will create predictive rules
Modeling tools
• Data preparation
• Discovery (visualization, machine learning algorithms)
• Evaluation and optimization
Model deployment
• Data to feed model
• Model execution (embedded, callable service)
Big data reinvigorates the use of predictive analytics to achieve business outcomes
• Big data means more potentially causative variables.
• Big data means more experience data for training algorithms.
Tools: Big data predictive analytics solutions range from coding tools to specific business solutions.
General-purpose enterprise big data predictive analytics tools must have an extensive feature set: Architecture Data Discovery Evaluation and optimization Deployment User tools Integration, solutions, standards, and extensibility Innovation
R is an open source programming language that can be used for predictive analytics.
• Data import and preprocessing
• User-defined functions
• Internet API interface
• XML parsing
• Iterative data processing
Example shown: Grant awards to homeless veterans, FY09. Data: Data.gov. Analysis: Drew Conway.
Churn: Can you prevent Melissa from switching to a competing mobile plan?
Churn: Prepare data from different sources.
Churn: Find the predictive variables.
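One simple way to look for predictive variables is to rank each candidate by how strongly it correlates with the churn outcome. The sketch below does this with a hand-written Pearson correlation. The customer records and column names are hypothetical, and correlation is only a screening step: it does not establish that a variable is causative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


# Hypothetical customer records; churned is 1 if the customer left.
customers = [
    {"dropped_calls": 9, "monthly_bill": 60, "tenure_months": 3, "churned": 1},
    {"dropped_calls": 7, "monthly_bill": 45, "tenure_months": 5, "churned": 1},
    {"dropped_calls": 1, "monthly_bill": 50, "tenure_months": 30, "churned": 0},
    {"dropped_calls": 2, "monthly_bill": 65, "tenure_months": 24, "churned": 0},
    {"dropped_calls": 8, "monthly_bill": 55, "tenure_months": 8, "churned": 1},
    {"dropped_calls": 0, "monthly_bill": 40, "tenure_months": 36, "churned": 0},
]


def rank_variables(rows, target):
    """Sort candidate variables by absolute correlation with the target."""
    ys = [row[target] for row in rows]
    names = [name for name in rows[0] if name != target]
    scores = {name: pearson([row[name] for row in rows], ys) for name in names}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


for name, r in rank_variables(customers, "churned"):
    print(f"{name}: r={r:+.2f}")
```

In this toy data, dropped calls and tenure separate churners from non-churners far better than the monthly bill does.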
Churn: Find a predictive model.
Churn: Evaluate the effectiveness of the model.
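Evaluation usually means comparing the model's predictions with known outcomes on customers it was not trained on. A minimal sketch, assuming binary labels where 1 means the customer churned; the label lists are invented for illustration.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = churn)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


def evaluate(actual, predicted):
    """Return accuracy, precision, and recall for the churn class."""
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical held-out customers: what happened vs. what the model said.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = evaluate(actual, predicted)
print(metrics)
```

Precision answers "of the customers flagged as churners, how many really left?" and recall answers "of the customers who left, how many were flagged?". Which one matters more depends on the cost of a retention offer versus the cost of a lost customer.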
Million Song Dataset: How can you provide Melissa with nearly perfect song recommendations?
MSD: Prepare data from 48 million songs listened to by 1.2 million users.
MSD: Create a bipartite graph to find out what bands users like.
MSD: These are all the users who listen to Aerosmith.
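A bipartite graph here has users on one side and artists on the other, with an edge wherever a user listened to an artist. The sketch below builds that graph as two adjacency maps and then looks up the users connected to Aerosmith. The listening records and user IDs are hypothetical and do not follow the actual Million Song Dataset file format.

```python
from collections import defaultdict

# Hypothetical (user, artist, play_count) listening records.
plays = [
    ("u1", "Aerosmith", 12),
    ("u1", "Skrillex", 3),
    ("u2", "Aerosmith", 5),
    ("u2", "Led Zeppelin", 9),
    ("u3", "Skrillex", 20),
    ("u4", "Aerosmith", 1),
    ("u4", "Led Zeppelin", 4),
]


def build_bipartite(records):
    """Return two adjacency maps: user -> artists and artist -> users."""
    artists_by_user = defaultdict(set)
    users_by_artist = defaultdict(set)
    for user, artist, _count in records:
        artists_by_user[user].add(artist)
        users_by_artist[artist].add(user)
    return artists_by_user, users_by_artist


artists_by_user, users_by_artist = build_bipartite(plays)

# All users who listen to Aerosmith.
aerosmith_listeners = sorted(users_by_artist["Aerosmith"])
print(aerosmith_listeners)
```

Keeping both directions of the graph makes either lookup a single dictionary access: what a user likes, or who likes an artist.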
MSD: Find communities of listeners.
MSD: View a sample list of recommendations for a specific user sorted by support and confidence.
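A recommendation list like this can be sketched from co-listening counts. For a rule "listens to A, so may like B", support is the share of all users who listen to both, and confidence is the share of A's listeners who also listen to B. The history below, the function name, and the exact sort order (support first, then confidence) are assumptions made for illustration; the webinar does not specify how its list was computed.

```python
# Hypothetical listening history: user -> set of artists.
history = {
    "melissa": {"Aerosmith"},
    "u1": {"Aerosmith", "Led Zeppelin"},
    "u2": {"Aerosmith", "Led Zeppelin", "Skrillex"},
    "u3": {"Aerosmith", "Skrillex"},
    "u4": {"Skrillex"},
    "u5": {"Aerosmith", "Led Zeppelin"},
}


def recommend(user, history):
    """Return (candidate, source, support, confidence) rules for one user.

    Each rule reads "listeners of source also listen to candidate". Rules are
    sorted by support, then confidence, both descending.
    """
    liked = history[user]
    n_users = len(history)
    candidates = set().union(*history.values()) - liked
    rules = []
    for source in liked:
        source_listeners = [a for a in history.values() if source in a]
        for candidate in candidates:
            both = sum(1 for a in source_listeners if candidate in a)
            if both == 0:
                continue  # no evidence linking source and candidate
            rules.append(
                (candidate, source, both / n_users, both / len(source_listeners))
            )
    return sorted(rules, key=lambda r: (r[2], r[3]), reverse=True)


for candidate, source, sup, conf in recommend("melissa", history):
    print(f"{source} -> {candidate}: support={sup:.2f}, confidence={conf:.2f}")
```

In this toy data, three of the five Aerosmith listeners also listen to Led Zeppelin, so it ranks ahead of Skrillex for Melissa.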
Big data predictive analytics will make you smarter.
“Big data can lead to an explosion of knowledge that firms can use to make smarter decisions and create differentiated customer experiences.” - Big Data Predictive Analytics, Forrester Research
“Knowledge is power [struck through] profit.” Francis Bacon (1561–1626)
Forrester Wave™: Big Data Predictive Analytics Solutions (planned publication Q4 2012). Forrester methodology limited this Forrester Wave to ten vendors. Many vendor solutions and combinations of solutions exist for a variety of use cases. Schedule an inquiry to discuss your unique circumstances.
Thank you. Mike Gualtieri, mgualtieri@forrester.com, Twitter: @mgualtieri
Big data predictive analytics solutions range from coding tools to specific business solutions
Alpine Data Labs, Zementis, Alteryx, Google Prediction API, Pentaho, Angoss, R, MATLAB, EMC, KNIME, Rapid-I, Opera Solutions, Revolution Analytics, Teradata, SAS, IBM, FICO, Pegasystems, Cetus, Oracle, KXEN, Microsoft, Salford, Pitney Bowes, StatSoft, Fuzzy Logix, SAP, Weka, TIBCO, Mahout