Garbage In, Garbage Out: Why data quality matters

Garbage In, Garbage Out: Why data quality matters
FEI Financial Leadership Summit, Houston, TX
May 21, 2018

An Explosion of Data

A tsunami of data arrives every 60 seconds
With today’s explosion of data, how can companies effectively capture, analyze and ultimately use the vast amount of information that is now available?
[Infographic: activity on the Internet every 60 seconds, covering WhatsApp, Facebook, Twitter, Skype, Yelp, LinkedIn, email, Instagram, Eventbrite, Netflix, Tumblr, YouTube, Dropbox, Pandora, WeChat, Foursquare, Messenger, Vimeo, Snapchat, Google searches, new domains and downloaded apps. Source: GO-Globe]
The answer:
- Digital tools
- Automation and robotics
- Tech-savvy people
- Data analytics
- Cognitive computing
© 2018 KPMG LLP, a Delaware limited liability partnership and the U.S. member firm of the KPMG network of independent member firms affiliated with KPMG International Cooperative (“KPMG International”), a Swiss entity. All rights reserved.

Data Analytics and Data Quality

What is data?
All firms have raw data; however, companies that process raw data into knowledge create a valuable organizational asset.
[Figure labels: Raw Data, Information, Knowledge, Analytics, Value]

The business analytics spectrum from a process owner perspective
Levels, from highest to lowest degree of intelligence:
- Optimization: What’s the best way for it to happen?
- Predictive Modeling: What will happen next?
- Forecasting / Extrapolation: What if the trend continues?
- Statistical Analysis: Why is this happening?
- Alerts: What actions are needed?
- Query / Drill down: Where exactly is the problem?
- Ad hoc Reports: How many, how often, where?
- Standard Reports: What happened?
[Figure: levels plotted by Competitive Advantage against Degree of Intelligence; the upper levels are labeled Offensive Play and the lower levels Defensive Play]
Source: Adapted from Davenport, “Competing on Analytics”

Extracting Value From Data & Analytics
- PRESCRIPTIVE: Development of models incorporating predictive analytics to optimize decisions
- PREDICTIVE: Use of historical data and drivers analysis with advanced analytical modeling to understand potential ranges of outcomes
- DIAGNOSTIC: Retrospective analysis to understand drivers of the outcome that occurred
- DESCRIPTIVE: Understanding current conditions
Enablers shown alongside: Culture Change, Analytics Capabilities, Data Integration, Data Transformation
† Hostmann, Bill. Best Practices in Analytics: Integrating Analytical Capabilities and Process Flows. Gartner, 2012
* KPMG point of view

Structured data
United States: HOUSEHOLDS BY TYPE

Subject                                               Estimate     Margin of Error  Percent
Total households                                      115,969,540  +/-150,555       115,969,540
Family households (families)                          76,509,262   +/-122,329       66.0%
  With own children under 18 years                    33,612,973   +/-72,569        29.0%
  Married-couple family                               55,754,450   +/-142,830       48.1%
    With own children under 18 years                  22,423,949   +/-76,687        19.3%
  Male householder, no wife present, family           5,578,212    +/-43,046        4.8%
    With own children under 18 years                  2,697,636    +/-28,396        2.3%
  Female householder, no husband present, family      15,176,600   +/-53,498        13.1%
    With own children under 18 years                  8,491,388    +/-40,891        7.3%
Nonfamily households                                  39,460,278   +/-82,851        34.0%
  Householder living alone                            32,256,217   +/-90,665        27.8%
    65 years and over                                 11,513,067   +/-45,037        9.9%
Households with one or more people under 18 years     37,555,689   +/-70,588        32.4%
Households with one or more people 65 years and over  30,193,187   +/-54,865        26.0%

Percent margin of error, as far as recoverable: (X) for total households, +/-0.1 elsewhere.
Average household size: 2.64; average family size: 3.25 (margin of error +/-0.01; percent (X)).

Semi-structured data

Unstructured data
- Over 80% of all data is unstructured
- Unstructured data grows 100x every 10 years

Importance of data quality
[Figure: data quality scale with levels Very poor, Poor, Average, Good, Excellent]

Garbage in -> Garbage out
[Figure: IN = OUT]

Dimensions of data quality
- Accuracy
- Completeness
- Relevance
- Confidentiality
- Integrity
- Availability

Evaluation criteria for data quality
Field-specific rules
- Null and duplicate values are present when not expected
- Values are not of an expected type (text, numeric, date/time, etc.) and expected format
  - All dates are MM/DD/YYYY
  - All Purchase Orders follow standard naming convention: POA#####
- Fields do not contain expected values (e.g., only valid U.S. zip codes)
- Numeric fields contain gaps in sequence
- Numeric and date fields fall outside defined ranges
Referential integrity (parent/child relationships)
- Transactions with vendors not in vendor master file
- Assets categorized to an undefined depreciation method
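
Several of the field-specific and referential-integrity rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names (po, date, vendor, amount), the sample values and the helper names are hypothetical and do not come from the presentation.

```python
import re
from datetime import datetime

# Hypothetical vendor master file and transactions (illustrative only).
VENDOR_MASTER = {"V001", "V002"}

TRANSACTIONS = [
    {"po": "POA00001", "date": "05/21/2018", "vendor": "V001", "amount": 120.0},
    {"po": "POA00002", "date": "2018-05-22", "vendor": "V002", "amount": 75.5},
    {"po": "PO-00004", "date": "05/23/2018", "vendor": "V009", "amount": None},
    {"po": "POA00001", "date": "13/40/2018", "vendor": "V001", "amount": 10.0},
]

# Naming convention from the slide: POA followed by five digits.
PO_PATTERN = re.compile(r"POA\d{5}")


def is_valid_date(text):
    """Return True when text is a real calendar date in MM/DD/YYYY form."""
    # strptime alone would accept "5/1/2018", so require the exact shape first.
    if not re.fullmatch(r"\d{2}/\d{2}/\d{4}", text or ""):
        return False
    try:
        datetime.strptime(text, "%m/%d/%Y")
    except ValueError:
        return False
    return True


def check_records(records, vendor_master):
    """Return (row_index, issue) tuples for every rule violation found."""
    issues = []
    seen_pos = set()
    for i, rec in enumerate(records):
        if any(value is None for value in rec.values()):
            issues.append((i, "null value"))
        if not PO_PATTERN.fullmatch(rec["po"] or ""):
            issues.append((i, "bad PO format"))
        elif rec["po"] in seen_pos:
            issues.append((i, "duplicate PO"))
        seen_pos.add(rec["po"])
        if not is_valid_date(rec["date"]):
            issues.append((i, "bad date"))
        # Referential integrity: the child record must point at a known parent.
        if rec["vendor"] not in vendor_master:
            issues.append((i, "vendor not in master file"))
    return issues


for row, issue in check_records(TRANSACTIONS, VENDOR_MASTER):
    print(f"row {row}: {issue}")
```

Returning a list of issues instead of stopping at the first failure keeps every exception visible for review, which is usually what a data quality assessment needs.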

Evaluation criteria for data quality (continued)
Expected correlations
Business rules to ensure alignment across date fields/files
- Address alignment across street, city, state, and zip
- Credit card number aligns with issuing credit card company
- SKU number of product aligns with store type
  - Clothing sales in automotive stores
- Expected Credit/Debit account pairings
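
As one illustration of an expected-correlation rule, the sketch below checks that a card number's leading digits agree with the recorded card company. The prefix rules are deliberately simplified (Visa starts with 4, American Express with 34 or 37, Mastercard with 51-55 or 2221-2720), and the payment records, field names and function names are assumptions made for this example.

```python
def issuer_from_number(card_number):
    """Guess the card network from leading digits (simplified prefix rules)."""
    digits = card_number.replace(" ", "").replace("-", "")
    if not digits.isdigit():
        return None
    if digits.startswith("4"):
        return "Visa"
    if digits[:2] in {"34", "37"}:
        return "American Express"
    if 51 <= int(digits[:2]) <= 55 or 2221 <= int(digits[:4]) <= 2720:
        return "Mastercard"
    return None  # Other networks are out of scope for this sketch.


def correlation_exceptions(payments):
    """Return ids of payments whose recorded issuer disagrees with the number."""
    return [
        p["id"]
        for p in payments
        if issuer_from_number(p["card_number"]) != p["issuer"]
    ]


# Hypothetical records using widely published test card numbers.
PAYMENTS = [
    {"id": 1, "card_number": "4111 1111 1111 1111", "issuer": "Visa"},
    {"id": 2, "card_number": "5555 5555 5555 4444", "issuer": "Visa"},
    {"id": 3, "card_number": "3782 822463 10005", "issuer": "American Express"},
]

print(correlation_exceptions(PAYMENTS))
```

The same pattern (derive the expected value from one field, compare it with another) applies to the other pairings on the slide, such as SKU versus store type.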

Key risk points in data collection and analysis
Data collection
- Uniqueness
- Accuracy
- Validity
Data analysis
- Completeness, uniqueness, accuracy, and validity can be lost in manipulation, summarization, and other analytics performed
- Ensure that totals are continually reconciled between iterations of the manipulated dataset
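
Reconciling totals between iterations of a manipulated dataset can be as simple as comparing a control total before and after each step. The sketch below is a minimal illustration; the row layout, the summarization step and the function names are assumptions, and Decimal is used so that currency amounts compare exactly.

```python
from decimal import Decimal


def control_total(rows, field="amount"):
    """Sum one numeric field exactly, using Decimal to avoid float drift."""
    return sum((Decimal(str(r[field])) for r in rows), Decimal("0"))


def reconcile(before, after, field="amount"):
    """Compare record count and control total across a transformation step."""
    return {
        "count_diff": len(after) - len(before),
        "total_diff": control_total(after, field) - control_total(before, field),
    }


def summarize_by_vendor(rows):
    """Hypothetical summarization step: one output row per vendor."""
    totals = {}
    for r in rows:
        totals[r["vendor"]] = totals.get(r["vendor"], Decimal("0")) + Decimal(str(r["amount"]))
    return [{"vendor": v, "amount": t} for v, t in totals.items()]


# Hypothetical extract (amounts kept as strings to preserve cents exactly).
raw = [
    {"vendor": "V001", "amount": "100.10"},
    {"vendor": "V001", "amount": "200.20"},
    {"vendor": "V002", "amount": "50.00"},
]

summary = summarize_by_vendor(raw)
print(reconcile(raw, summary))   # the count changes, the total should not
print(reconcile(raw, raw[:2]))   # a dropped row shows up as a total difference
```

A summarization legitimately changes the record count, so the control total is the figure to reconcile there; for steps that should preserve rows, both differences are expected to be zero.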

Addressing risks to quality – Data collection

Completeness             Accuracy                           Validity
Control Totals           Reconciliations to Source System   Data Format and Masking Checks
Record Counts            Duplicate Field Checks             Field-Value Correlation Checks
Sequential Value Checks  Duplicate Record Checks            Referential Integrity Checks
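
Three of the checks named above (record counts, sequential value checks and duplicate record checks) might look like the following sketch. The invoice records, the expected count and the function names are hypothetical and only serve to illustrate the idea.

```python
from collections import Counter


def record_count_matches(rows, expected_count):
    """Record count check against the count reported by the source system."""
    return len(rows) == expected_count


def sequence_gaps(numbers):
    """Return the values missing between the smallest and largest number."""
    present = set(numbers)
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]


def duplicate_records(rows):
    """Return rows that appear more than once, comparing all fields."""
    counts = Counter(tuple(sorted(r.items())) for r in rows)
    return [dict(key) for key, n in counts.items() if n > 1]


# Hypothetical extract: invoice 1003 is missing and invoice 1004 is repeated.
INVOICES = [
    {"invoice": 1001, "amount": 10},
    {"invoice": 1002, "amount": 20},
    {"invoice": 1004, "amount": 40},
    {"invoice": 1004, "amount": 40},
]

print(record_count_matches(INVOICES, expected_count=3))
print(sequence_gaps(r["invoice"] for r in INVOICES))
print(duplicate_records(INVOICES))
```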

A transforming business landscape

Changing the way business is done
The explosion of data in business has fostered unprecedented advances in digital processing power and the capacity to support decision making across multiple activities and operations.
- $226B: Expected global market for robotics by 2021 [1]
- +36%: Potential productivity improvement with adaptation of these technologies [2]
- $46B: Anticipated revenue through 2020 [3] with significant investment in cognitive
- +54%: Will achieve a CAGR over the next several years [3]
- +75%: Of workers by 2019, will have access to Intelligent Personal Assistants to augment their skills by 2020 [4]
Sources:
1. Research and Markets, March 2017
2. KPMG Technology Innovation Survey, November 2016
3. IDC, Worldwide Spending on Cognitive and Artificial Intelligence Systems Forecast to Reach $12.5 Billion This Year, According to New IDC Spending Guide, Press Release, April 3, 2017
4. IDC, IDC FutureScape: Worldwide Analytics, Cognitive/AI, and Big Data 2017 Predictions, Doc #US41866016, November 2016

CEO views on disruptive technologies
- 6 in 10: View technological disruption as an opportunity, not a threat
- 61%: Concerned about integrating cognitive processes and AI into their business
- 57%: Say their businesses don’t have innovative processes to respond to rapid disruption
Other insights
- 49%: Are concerned about the integrity of data
- 72%: Are disrupting their own sectors
- 81%: Placing a greater focus on trust, value and culture
- 92%: View the US as the top market for new growth
Source: KPMG U.S. CEO Outlook 2017 Survey, June 2017

IBM Watson use case: Commercial Mortgage Loan Audit (CMLA) Prototype

So what is cognitive technology?
Cognitive technology employs a range of capabilities that can:
- Perceive: Interpret sensory input beyond traditional data
- Reason: Hypothesize and weigh supporting evidence
- Learn: Improve confidence levels with experience
The analytical capabilities of cognitive technology are well-suited to the expanding data volumes and automated processes prevalent in today’s audit environment.

Enabling more effective decision making
Cognitive technology and data and analytics are highly complementary capabilities that support complex decision making.
Analytics (Logic): “Help me do THINGS right”
- Analyze, Measure, Calculate, Monitor, Compare, Optimize, Prevent, Prescribe
Cognitive (Rationale): “Help me do the RIGHT things”
- Learn, Infer, Hypothesize, Assess, Debate, Probability, Analogize, Options

Commercial mortgage loan audit prototype
- Objective is for IBM Watson to process documentation for each loan with relevant external information and KPMG IP
- Through training of IBM Watson, key elements impacting the loan risk rating are identified
- Utilizing proprietary loan risk assessment process, IBM Watson makes recommendation of the risk grade
- Loan grade accompanied by:
  - Confidence level assessment
  - Supporting information*
Our strategic vision
- Future applicability to virtually any risk assessment platform and in any industry, relying on unstructured data and human judgment
- IBM Watson training can be used in other projects and Natural Language Processing (NLP) training moved to other technologies
- Ability to enable straight-through processing, after sufficient training
* Including full history and documentation of how Watson’s analysis was conducted showing the evidence and logic used to arrive at specific conclusions.

CMLA – Journey to cognitive automation: Today
Manual processes are limited to a subset* of the entire loan population.
Starting from the credit file:
- Understand the facts
- Translate into a loan rating
- Interpret a client-specific loan grading scale
- Prepare summary of findings
* Approximately 40-60 credit files

CMLA – Journey to cognitive automation: Future
We will be able to analyze larger, more complete data sets from selected loan portfolios up to, and including, the entire bank loan portfolio.
- Extract facts from credit file and other sources
- Understand facts
- Assign weights to facts
- Translate into a loan rating (KPMG and client scales aligned)
- Auditor reviews potential exceptions
[Figure: example facts (Loan Amount: $10 M; Purpose: refinance; Collateral: A properties; Appraised value: $100 M; third-party information); weighting scale (100, 90, 80, 70) applied to Payment History: Weak, PSOR: Strong, Collateral: Strong, Guarantor: Weak; rating scale AAA, AA, A; comparison of KPMG rating and client rating by loan (Loan 1: B, B; Loan 2: C, B; Loan 3: AAA) with supporting evidence]

Learnings to date
- Data: Procurement, curation and maintenance of data is a significant organizational and operational commitment
- Alignment: Alignment of technology to the business need or challenge is paramount to success
- Investment: Cognitive applications typically have longer development cycles and higher resource requirements
- Data visualization: Charts and graphs used for reporting continue to be a challenge for some digital tools to process
- Need to start now: The automation “train” has left the station and only those companies ready to invest in advanced digital capabilities will remain competitive in the future.
Outcomes
- Drive quality
- Enhance the user experience
- Unleash deeper insights

Digital transformation key considerations
- Don’t wait. Start your digital journey now!
- Identify your business challenge and align it with the right data and technology solution.
- Assess your data infrastructure and the data available to drive your digital strategy.
- Assess the ability of your IT infrastructure to support this advanced technology.
- Assess and enable your culture to accept and drive your digital evolution.
- Create an appropriate governance structure to maintain effective innovation.

Questions

Thank you

kpmg.com/socialmedia
The information contained herein is of a general nature and is not intended to address the circumstances of any particular individual or entity. Although we endeavor to provide accurate and timely information, there can be no guarantee that such information is accurate as of the date it is received or that it will continue to be accurate in the future. No one should act on such information without appropriate professional advice after a thorough examination of the particular situation.
The KPMG name and logo are registered trademarks or trademarks of KPMG International.