Data Quality: Opportunities, Data, and Examples
- Slides: 26
Data Woes
We are agents of CHANGE!
The Kübler-Ross grief cycle: "…roller-coaster ride of activity and passivity as the person wriggles and turns in their desperate efforts to avoid the change."
Better and More Data
– Level of analysis
  • Take a quick look at what data to use and why
  • Linking data from disparate and third-party sources
– Explore data types
– Typical issues and tricks
  • Cross-validation and sourcing
  • Reverse look-up
  • GIS layering
  • Backfill from text correlated to codes
– Information from operations
  • Text analytics
General Organizational Overview
An information business focused on risk taking. Make. Sell. Serve.
• Sales and Distribution: Producer Segmentation, Market Planning, Revenue Forecasting, Cross-sell and Up-sell, Retention and Profitability
• Underwriting: Risk Selection and Pricing, Portfolio Management, Premium Adequacy
• Billing and Collections Management
• Claims: Payment Accuracy, Claim Collaboration
  – Fraud Detection
  – Subrogation
  – Risk Transfer
  – 3rd Party Deductible
  – Reinsurance Recoverable
Same Problems – Different Lines of Business
• Personal – Auto, HO, Umbrella
• Small Commercial – BOP, CPP
• Middle Market Commercial – CPP w/GL, CP, Crime, CIM, B&M, WC, Auto
• Large Commercial Accounts
• Commercial Auto
• Workers Comp
• Umbrella/Excess
• Specialty Lines – D&O, EPL, E&O, Farm, FI
Data Types and Forms
• Types: Structured data, Semi-structured data, Unstructured data
• Forms: Text, Spatial, Pictographic, Graphic, Voice, Video
Multiple Data Systems
Multiple data systems must be pulled together for analysis. This is a great opportunity for cross-validation and sourcing.
Systems shown:
• Vendors/Partners
• Archive, Legacy Systems
• Current System
• Claim
• Medical Data: Bill Review, PPO, Case Management, Paradigm Data
• External Data
• Policy
• Multiple Underwriting Systems
• Multiple States
• Billing Systems
• Finance Systems
• CRM Systems, other data
ACTIONS
• Identify data systems
• Get the right data from the right systems
• Overcome internal organizational barriers
• Bridge to legacy systems and archived data
• Augment to create a rich data mining environment
• Expect the need to negotiate for resources
Some Typical External Data Sources and Vendors
• Dun & Bradstreet
• Experian
• Bureau of Labor Statistics
• Market Stance
• AM Best
• Equifax
• US Census
• Claritas
• Melissa Data
• ISO
• GIS vendors
• U&C data sets
• Code sets for ICDs and CPTs
• …
Data Glitches – Historical and Ongoing
Systemic changes to data that are not process related:
– Changes in data layout / data types
– Changes in scale / format
– Temporary reversion to defaults
– Missing and default values
– Gaps in time series
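Two of the glitches listed above, gaps in time series and missing or default values, can be checked mechanically. The following is a minimal sketch under stated assumptions: the function names, the monthly granularity, and the sentinel values in `DEFAULT_VALUES` are all hypothetical and would need to be replaced with whatever the source systems actually write.

```python
# Hypothetical sentinel values that a system might write when a field is skipped.
DEFAULT_VALUES = {"", "UNKNOWN", "0000-00-00", "99999"}

def find_month_gaps(months):
    """Return the (year, month) pairs missing between the first and last observed month."""
    observed = sorted(set(months))
    missing = []
    if not observed:
        return missing
    seen = set(observed)
    year, month = observed[0]
    last = observed[-1]
    while (year, month) <= last:
        if (year, month) not in seen:
            missing.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return missing

def default_rate(values):
    """Share of values that are missing (None) or equal to a known default."""
    if not values:
        return 0.0
    flagged = sum(
        1 for v in values
        if v is None or str(v).strip().upper() in DEFAULT_VALUES
    )
    return flagged / len(values)

# Illustrative data only.
months = [(2007, 10), (2007, 11), (2008, 2), (2008, 3)]
gaps = find_month_gaps(months)
zip_codes = ["06103", "99999", None, "10001"]
rate = default_rate(zip_codes)
print(gaps)  # [(2007, 12), (2008, 1)]
print(rate)  # 0.5
```

A sudden jump in the default rate for one period is one way a temporary reversion to defaults might show up.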
Process Reasons for Poor Data Entry
Defining Issues – Sample Source Data
1 – Define Issues
MORE ISSUES…
Mapping across sources: Same Fact, Different Terms

Data Element Concept
  Name: Country Identifiers
  Context:
  Definition:
  Unique ID: 5769
  Conceptual Domain:
  Maintenance Org.:
  Steward:
  Classification:
  Registration Authority:
  Others
  Values: Algeria, Belgium, China, Denmark, Egypt, France, ..., Zimbabwe

Data Elements
  Name:
  Context:
  Definition:
  Unique ID: 4572
  Value Domain:
  Maintenance Org.:
  Steward:
  Classification:
  Registration Authority:
  Others

ISO 3166       ISO 3166       ISO 3166        ISO 3166        ISO 3166
English Name   French Name    2-Alpha Code    3-Alpha Code    3-Numeric Code
Algeria        L'Algérie      DZ              DZA             012
Belgium        Belgique       BE              BEL             056
China          Chine          CN              CHN             156
Denmark        Danemark       DK              DNK             208
Egypt          Egypte         EG              EGY             818
France         La France      FR              FRA             250
...            ...            ...             ...             ...
Zimbabwe       Zimbabwe       ZW              ZWE             716
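One way to reconcile "same fact, different terms" is to map every representation to a single canonical value before joining sources. The sketch below uses the ISO 3166 rows from the slide. The function names and the choice of the 2-alpha code as the canonical form are assumptions made for illustration.

```python
# Rows from the ISO 3166 sample on the slide:
# (English name, French name, alpha-2, alpha-3, numeric)
COUNTRY_ROWS = [
    ("Algeria", "L'Algérie", "DZ", "DZA", "012"),
    ("Belgium", "Belgique", "BE", "BEL", "056"),
    ("China", "Chine", "CN", "CHN", "156"),
    ("Denmark", "Danemark", "DK", "DNK", "208"),
    ("Egypt", "Egypte", "EG", "EGY", "818"),
    ("France", "La France", "FR", "FRA", "250"),
    ("Zimbabwe", "Zimbabwe", "ZW", "ZWE", "716"),
]

def build_lookup(rows):
    """Map every known representation (case-insensitive) to the alpha-2 code."""
    lookup = {}
    for english, french, alpha2, alpha3, numeric in rows:
        for term in (english, french, alpha2, alpha3, numeric):
            lookup[term.casefold()] = alpha2
    return lookup

LOOKUP = build_lookup(COUNTRY_ROWS)

def canonical_country(value):
    """Return the alpha-2 code for a source value, or None if it is not recognized."""
    return LOOKUP.get(str(value).strip().casefold())

# Two hypothetical sources that encode the same countries differently.
source_a = ["Belgium", "DNK", "818"]
source_b = ["Belgique", "dk", "Egypte", "Atlantis"]
mapped_a = [canonical_country(v) for v in source_a]
mapped_b = [canonical_country(v) for v in source_b]
print(mapped_a)  # ['BE', 'DK', 'EG']
print(mapped_b)  # ['BE', 'DK', 'EG', None]
```

Values that come back as `None` are candidates for manual review rather than silent dropping.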
Data Filling
• Manual
• Statistical Imputation
• Temporal
• Spatial-temporal
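Two of the filling approaches named above can be illustrated briefly. This is a sketch, assuming median substitution as the statistical imputation and last-observation-carried-forward as the temporal fill. The slides do not say which specific methods were used, and the field names are hypothetical.

```python
from statistics import median

def impute_median(values):
    """Statistical imputation: replace None with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)
    fill = median(observed)
    return [fill if v is None else v for v in values]

def fill_forward(series):
    """Temporal fill: carry the last observed value forward over gaps.

    Leading gaps stay None because there is nothing earlier to carry.
    """
    filled, last = [], None
    for v in series:
        if v is not None:
            last = v
        filled.append(last)
    return filled

# Illustrative data only.
weekly_wage = [520, None, 610, 480, None]
territory_by_month = [None, "CT-01", None, None, "CT-02", None]
imputed = impute_median(weekly_wage)
carried = fill_forward(territory_by_month)
print(imputed)  # [520, 520, 610, 480, 520]
print(carried)  # [None, 'CT-01', 'CT-01', 'CT-01', 'CT-02', 'CT-02']
```

It is usually worth keeping a flag alongside each filled value so that later analysis can tell observed data from filled data.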
Geographic Hierarchy
Deriving Data = Power
• Totals: Household income
• Trends: Rate of medical bill increases
• Ratios: Claims/Premium, Target/Median
• Friction: Level of inconvenience, ratio of rental to damage
• Sequences: Lawyer-Doctor, Auto-Life Policy
• Circumstances: Minimal impact, severe trauma
• Temporal: Loss shortly after adding collision
• Spatial: Distance to service, proximity of stakeholders
• Logged: Progress notes, diaries
  – Who did it, When, “Why”
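Two of the derived fields above, a claims-to-premium ratio and "loss shortly after adding collision", can be sketched as follows. The record layout, the function names, and the 30-day window are illustrative assumptions, not values from the slides.

```python
from datetime import date

def loss_ratio(claims_paid, premium):
    """Ratio: claims paid divided by premium; None when premium is zero or missing."""
    if not premium:
        return None
    return claims_paid / premium

def days_from_coverage_to_loss(coverage_added, loss_date):
    """Temporal: days between adding a coverage and the date of loss."""
    return (loss_date - coverage_added).days

def loss_soon_after_coverage(coverage_added, loss_date, window_days=30):
    """Flag losses that fall within window_days after coverage was added.

    The 30-day default is an assumption chosen for illustration.
    """
    return 0 <= days_from_coverage_to_loss(coverage_added, loss_date) <= window_days

# Hypothetical policy record.
policy = {
    "premium": 1200.0,
    "claims_paid": 900.0,
    "collision_added": date(2008, 3, 1),
    "loss_date": date(2008, 3, 12),
}
ratio = loss_ratio(policy["claims_paid"], policy["premium"])
gap_days = days_from_coverage_to_loss(policy["collision_added"], policy["loss_date"])
flag = loss_soon_after_coverage(policy["collision_added"], policy["loss_date"])
print(ratio, gap_days, flag)  # 0.75 11 True
```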
Deriving Data = Power (Cont’d)
• Behavioral: Deviation from past usage, spike buying
• Experience Profiles: Vendor, Doctor, Premium Audit
• Channel: How applied, How reported, Service Chain
• Legal Jurisdiction: Venue Disposition, Rules
• Demographics: Working, weekly wage, lost income
• Firmographics: Industry class code vs. injuries claimed
• Inflation: Wage, Medical, Goods, Auto, COLA
• Gov’t Statistics: Crime rate, Employment, Traffic
• Other Stats: Rents, Occupancy, Zoning, Managed Care
“Search” versus “Discover”

                           Search (goal-oriented)    Discover (opportunistic)
Structured Data            Data Retrieval            Data Mining
Unstructured Data (Text)   Information Retrieval     Text Mining
Searching
• Input value: [Jim]
• Word replacement lists: Jimmy, Jim, James → JAMES
• Transformed input value: [JAMES]
• Returns “similar matches”. All records found: Jimmy, Jim, James
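The word replacement search described above can be sketched in a few lines. The replacement list below holds only the three variants from the slide. The function names and the extra record "Jane" are hypothetical additions for illustration.

```python
# Word replacement list: each variant maps to one canonical form.
REPLACEMENTS = {
    "JIM": "JAMES",
    "JIMMY": "JAMES",
    "JAMES": "JAMES",
}

def transform(value):
    """Upper-case the value and replace it with its canonical form when one is listed."""
    key = value.strip().upper()
    return REPLACEMENTS.get(key, key)

def search(input_value, records):
    """Return every record whose transformed value matches the transformed input."""
    target = transform(input_value)
    return [r for r in records if transform(r) == target]

records = ["Jimmy", "Jim", "James", "Jane"]
transformed = transform("Jim")
matches = search("Jim", records)
print(transformed)  # JAMES
print(matches)      # ['Jimmy', 'Jim', 'James']
```

Because both the input and the stored records pass through the same transformation, a search for any variant returns all of them.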
Motivation for Text Mining
• Approximately 90% of the world’s data is held in unstructured formats (source: Oracle Corporation)
• Information-intensive business processes demand that we move beyond simple document retrieval to “knowledge” discovery.
Chart: 10% structured numerical or coded information; 90% unstructured or semi-structured information
Convergence of Disciplines Example
Techniques for attacking text data:
• Rules-based
• Statistical Text Analysis and Clustering
• Linguistic and Semantic Clustering
• Support Vector Machines
• Pattern Matching or other statistical algorithms
• Neural Networks
• Combination of methods from above
Text is like a data iceberg
Claims Processing – Progress Notes and Diaries
Service
• Medical Management Staff
• Special Investigation Unit
• NICB
• Vendor Management
• Consulting Engineers
• Hearing Representative
• Structured Settlement Unit
• Recovery Staff
• Legal Staff
CLAIMS ADJUSTER
• Home Office Staff
• Field Office Claim Staff
• Insured Risk Manager
• Agent or Broker
• Diary forward – “call Dr Jones next week”
• Business Rule – large loss review
• System Reminder – update case reserves
• Correspondence Tracking – legal letter sent
Semantic Processing: Named Entity Extraction
• Identify and type language features
• Examples:
  – People names
  – Company names
  – Geographic location names
  – Dates
  – Monetary amounts
  – Phone numbers, ZIP codes, SSN, FEIN
  – Others… (domain specific)
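For the entity types with a regular surface form (dates, monetary amounts, phone numbers, ZIP codes, SSNs), a rules-based extractor is often enough for a first pass. The sketch below uses deliberately simple, US-centric regular expressions that are assumptions for illustration. People, company, and location names generally need dictionaries or a trained model and are not covered here. The sample note is invented.

```python
import re

# Deliberately simple, US-centric patterns; real claim notes need broader rules.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "MONEY": re.compile(r"\$\d{1,3}(?:,\d{3})*(?:\.\d{2})?"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "ZIP": re.compile(r"\b\d{5}(?:-\d{4})?\b"),
}

def extract_entities(text):
    """Return (type, matched text) pairs in order of appearance."""
    found = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            found.append((m.start(), label, m.group()))
    return [(label, value) for _, label, value in sorted(found)]

# Invented progress note with a fictitious phone number.
note = ("Called claimant at 860-555-0142 on 3/14/2008. "
        "Paid $1,250.00 to body shop in Hartford, CT 06103.")
entities = extract_entities(note)
for label, value in entities:
    print(label, value)
```

Typing each match (DATE, MONEY, and so on) is what turns free text in progress notes into fields that can be joined back to structured claim data.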
Feedback to UW