Matching Data for EHDI Tracking Program Cathy Gunderson

EHDI/NEST Project Manager
Colorado Department of Public Health and Environment
Denver, Colorado
9/15/2020

Faculty Disclosure Information

In the past 12 months, I have not had a significant financial interest or other relationship with the manufacturer(s) of the product(s) or provider(s) of the service(s) that will be discussed in my presentation.

This presentation will not include discussion of pharmaceuticals or devices that have not been approved by the FDA, or of unapproved or "off-label" uses of pharmaceuticals or devices.

De-Duplicating Person Data in a Centralized Database
Cathy Gunderson, Project Manager

Colorado’s Title V Program

Overview
  • Overview of EHDI/NEST Project
  • Person De-duplication Process
      - SOUNDEX
      - Special considerations
      - Scoring

Overview of the EHDI/NEST Project

A few FLAs (Four/Five Letter Acronyms)
  • CDPHE – Colorado Department of Public Health and Environment
  • EHDI – Early Hearing Detection and Intervention
  • NEST – Newborn Evaluation, Screening and Tracking
  • CHIRP – Clinical Health Information Records of Patients
  • CSHCN – Children with Special Health Care Needs (Title V)
      - In Colorado: HCP
  • HCP – Health Care Program for Children with Special Needs
  • CRCSN – Colorado Responds to Children with Special Needs (birth defects registry)

Project Goals
  • Develop a comprehensive statewide EHDI program from screening to intervention
  • Implement a system with a database that integrates information from NBH with PKU and Sickle Cell Disease
  • Create and maintain a centralized database which will help prove the efficacy of NBS
  • Implement the system

Colorado's Newborn Screening
  • Colorado screens for 8 conditions
      - Hemoglobinopathies – Sickle Cell
      - Inherited Metabolic Diseases:
          - Phenylketonuria (PKU)
          - Galactosemia
          - Biotinidase Deficiency
      - Cystic Fibrosis
      - Endocrine Diseases:
          - Hypothyroidism
          - Congenital Adrenal Hyperplasia (CAH)
      - Newborn Hearing
  • Tandem Mass Spectrometry - 2006

Person De-Duplication

The BIG Picture

First Step: Understand Your Data
  • Electronic Birth Certificate (EBC)
      - Reported by clerks at birthing hospitals
      - Reported by clerks at CDPHE for non-birthing hospitals
      - Reporting is required within 10 days of birth
  • Laboratory Services Division (LSD)
      - State Laboratory that processes blood spot screening
      - Forms mailed to LSD, processed within 24 hours
      - Results reported within 3 days of receipt
  • Transactions from other agencies

Data from the Electronic Birth Certificate (Vital Records)
  • Unique identifier is Birth Certificate Number
  • Data are not 'cleaned' yet
  • May be duplicates if hospital sends information more than once
  • Fields exist for NBH screening results, already associated with the newborn
  • Newborn information for babies born out of state, born in transit or born at home as well as in birthing hospitals

Data from EBC (cont.)
Daily:
  • EBC processed the night before
Weekly:
  • Infant death records
  • Voided records
Annually:
  • Correction tape for resident county

Data from the Newborn Metabolic Screen (State Lab)
Daily:
  • Unique identifier is accession number and form number
  • Data are final results from each screen
  • Demographic data on baby and mother
  • Information on hospital and doctor
  • Second screen may or may not have the original form number (it may have been lost)
  • Second screen may have a new doctor

Transaction Data
  • Input from any CHIRP or CHIRP-like application
  • Standard format
      - Based on the type of event, e.g., birth, Dx, communication, status change
      - Data sent out from NEST in the same format

Second Step: Process the Data
Daily:
  • Validate the data
  • Validate a unique identifier in the input
      - Must be the same person
  • De-duplicate using the SOUNDEX routine
  • Assign unique identifier: NEST_PID
  • Retain/record original EBC data
  • Retain/record original lab data
  • Retain/record original transaction data
  • Record all screening results (activities)

De-Duplication Routine
  • If a potential unique number is received:
      - Verify that it is the same person
      - If not, it is an exception
  • Unique numbers:
      - SSN (most babies don't have one yet)
      - EBC
      - Blood spot form number
      - Accession number combined with date
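The identifier check can be pictured roughly as follows. This is only a sketch: the slides do not say how "the same person" is verified, so the `Record` fields, the `IdentifierException` class and the DOB-only comparison in `looks_like_same_person` are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Record:
    dob: str                              # CCYYMMDD
    birth_cert_no: Optional[str] = None   # one of the possible unique numbers
    nest_pid: Optional[int] = None


class IdentifierException(Exception):
    """The number is already on file, but apparently for a different person."""


def looks_like_same_person(a: Record, b: Record) -> bool:
    # Hypothetical verification: the real rules are not given in the slides.
    return a.dob == b.dob


def lookup_by_birth_cert(incoming: Record,
                         on_file: Dict[str, Record]) -> Optional[Record]:
    """Return the stored record for the incoming birth certificate number."""
    if incoming.birth_cert_no is None:
        return None          # no unique number received
    existing = on_file.get(incoming.birth_cert_no)
    if existing is None:
        return None          # number not seen before
    if not looks_like_same_person(existing, incoming):
        # Same number, different person: route to exception handling.
        raise IdentifierException(incoming.birth_cert_no)
    return existing
```

Returning `None` here simply means the record goes on to the SOUNDEX matching described next.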

SOUNDEX: Find Potential Matches
  • Find the best candidates for matching based on SOUNDEX keys
  • Base the selection on the type of data you receive
      - Some sources provide better data than others
      - Reliability of the data coming in
      - EBC considered 'most right'

Build a SOUNDEX Key for Input
  • Treat the name as all lower case
  • If the first two letters are ei or ai, change them to i
  • Change all c to k
  • Change all ch to k
  • Change all ph to f
  • Change all z to s
  • Change all y to i
  • Remove all duplicated letters
  • Remove all special characters (apostrophes, periods, hyphens, and spaces)
  • Keep the first letter of each name part
  • Remove all vowels after the first letter
  • Use the first 4 remaining letters for the last name
  • Use the first 3 remaining letters for the first name
  • Use the middle initial
  • Put the DOB in CCYYMMDD order
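The rules above can be sketched in a few lines of Python. Several details are assumptions rather than anything stated in the slides: `ch` is replaced before `c` (otherwise `ch` would become `kh`), "duplicated letters" is read as adjacent repeats, the function handles one name part at a time, and the names `code_name_part` and `soundex_fields` are made up for illustration.

```python
import re
from datetime import date

VOWELS = set("aeiou")


def code_name_part(name: str) -> str:
    """Apply the slide's substitution and removal rules to one name part."""
    s = name.lower()
    if s[:2] in ("ei", "ai"):            # leading ei/ai becomes i
        s = "i" + s[2:]
    # Assumption: handle "ch" before "c" so that "ch" maps to "k", not "kh".
    s = s.replace("ch", "k").replace("c", "k")
    s = s.replace("ph", "f").replace("z", "s").replace("y", "i")
    # Assumption: "duplicated letters" means adjacent repeats ("ll" -> "l").
    s = re.sub(r"(.)\1+", r"\1", s)
    s = re.sub(r"[^a-z]", "", s)         # drop apostrophes, periods, hyphens, spaces
    if not s:
        return ""
    # Keep the first letter, then remove every vowel after it.
    return s[0] + "".join(ch for ch in s[1:] if ch not in VOWELS)


def soundex_fields(last: str, first: str, middle: str, dob: date):
    """Return the pieces a key is built from: last[:4], first[:3], MI, CCYYMMDD."""
    return (
        code_name_part(last)[:4],
        code_name_part(first)[:3],
        middle[:1].lower(),
        dob.strftime("%Y%m%d"),
    )
```

With these rules, spelling variants such as Lopez and Lopes, or Phillip and Filip, reduce to the same code, which is what lets them be selected as candidate matches.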

Special Considerations
  • Hispanic surnames
      - Can be a composite:
          - Father's last name
          - Mother's last name
      - A composite name will be treated as 3 last names
  • Marital status or insurance restrictions
      - LAB under mother's last name at birth
          - Unmarried mom
          - Married mom but insurance still under maiden name
      - EBC under father's last name

Added Considerations
  • SOUNDEX might select a candidate that then scores no points when the actual data are compared:
      - Lopes Gonzalez Gomez
      - Lopez Gonzales Gomes
  • We allow for points on a SOUNDEX match
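One way to picture "points on a SOUNDEX match" is a comparison that awards full points for an exact match and partial points when only the SOUNDEX codes agree. The point values and the function name below are hypothetical; the slides give no actual numbers.

```python
def name_points(a: str, b: str, code_a: str, code_b: str,
                exact_pts: int = 10, soundex_pts: int = 5) -> int:
    """Score one name comparison; the point values are illustrative only."""
    if a.casefold() == b.casefold():
        return exact_pts             # actual data match
    if code_a and code_a == code_b:
        return soundex_pts           # spelling differs, SOUNDEX codes agree
    return 0
```

Under this sketch, Lopes versus Lopez would earn the partial score instead of zero.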

Example: SOUNDEX Routine

SOUNDEX Key Types
  • 5 key types:
      - A) LastName FirstName MiddleInit Gender DOB (YYMM)
      - B) LastName Gender DOB (YYMMDD) TOB (HHMM)
      - C) LastName FirstName DOB (YYMM)
      - D) LastName SOUNDEX
      - E) FirstName SOUNDEX
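As a rough illustration of the five key types, the sketch below assembles them from name codes that have already been SOUNDEX-coded. The `|` separator, the type-letter prefix and the choice to skip type B when no time of birth is available are assumptions, not details from the slides.

```python
from datetime import date, time
from typing import Dict, Optional


def build_keys(last_code: str, first_code: str, middle_init: str,
               gender: str, dob: date,
               tob: Optional[time] = None) -> Dict[str, str]:
    """Build one key per type (A-E) from already-coded name parts."""
    yymm = dob.strftime("%y%m")
    keys = {
        "A": "|".join(("A", last_code, first_code, middle_init, gender, yymm)),
        "C": "|".join(("C", last_code, first_code, yymm)),
        "D": "|".join(("D", last_code)),
        "E": "|".join(("E", first_code)),
    }
    if tob is not None:
        # Type B needs the time of birth; assumed to be skipped when missing.
        keys["B"] = "|".join(("B", last_code, gender,
                              dob.strftime("%y%m%d"), tob.strftime("%H%M")))
    return keys
```

Prefixing each key with its type letter keeps, for example, a last-name-only key from colliding with a first-name-only key that happens to have the same code.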

SOUNDEX Keys
  • Up to 24 different values for the 5 types
  • Example: LastName
      - Child's Last Name
      - Child's AKA Last Name
      - Child's Last Name Part 1
      - Child's Last Name Part 2
      - Mother's Last Name Part 1
      - Mother's Last Name Part 2
      - Father's Last Name Part 1
      - Father's Last Name Part 2

Average Number of Keys
  • No child has all 24 keys
      - If child, mom and dad all have the same last name, the key is created only once: no duplicate SOUNDEX keys for a child
      - Missing data
  • On average, each of our children has 12 keys.
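The reason a child ends up with fewer than 24 keys can be shown with a small sketch: missing names contribute nothing, and repeated codes are kept only once. The function name and the list-based input are assumptions made for the example.

```python
from typing import Iterable, List, Optional


def distinct_codes(codes: Iterable[Optional[str]]) -> List[str]:
    """Drop missing values and repeats, keeping first-seen order."""
    seen = set()
    result = []
    for code in codes:
        if code and code not in seen:
            seen.add(code)
            result.append(code)
    return result
```

For instance, if the child's, mother's and father's last names all code to `lps` and the AKA last name is missing, only one last-name code remains.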

Scoring Routine
  • After a potential match is found, individual fields are compared and points are awarded for matches
  • Actual data fields are compared:
      - Last name, first name, middle name, DOB, TOB
      - Mother's last name, first name, maiden name, DOB
      - AKA names
      - Father's last name, first name, DOB
      - Any unique identifiers recorded with the input and on the database (e.g., Birth Cert #, NBS form #)

Scoring Routine (cont.)
  • A score above the upper threshold indicates the same person: the same NEST_PID is assigned
  • A score under the lower threshold indicates a different person: a new NEST_PID is generated
  • A score between the two thresholds cannot be resolved by the application and needs human intervention
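A minimal sketch of the scoring and the three-way decision is shown below. The field names, point values and both thresholds are hypothetical (the slides give none), and a score that lands exactly on a threshold is assumed to go to review.

```python
from enum import Enum
from typing import Dict, Optional


class Decision(Enum):
    SAME_PERSON = "assign existing NEST_PID"
    NEW_PERSON = "generate new NEST_PID"
    REVIEW = "human intervention"


# Hypothetical points per matching field and hypothetical thresholds.
POINTS = {"last_name": 20, "first_name": 15, "dob": 25, "birth_cert_no": 40}
LOWER_THRESHOLD = 30
UPPER_THRESHOLD = 60


def score(incoming: Dict[str, Optional[str]],
          candidate: Dict[str, Optional[str]]) -> int:
    """Sum the points for every field present on the input that matches."""
    return sum(
        pts for field, pts in POINTS.items()
        if incoming.get(field) is not None
        and incoming.get(field) == candidate.get(field)
    )


def decide(total: int) -> Decision:
    if total > UPPER_THRESHOLD:
        return Decision.SAME_PERSON
    if total < LOWER_THRESHOLD:
        return Decision.NEW_PERSON
    return Decision.REVIEW       # in between: cannot be determined automatically
```

Keeping the points and thresholds in plain constants makes them easy to adjust when the routine lets too many duplicates through or sends too many records to review.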

Fine Tuning the De-duplication Routine
  • Make adjustments to the thresholds
      - Too many duplicates being added
  • Make adjustments to the points awarded for matches
      - Twins! Use birth type and order?
      - Take away points for no matches?
      - What if MI is present on one and not on another: some points?
  • Constant vigilance!

Human Intervention
  • Can help fine tune the De-duplication Routine
  • Three options:
      - Override:
          - Add as a new person
          - Indicate a match and update information
      - Resubmit
          - Thread of processing / timing / Twins!

Colorado Contact
Cathy Gunderson, EHDI/NEST Project Manager
Colorado Department of Public Health and Environment
FCHSD-HCP-A4
4300 Cherry Creek Drive South
Denver, CO 80246-1530
cathy.gunderson@state.co.us
303-692-2145