Harmonising primary care data using international standard vocabularies for observational research

Sanjay FARSHID (1), Jitendra JONNAGADDALA (1, 2), Guan GUO (1), Mike WU (1), Siaw-Teng LIAW (1, 2, 3)

1. School of Public Health and Community Medicine, UNSW Sydney
2. WHO Collaborating Centre for eHealth, UNSW Sydney, Australia
3. General Practice Unit, Fairfield Hospital, Sydney South West Area Health Service, Sydney, Australia

Introduction
• Improvements in health informatics have allowed large-scale data from electronic health records (EHRs) to be linked and analysed to synthesise real-world evidence. [1]
• The ability to perform such research, however, is often limited by heterogeneity of datasets.
• To resolve this issue, the Observational Medical Outcomes Partnership (OMOP) has developed a common data model (CDM) that enables researchers to use the same analytic methods on data from different sources which adopt its format. [2]
• The electronic practice-based research network (ePBRN) is a repository of general practice, hospital admissions and claims data from two health neighbourhoods in South Western Sydney.
• Harmonisation of this data repository with standard vocabularies would enable participation in large, international studies.

Figure 1 | ePBRN and the OMOP CDM

Box 1 | OMOP strengths and weaknesses
Strengths:
• Has been successfully used in large-scale studies
• Original concept names preserved as 'source' values
• Superior or comparable to other data models in content coverage, integrity, standards compatibility and usability [3]
Weaknesses:
• Mappings will need to be periodically updated
• Complete granularity of source data may not be captured
• Standard vocabularies do not yet have 100% coverage

Aims
• To harmonise content from the ePBRN data repository with semantic standards from OMOP CDM standard vocabularies.
• To investigate the suitability of Australian healthcare data for integration with international semantic standards.

Methods
• We used an extract of routinely entered data obtained from 13 general practices in the ePBRN repository.
• Where possible, existing codes were used to automatically map source concepts to OMOP standard concepts.
• Manual vocabulary mapping was undertaken on the remaining concepts using Usagi version 1.1.6.

Figure 2 | Vocabulary mapping process
• Start
• Creation of specifications document (tables, columns, values) of ePBRN database
• Vocabulary mappers (VM 1, VM 2) and vocabulary quality assurance (VQA): training and guideline development
• VM 1 maps Software 1 related tables in Usagi; VM 2 maps Software 2 related tables in Usagi
• VM 1 reviews VM 2 mappings; VM 2 reviews VM 1 mappings
• Commence new cycle
• Meeting with entire team to resolve remaining discrepancies
• VQA reviews final mappings

Results
• Standard vocabulary codes were already assigned to some concepts related to medications and diagnosis, allowing automatic mapping (Figure 3).
• The remaining concepts required manual mapping. Figure 4 shows, for these concepts, the proportion that contained a value rather than a blank, and the proportion of the non-blank concepts that were successfully mapped.
• Standard vocabularies were used for all entities except for ethnicity, where a custom vocabulary which included Indigenous Australian ethnicities was required.
• A large number of concepts were generated from free text entries in EHR software, making vocabulary mapping time-consuming.

Figure 3 | Automatically mapped concepts (categories: Medication, Diagnosis; legend: Software 1; axis: Concepts automatically mapped (%))

Figure 4 | Manual mapping success rates (axes: Source of concepts, Proportion of concepts (%); series: Concepts entered, Mapping success; categories include gender, ethnicity and provider specialty)

Discussion
• Data manually entered by doctors, such as demographic information, was often missing.
• Most terminology lacked standardisation, due to differences between source EHR systems and record keeping practices across the GP sites. Many concepts therefore required manual mapping.
• Australian reimbursement claims do not require diagnosis codes, so diagnoses are difficult to extract from clinical notes.
• Despite these barriers, most concepts were successfully mapped to standard terminology.

Future research
This is an ongoing project. We intend to harmonise the entire ePBRN with OMOP vocabulary. We are evaluating the quality and usability of the mappings, and a number of studies plan to use the transformed data to examine issues such as continuity of care and doctor shopping.

Conclusions
Harmonising an EHR-derived data repository with a common data model is feasible. Changes in record keeping practices can facilitate this process and allow EHR data to become a more valuable source of health information.

References
1. Sherman RE, Anderson SA, Dal Pan GJ, Gray GW, Gross T, Hunter NL, et al. Real-world evidence: What is it and what can it tell us? New England Journal of Medicine. 2016; 375(23): 2293-7.
2. Stang PE, Ryan PB, Racoosin JA, Overhage M, Hartzema AG, Reich C, et al. Advancing the science for active surveillance: Rationale and design for the observational medical outcomes partnership. Annals of Internal Medicine. 2010; 153(9): 600-6.
3. Garza M, Del Fiol G, Tenenbaum J, Walden A, Zozus MN. Evaluating common data models for use with a longitudinal community registry. Journal of Biomedical Informatics. 2016; 64: 333-41.

Contact: sanjay.farshid@gmail.com
PRESENTED AT HIC 2018 SYDNEY
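
Illustrative sketch (not part of the original poster)
The Methods describe two mapping routes: source concepts that already carry a standard vocabulary code are mapped automatically, and everything else goes to manual mapping in Usagi. The Python sketch below shows one way such a split could be expressed. It is a minimal illustration under stated assumptions, not the authors' implementation: the `SourceConcept` class, the `split_by_mapping_route` and `percent_auto` functions, the vocabulary names, the codes and the concept IDs are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical lookup from (vocabulary, code) to a standard concept ID.
# The vocabulary names, codes and IDs are placeholders, not real OMOP content.
CODE_TO_STANDARD: Dict[Tuple[Optional[str], Optional[str]], int] = {
    ("VOCAB_A", "A001"): 9000001,
    ("VOCAB_B", "B042"): 9000002,
}


@dataclass(frozen=True)
class SourceConcept:
    """A concept as recorded in the source EHR (hypothetical structure)."""
    source_value: str                 # original text, kept as the 'source' value
    vocabulary: Optional[str] = None  # code system, if the EHR recorded one
    code: Optional[str] = None        # code within that system, if any


def split_by_mapping_route(
    concepts: List[SourceConcept],
    lookup: Dict[Tuple[Optional[str], Optional[str]], int],
) -> Tuple[List[Tuple[str, int]], List[str]]:
    """Separate concepts that can be mapped by code from those needing manual work."""
    auto: List[Tuple[str, int]] = []
    manual: List[str] = []
    for concept in concepts:
        concept_id = lookup.get((concept.vocabulary, concept.code))
        if concept_id is not None:
            # A known code exists: map automatically, preserving the source value.
            auto.append((concept.source_value, concept_id))
        else:
            # No code, or an unknown code: queue for manual mapping.
            manual.append(concept.source_value)
    return auto, manual


def percent_auto(auto: List[Tuple[str, int]], manual: List[str]) -> float:
    """Percentage of concepts mapped automatically (0.0 for an empty input)."""
    total = len(auto) + len(manual)
    return 100.0 * len(auto) / total if total else 0.0


# Made-up example records for illustration only.
concepts = [
    SourceConcept("coded medication entry", "VOCAB_A", "A001"),
    SourceConcept("free text diagnosis"),
    SourceConcept("coded diagnosis entry", "VOCAB_B", "B042"),
    SourceConcept("entry with unknown code", "VOCAB_B", "B999"),
]
auto_mapped, needs_manual = split_by_mapping_route(concepts, CODE_TO_STANDARD)
```

Keeping the original text alongside the mapped ID mirrors the Box 1 point that original concept names are preserved as 'source' values; the `needs_manual` list corresponds to the concepts that would be handed to the vocabulary mappers.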