The MURDOCK Study: Self-reported and EHR-derived Phenotypes Supporting Biomarker Discovery

Douglas Wixted¹, Meredith L. Nahm¹, Michelle Smerek¹, Anita Walden¹, Jessica Tenenbaum¹, Ashley Dunham¹, Carl F. Pieper², Melissa Cornish¹, Victoria Christian¹, Rowena Dolor³, L. Kristin Newby³, Robert M. Califf¹

¹ Duke Translational Medicine Institute, ² Duke Department of Biostatistics and Bioinformatics, ³ Duke Clinical Research Institute

“Rewriting the textbook of medicine”

Background

The MURDOCK Community Registry and Biorepository (“the Registry”) is a key component of the MURDOCK Study, an ongoing initiative to reclassify disease based on underlying molecular mechanism. Based in Cabarrus County, NC, the Registry aims to enroll 50,000 participants (~9,000 to date) from the catchment area depicted in Figure 1. The Registry will generate a large collection of well-annotated biospecimens (blood and urine) paired with environmental, demographic, and clinical characteristics. Participation entails providing self-reported information, completing annual follow-up, granting access to electronic health records, and giving permission to be re-contacted. This valuable data resource is available for translational research collaborations.

Figure 1: The MURDOCK Study catchment area.

Multi-source Phenotype Identification

Research initiatives such as eMERGE, SHARPn, and Mini-Sentinel have invested substantial effort in developing validated, portable algorithms that transform raw EHR data into phenotypes. MURDOCK will collaborate with and leverage these related efforts; however, MURDOCK diverges from them in key aspects of both input and goals.

Table 1: Self-reported medical history is focused on 34 diseases and medical conditions.

Data Collection

At enrollment, participants provide self-reported medical history, quality-of-life measures, socioeconomic information, and lifestyle and nutrition information via a questionnaire.
Updated medical information is collected thereafter through annual follow-up questionnaires and consented access to participants’ electronic health records (EHRs). Self-reported baseline and annually updated medical history focus on the 34 medical conditions listed in Table 1. EHR data are procured through partnerships with local healthcare provider organizations. Phenotypes or clinical characteristics that complement genomic analyses can draw on data from both sources. Deriving clinical variables from both participant self-reporting and EHRs represents a significant advantage, since the quality of individual sources can vary widely.

• Incomplete EHR data integrated from multiple source systems introduce additional complexity.
• “Complete” EHR data may not be necessary for the purpose of corroborating self-reported data.

A pilot study is being conducted to compare sources and methods to identify and analyze discrepancies.

Figure 2: Clinical variables can be based on 1) self-report, 2) self-report supported by EHR data, and/or 3) phenotype algorithms. A pilot study will compare data sources and methods.

Figure 4: Common disease prevalence to date based on self-report (N = 9112).

The authors invite interested investigators to take full advantage of this rich resource by contacting the MURDOCK Study team (murdockstudy@duke.edu) to explore opportunities for collaboration.

Contact Information

Douglas Wixted, MMCI, douglas.wixted@duke.edu, 919-668-0503
Jessica Tenenbaum, Ph.D., jessie.tenenbaum@duke.edu, 919-668-8811

Acknowledgements

The MURDOCK Study is funded by the David H. Murdock Institute for Business and Culture and the Duke CTSA grant (UL1 RR024128). Authorship represents MURDOCK Community Registry and Biorepository informatics and study leadership.

Disclosure Information

Authors listed here have nothing to disclose concerning possible financial or personal relationships with commercial entities that may have a direct or indirect interest in the subject matter of this presentation.

For more information on the MURDOCK Study, please visit us online: www.murdock-study.com
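The poster does not describe how the pilot study compares self-reported and EHR data, so the following is only a minimal sketch of one way such a comparison could be set up. It assumes each participant has a yes/no self-report and a yes/no EHR finding for a single condition. The record layout, the function names, the "EHR only" and "neither source" labels, and the use of Cohen's kappa as an agreement measure are all illustrative assumptions rather than the study's method, and the sample records are invented, not study data. Phenotype algorithms (the third option in Figure 2) are outside the scope of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionRecord:
    """One participant's status for one condition (hypothetical layout)."""
    participant_id: str
    self_reported: bool  # condition reported on the questionnaire
    ehr_evidence: bool   # condition found in the participant's EHR data


def classify(record: ConditionRecord) -> str:
    """Label a record by which source(s) indicate the condition."""
    if record.self_reported and record.ehr_evidence:
        return "self-report supported by EHR"
    if record.self_reported:
        return "self-report only"
    if record.ehr_evidence:
        return "EHR only"  # assumed label, not from the poster
    return "neither source"  # assumed label, not from the poster


def discrepancies(records: list[ConditionRecord]) -> list[ConditionRecord]:
    """Return records where the two sources disagree."""
    return [r for r in records if r.self_reported != r.ehr_evidence]


def cohen_kappa(records: list[ConditionRecord]) -> float:
    """Chance-corrected agreement between self-report and EHR flags."""
    n = len(records)
    if n == 0:
        raise ValueError("at least one record is required")
    agree = sum(r.self_reported == r.ehr_evidence for r in records)
    observed = agree / n
    p_self = sum(r.self_reported for r in records) / n
    p_ehr = sum(r.ehr_evidence for r in records) / n
    expected = p_self * p_ehr + (1 - p_self) * (1 - p_ehr)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when no variation exists
    return (observed - expected) / (1 - expected)


# Invented example records for illustration only.
EXAMPLE_RECORDS = [
    ConditionRecord("P1", True, True),
    ConditionRecord("P2", True, False),
    ConditionRecord("P3", False, True),
    ConditionRecord("P4", False, False),
    ConditionRecord("P5", True, True),
    ConditionRecord("P6", False, False),
]

if __name__ == "__main__":
    for rec in EXAMPLE_RECORDS:
        print(rec.participant_id, classify(rec))
    print("discrepant:", [r.participant_id for r in discrepancies(EXAMPLE_RECORDS)])
    print("kappa:", round(cohen_kappa(EXAMPLE_RECORDS), 3))
```

In this sketch a discrepancy is simply a disagreement between the two flags. A real comparison would also need to handle the incomplete EHR coverage noted above, where a missing EHR finding does not necessarily mean the condition is absent.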