Proposal to Update Data Management of Genomic Summary Results Under the NIH Genomic Data Sharing Policy
Dina Paltoo, Ph.D., M.P.H., NIH Office of Science Policy
Laura Lyman Rodriguez, Ph.D., National Human Genome Research Institute
October 4, 2017

Housekeeping
Participants must register to join the webinar
Email questions for panelists to gds@mail.nih.gov
No verbal questions will be accepted
Webinar audio will be recorded
Recording and slides posted after the webinar on the Office of Science Policy’s webpage: https://osp.od.nih.gov/scientific-sharing/genomic-data-sharing/

Benefits of Data Sharing
§ Enables data generated from one study to be used to explore a wide range of additional research questions
§ Increases statistical power and scientific value by enabling data from multiple studies to be combined
§ Facilitates reproducibility and validation of research results
§ Facilitates innovation of methods and tools for research

NIH’s Culture of Data Sharing
Policies and initiatives shown on the timeline:
– White House Initiative (2013 “Holdren Memo”)
– Genome-wide Association (GWAS) Policy
– NIH Data Sharing Policy
– Model Organism Policy
– Research Tools Policy
– Big Data to Knowledge (BD2K) Initiative
– Genomic Data Sharing (GDS) Policy
– NIH Public Access Policy (Publications)
– HHS Rule and NIH Policy on Clinical Trial Results
– NIH Intramural Human Data Sharing Policy
– Modernization of NIH Clinical Trials
Timeline axis: 1999, 2003, 2004, 2007, 2008, 2012, 2014, 2015, 2017

Seeking Appropriate Balance
Natural tension between values:
– Protect and respect participants
– Promote health advances through research

Guiding Principle of the NIH Genomic Data Sharing Policies
The greatest public benefit will be realized if large-scale genomic data are made available in a timely manner to the largest possible number of investigators. For human data, data are made available under terms and conditions consistent with the informed consent provided by individual participants.

NIH Genomic Data Sharing Policy
Purpose
– Sets forth expectations and responsibilities for investigators and their institutes that ensure the broad and responsible sharing of genomic research data in a timely manner
Scope
– All NIH-funded research generating large-scale human or non-human genomic data and the use of these data for subsequent research
– Applies to all funding mechanisms (grants, contracts, intramural support); there is no minimum threshold for cost
Data Sharing
– Non-human data: current databases and resources remain the standard mechanism; any widely used data repository may be used (e.g., GenBank, SRA, ZFIN)
– Human data: studies with data derived from human specimens are registered in dbGaP

Unrestricted- vs Controlled-access to Human Genomic Data
Informed consent is the basis for institutions to determine the appropriateness of submitting human data to unrestricted- or controlled-access repositories
– Unrestricted-access: data are publicly available to anyone (e.g., The 1000 Genomes Project)
– Controlled-access: investigators must obtain approval from NIH Data Access Committees to use the requested data (e.g., dbGaP)

NIH Repository for Human Genomic Data
After the effective date of the NIH GDS Policy: Unless informed consent explicitly states unrestricted-access to individual-level data is appropriate, human genomic data and associated phenotypic data should be made available through controlled-access (e.g., NIH database of Genotypes and Phenotypes (dbGaP))

dbGaP
Established in 2007
Controlled-access repository for the sharing of human genomic and associated phenotypic data under the NIH genomic data sharing policies
dbGaP web portal to register studies, submit data, submit Data Access Requests (DARs)
– Datasets are organized and distributed according to their data use limitations

Data Submission, Access, and Use Statistics (as of July 1, 2017)
826 = Number of studies in dbGaP
1,600,000+ = Number of participants represented in dbGaP
5,344 = Number of PIs requesting data
46 = Number of PI countries
1,500+ = Number of publications resulting from secondary use of dbGaP data
13 days = Average Data Access Request processing time
Chart: NIH Institutes and Centers Sponsoring dbGaP Studies (%)
Chart: Data Access Requests by year, 2007 to 2016 (50,167 Submitted; 34, 16 Approved)

dbGaP Data Submission and Access
Data Submission
• Large-scale human genomic and associated phenotypic data
• Institutional Certification and data use limitations specified
• Data de-identified
Unrestricted Access (All Potential Users)
• Study protocol
• Descriptive information
Controlled Access
• Coded Genotypes
• Phenotypes
• Genomic Summary Results
Data Access Request
• Co-signed by institution
• Agree to terms of use (Data User Certification agreement)
• Agree to Genomic User Code of Conduct
Data Access Committee
• Review data use limitations

What are “Genomic Summary Results”?
Term has been referred to in various ways
– Aggregate Genomic Data
– Genomic Summary Statistics
Genotype counts, allele frequencies, p-values, & effect size estimates and standard errors
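To make two of the listed quantities concrete, the following is a minimal sketch of how genotype counts and allele frequencies could be derived from individual-level genotypes for a single biallelic variant. The function name, the two-letter genotype encoding, and the toy cohort are assumptions made for illustration; they are not taken from the slides or from dbGaP.

```python
from collections import Counter


def summarize_variant(genotypes):
    """Return (genotype counts, allele frequencies) for one variant.

    `genotypes` is a list of two-letter strings such as "AA" or "AG"
    (a hypothetical encoding chosen for this example).
    """
    # Treat "AG" and "GA" as the same unordered genotype.
    genotype_counts = Counter("".join(sorted(g)) for g in genotypes)
    # Each genotype contributes two alleles to the pool.
    allele_counts = Counter(allele for g in genotypes for allele in g)
    total_alleles = sum(allele_counts.values())
    allele_freqs = {a: n / total_alleles for a, n in allele_counts.items()}
    return dict(genotype_counts), allele_freqs


# Toy cohort of five participants (invented data).
cohort = ["AA", "AG", "GA", "GG", "AA"]
counts, freqs = summarize_variant(cohort)
print(counts)  # {'AA': 2, 'AG': 2, 'GG': 1}
print(freqs)   # {'A': 0.6, 'G': 0.4}
```

The output contains no per-participant rows, which is what distinguishes summary results from the individual-level genotype data they are computed from.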

§ Analytical framework for accurately resolving whether an individual is present in a genomic DNA mixture, if that individual's sequence is also known
§ Authors demonstrate a way to assess the probability that a person or relative participated in a GWAS
§ Genomic summary results managed in unrestricted-access tier in dbGaP
§ Privacy concerns

NIH Rationale for Move to Controlled-Access
National Institutes of Health Modifications to Genome-Wide Association Studies (GWAS) Data Access
August 28, 2008
The National Institutes of Health (NIH) has modified part of our current policy for data posting and access to genomic data contained within NIH GWAS database. …
… New statistical techniques for analyzing dense genomic information make it possible to infer the group assignment (i.e., case or control) of an individual DNA sample if one has access to high-density genomic … and the allele frequencies for the case and control groups from publicly available aggregate datasets.
To address any concerns that may arise related to the possibility of inferring group association from aggregate, publicly available GWAS data, NIH has taken the following preemptive actions:
• We have removed aggregate genotype data for GWAS studies from public access, but may make them available through the controlled access DAR/DAC process. …
NIH will be working with the wide range of stakeholders related to genomic data sharing over the coming months to further explore and address the policy implications of this finding.
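The concern described in the statement can be illustrated with a simple distance comparison: for each variant, check whether a person's genotype sits closer to the study group's allele frequency than to a reference population's. The sketch below is one simple form of such a comparison, offered as an assumption for illustration and not as the method referenced on the slides; the function name, the 0/0.5/1 dosage scaling, and all numbers are invented.

```python
def membership_score(individual, study_freqs, reference_freqs):
    """Mean over variants of |y - reference| - |y - study|.

    individual:       per-variant allele dosage scaled to 0, 0.5, or 1
    study_freqs:      per-variant allele frequencies published for the study group
    reference_freqs:  per-variant allele frequencies from a reference population
    A positive value means the individual is, on average, closer to the
    study group than to the reference population.
    """
    diffs = [
        abs(y - ref) - abs(y - study)
        for y, study, ref in zip(individual, study_freqs, reference_freqs)
    ]
    return sum(diffs) / len(diffs)


# Toy data for four variants (invented; real analyses use far more variants
# and a formal significance test rather than a raw mean).
person = [1.0, 0.5, 0.0, 1.0]
study = [0.75, 0.5, 0.25, 0.75]
reference = [0.5, 0.5, 0.5, 0.5]

score = membership_score(person, study, reference)
print(score)  # 0.1875
```

With only a handful of variants such a score carries little information; the statement's point is that dense genomic data make the aggregate signal detectable.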

Meanwhile Out in the World…

Since 2008…Where Are We Now?
§ The Power of Statistics
§ Accessibility of Technology
§ Accessibility of Data
§ Informational Risk
§ Participant Right to Benefit
§ Proportional Controls
§ Broad Consent
§ Focus on Data Sharing

Re-considering the Balance
Touched upon in 2012 NIH workshop
NHGRI held focused workshop in 2016
– Consider risks and benefits through current lens
– Included bench/clinical researchers, ethics, and participant stakeholders
Several key discussion points
– Language matters: summary statistics (findings)
– Value to advance knowledge is only increasing
– Lack of information about what “it” is contributes to concerns
– Transparency in any use is critical
– Must allow for sensitive circumstances

Workshop Findings & Recommendations
Value of genomic summary results is substantial and broad
Privacy risks are distinct from individual-level data
Privacy harms will differ based on what attached information is revealed
Should encourage NIH to reconsider its access model
In doing so, the public should be engaged in discussion of risks & benefits and alternate models.
https://www.genome.gov/pages/policyethics/genomicdata/aggdatareport.pdf

NIH Response
Winter 2017 Request For Information on dbGaP
Question 3: Policies for Management and Use of dbGaP Data
– Risks and benefits of different management models for genomic summary statistics related to participant privacy and/or scientific opportunity for its broad use
– Alternative options for providing access to genomic summary statistics beyond unrestricted or controlled-access models (e.g., registered access)
– Factors to consider in determining the risk-benefit balance for specific datasets (e.g., those including sensitive information or vulnerable populations)
– Methods for mitigating risks associated with unrestricted access to genomic summary statistics
Chart: respondents by category (Bioethicist, Government Official, Scientific Researcher, Institutional Official); values shown: 0%, 2%, 4%, 60%, 15%

Overview of Response
Majority of Respondents felt that:
– Summary statistics should be moved to unrestricted access
– Should implement measures to mitigate risk
  • Retain controlled access for information derived from vulnerable and small populations, studies involving sensitive phenotypes
  • Require completion of an educational module on appropriate use
  • Record queries to track users of summary statistics from sensitive studies
– The research community, and NIH, should engage with participant communities and the public regarding policy
Other comments discussed benefit of harmonizing summary statistics to enable analysis across studies

Proposed Update
Effective upon publication of final update
NIH-Designated Genomic Data Repository
Genomic & Phenotype Data, with Institutional Certification from Submitting Institution
Unrestricted Access
– Study Protocol
– Descriptive Information
– Genomic Summary Results (GSR) from most studies
Controlled Access
– Coded Genotypes
– Phenotypes
– Summary Results from sensitive studies
Users should agree to abide by basic elements of responsible behavior
NIH should develop informational resources on GSR and their use for investigators and the public for posting on NIH-designated repositories

Defining Genomic Summary Results
Genomic & Phenotype Data
Rapid Access: Genomic Summary Results from most studies
Genotype counts, allele frequencies, p-values, & effect size estimates and standard errors
Determine methods for generating GSR from collection based on community best practices
Publicly post methods to be used on dbGaP
Update as appropriate

Stimulating a Responsible Use Culture
Genomic & Phenotype Data
Rapid Access: Genomic Summary Results from most studies
Users must affirm intent to use information responsibly
– Reviewed NIH-provided informational resources
– Will not attempt to re-identify or contact individual participants or groups, or generate information that could allow individual identities to be readily ascertained
– Will promote scientific research or health through any use of genomic summary results

Proposed Access for Sensitive Studies
NIH-Designated Genomic Data Repository
Genomic & Phenotype Data, with Institutional Certification from Submitting Institution
Controlled Access
– Coded Genotypes
– Phenotypes
– Summary Results from sensitive studies

“Sensitive” Studies Defined As:
Potentially stigmatizing traits such as
– Substance abuse
– HIV/AIDS
– Sexual attitudes, preferences, or practices
– Others identified by Submitting Institution
Potentially Vulnerable Populations such as
– Rare disease communities
– Small sample sizes
– Isolated or identified geographic regions
– Native Americans/Alaska Natives or other indigenous populations
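The two criteria above, together with the proposal that summary results from sensitive studies stay in controlled access, amount to a simple routing rule. The sketch below expresses that rule under stated assumptions: the `Study` record, its field names, and the tier labels are hypothetical, and the flags stand in for a designation made by the Submitting Institution, not an automated check.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Hypothetical record of a Submitting Institution's designation."""
    name: str
    stigmatizing_trait: bool = False     # e.g., substance abuse, HIV/AIDS
    vulnerable_population: bool = False  # e.g., rare disease community, small sample


def is_sensitive(study: Study) -> bool:
    # A study is treated as sensitive if either criterion applies.
    return study.stigmatizing_trait or study.vulnerable_population


def gsr_access_tier(study: Study) -> str:
    # Summary results from sensitive studies remain in controlled access;
    # results from other studies would move to unrestricted access.
    return "controlled" if is_sensitive(study) else "unrestricted"


# Invented examples.
studies = [
    Study("common-trait cohort"),
    Study("rare disease registry", vulnerable_population=True),
]
tiers = {s.name: gsr_access_tier(s) for s in studies}
print(tiers)
```

Keeping the designation as explicit flags mirrors the proposal's reliance on institutional judgment; the slides give no numeric threshold for "small sample sizes", so none is encoded here.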

Proposed Implementation
NIH-Designated Genomic Data Repository
Genomic & Phenotype Data, with Institutional Certification from Submitting Institution
Controlled Access
– Coded Genotypes
– Phenotypes
– Summary Results from sensitive studies
Institutional Certification Form updated upon publication of final Update

Proposed Implementation
Publication of Final Update: GSR Effective Date
EXISTING STUDIES
– Submitting Institutions have 6 months to classify data sets as “sensitive”
– ICs can allow reasonable extensions if requested
FUTURE STUDIES
– Submitting Institutions will designate sensitive data sets
– Consent forms should discuss plans for GSR access

Proposed Implementation
Effective upon publication of final update
– Updated Institutional Certifications available for immediate use by Submitting Institutions
Prospective recruitment should include discussion of genomic summary results and access model
Studies already in NIH-designated repositories
– Submitting Institutions have 6 months to indicate Sensitive Studies or need for additional time
– If institution affirms unrestricted access appropriate, GSR moved to unrestricted access

Topics for Public Feedback
Questions for Comment:
– Risks and benefits to broad sharing of most GSR including use of the click-through agreement
– Risks and benefits to maintaining GSR from Sensitive Studies in controlled access
– Method for designating studies as Sensitive
– Comments on any aspect of the proposed update

Questions
Email Questions to: gds@mail.nih.gov

Thank You & Additional Resources
Websites: https://osp.od.nih.gov/scientific-sharing/genomic-data-sharing/ and http://osp.od.nih.gov/
For General Inquiries: gds@mail.nih.gov
Subscribe to LISTSERVs:
– GDS: GENOMIC-DATA-SHARING-GDS-L
– OSP: LISTSERV@list.nih.gov (with the message: Subscribe OSP_News)
Learn more about the Office of Science Policy from our blog “Under the Poliscope”: http://osp.od.nih.gov/under-the-poliscope