THE DATA MANAGEMENT PLAN (DMP) AND YOUR NSF PROPOSAL
RALEIGH L. MARTIN - AAAS S&T POLICY FELLOW, NSF GEOSCIENCES
AUGUST 2, 2019
NSF EARLY CAREER WORKSHOP
This presentation does not reflect an official position of AAAS or of the U.S. National Science Foundation.

INTRODUCTION: WHAT IS A DATA MANAGEMENT PLAN (DMP)?
• Plan for generating, sharing, and archiving project data
• Required 2-page supplementary document for all NSF proposals (a single DMP across collaborative proposals)
• The NSF PAPPG (Proposal & Award Policies & Procedures Guide) gives general policy; GEO Divisions and Programs define specifics

Sample template:
Data collection overview
Dataset 1:
  • Data type
  • Formats & standards
  • Access & sharing
  • Policies for reuse
  • Archiving & preservation
Dataset 2
Dataset 3
...

PRESENTATION OVERVIEW
I. Preparing the DMP for your proposal
II. Executing the DMP for your awarded project
III. Additional considerations for DMPs

I. PREPARING THE DMP FOR YOUR PROPOSAL
A. Outline your DMP – What datasets will be generated? Where and how will they be managed?
B. Plan for each data object – Data standards, access timelines, preservation
C. Integrate DMP with your broader proposal – Feasibility of management, budget, etc.

PREPARING THE DMP FOR YOUR PROPOSAL
A. Outline your DMP – What datasets will be generated? Where and how will they be managed?
B. Plan for each data object
C. Integrate DMP with your broader proposal

A. OUTLINE YOUR DMP
1. List out the data to be generated – e.g., raw data, processed data, software, physical samples, model output, curricula, ...
2. Determine the value of each data object – What is required to support publications? What is valuable for long-term reuse? What can be discarded?
3. Identify resources for managing data – e.g., university server for short-term storage, disciplinary data repository for long-term preservation of specialized data, general-purpose repository for records

Data collection overview
Dataset 1: Raw sensor records
  • Only needed for short-term analysis
  • Store on university server
Dataset 2: Analysis software
  • Necessary for reproducibility
  • Store in general-purpose repository
Dataset 3: Processed dataset
  • Valuable for long-term reuse
  • Store on disciplinary data repository

WHAT “DATA” SHOULD BE SHARED?
“Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants.”
• Policy does NOT require that all data be shared forever
• Exceptions/accommodations for “privileged or confidential information”
• Specific division data policies and/or program solicitations provide further details

WHERE CAN THE DATA GO?
• “Domain” repository (e.g., Arctic Data Center, Biological and Chemical Oceanography Data Management Office, HydroShare) – This is ideal if your data fit within the discipline of the repository, and it may be required for certain programs.
• Institutional repository (e.g., university, museum) – This may be required by your university or research institution.
• General repository (e.g., Figshare, Zenodo, Dryad) – A good backup option.
• Journal article:
  • Tables and figures – good for very high-level information, but not machine readable
  • Article supplements – not advised: very difficult to access, often behind a paywall

NSF/GEO DIVISION-SPECIFIC REQUIREMENTS
For each Division / Policy #: maximum allowable period for data release, “Data” definition, and data repositories.

AGS: Atmospheric and Geospace Sciences (2018)
• Maximum allowable period for data release: 2 years after project completion
• “Data” definition: Primary data, samples, physical collections and other supporting materials
• Data repositories: None specified

EAR: Earth Sciences (2018)
• Maximum allowable period for data release: 2 years after data collection
• “Data” definition: Observational datasets, derived data products, software, physical collections
• Data repositories: EAR-wide suggested list

OCE: Ocean Sciences (NSF 17-037)
• Maximum allowable period for data release: 2 years after data collection
• “Data” definition: Metadata files, full data sets, derived data products, software, physical collections
• Data repositories: OCE-wide preferred list; Program-specific guidelines

OPP: Office of Polar Programs (NSF 16-055)
• Maximum allowable period for data release: Default – earlier of 2 years or project end; AON – immediately; ASSP – 5 years
• “Data” definition: Not specified
• Data repositories: Arctic – Arctic Data Center (metadata), AON, ASSP; Antarctic – USAP Data Coordination Center

Note that many GEO programs specify additional data requirements through their solicitations.
Directorate for Geosciences Data Policies (https://www.nsf.gov/geo-data-policies/)

PREPARING THE DMP FOR YOUR PROPOSAL
A. Outline your DMP
B. Plan for each data object – Data standards, access timelines, preservation
C. Integrate with your broader proposal

B. PLAN FOR EACH DATA OBJECT IN YOUR DMP
NSF PAPPG (Section II.C.j) – suggested elements for DMP
1. Type of data to be produced (e.g., raw data, analyzed data, model outputs, software, physical samples, curricula)
2. Data and metadata standards to be used (e.g., file formats, disciplinary standards, coding language)
3. Policy for access and sharing (e.g., repository selection, protocol for access and citation, timeline of availability)
4. Policy for reuse and distribution (e.g., licenses, reuse limitations)
5. Plan for archiving and preservation (e.g., forever or a finite period?)
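One way to keep the five suggested elements consistent across datasets is to draft each DMP entry as a small structured record before writing the prose. The Python sketch below is only an illustration: the class name `DataObjectPlan`, its field names, and the rendered labels are assumptions made for this example, not an NSF format.

```python
from dataclasses import dataclass


@dataclass
class DataObjectPlan:
    """One DMP entry; fields mirror the five suggested elements (names are hypothetical)."""
    name: str
    data_type: str      # 1. type of data to be produced
    standards: str      # 2. data and metadata standards
    access: str         # 3. policy for access and sharing
    reuse: str          # 4. policy for reuse and distribution
    preservation: str   # 5. plan for archiving and preservation

    def render(self) -> str:
        """Return the entry as a numbered plain-text block."""
        return "\n".join([
            self.name,
            f"1. Type of data: {self.data_type}",
            f"2. Data standard: {self.standards}",
            f"3. Data access: {self.access}",
            f"4. Data reuse: {self.reuse}",
            f"5. Data preservation: {self.preservation}",
        ])

    def missing_elements(self) -> list:
        """List the field names that are still blank."""
        return [key for key, value in vars(self).items() if not str(value).strip()]


if __name__ == "__main__":
    draft = DataObjectPlan(
        name="Dataset 3: Processed dataset",
        data_type="Processed time series derived from raw sensor measurements",
        standards="Comma-delimited text file (.csv)",
        access="",  # not decided yet
        reuse="CC-BY license",
        preservation="Long-term, in a data repository",
    )
    print(draft.render())
    print("Still to decide:", draft.missing_elements())
```

Checking `missing_elements()` for every planned dataset is a quick way to spot an entry that skips one of the five elements before the DMP is finalized.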

EXAMPLE: PROCESSED DATASET
Project: Integrating sensor data to understand wind-driven sediment transport
[Figure labels: Wind; Sediment flux]

Data collection overview
Dataset 1: Raw sensor records
Dataset 2: Analysis software
Dataset 3: Calibrated sediment flux time series
1. Type of data: Processed sediment flux time series data, derived from raw sensor measurements
2. Data standard: Comma-delimited text file (.csv)
3. Data access: Freely available on Zenodo.org upon publication of accompanying article or within 2 years of collection (whichever is sooner), to be assigned DOI for citation
4. Data reuse: CC-BY license (anyone may reuse with attribution)
5. Data preservation: Long-term (>10 years), responsibility of Zenodo.org
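The access rule in this example ("upon publication of accompanying article or within 2 years of collection, whichever is sooner") is simple date arithmetic. The sketch below shows one possible way to compute the resulting deadline; the function names and the choice to map February 29 to February 28 in non-leap years are assumptions for illustration, and the applicable Division policy or solicitation remains the authority on the actual release period.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Only raised for Feb 29 when the target year is not a leap year.
        return d.replace(year=d.year + years, day=28)


def release_deadline(collected: date,
                     published: Optional[date] = None,
                     max_years: int = 2) -> date:
    """Earlier of the article publication date and collection date + max_years."""
    latest_allowed = add_years(collected, max_years)
    if published is None:
        return latest_allowed
    return min(published, latest_allowed)


if __name__ == "__main__":
    collected = date(2019, 8, 2)
    print(release_deadline(collected))                    # no article yet
    print(release_deadline(collected, date(2020, 1, 15)))  # article comes first
```

Keeping `max_years` as a parameter lets the same helper be reused where a different maximum period applies.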

PREPARING THE DMP FOR YOUR PROPOSAL
A. Outline your DMP
B. Plan for each data object
C. Integrate DMP with your broader proposal – Feasibility of project management, budget, etc.

C. INTEGRATE DMP WITH YOUR BROADER PROPOSAL
• Results of Prior NSF Support – If you have previously been supported on an NSF award, products of your past data sharing should be listed in this section of the Project Description (see PAPPG II.C.2.iii.(e))
• Biographical Sketch (“Biosketch”) – “Products” listed on your biosketch can include data sets and other digital products (see PAPPG II.C.2.f.(c))
• Budget Justification – This should indicate appropriate allocation of time and resources for data management. NOTE: Budget may include data deposit fees (though most NSF-supported repositories do not charge such fees)

PRESENTATION OVERVIEW
I. Preparing the DMP for your proposal
II. Executing the DMP for your awarded project
III. Additional considerations for DMPs

II. EXECUTING THE DMP FOR YOUR AWARDED PROJECT
A. Initiating your project – Coordinate with your project team and stakeholders
B. Award reporting – Document data progress and final data publication
C. Things to consider after award completion – Delayed data release, future proposals

A. INITIATING DATA MANAGEMENT FOR YOUR PROJECT
• Coordinate with your project team – Make sure that project roles for data management are clearly allocated to fulfill the objectives of your DMP
• Connect to target data facilities – Early on, discuss steps for submitting datasets, including fulfillment of data standards. Sharing your DMP may help.
• Get help from your institution – Your library likely offers dedicated research data services to support you on your project.
https://arcticdata.io/submit/

B. AWARD REPORTING FOR YOUR DMP
• Where to report – Report on data activities in the “Products-Websites” section of your annual and final project reports
• Annual Project Report – Provide a status update on progress toward the goals of your DMP, including datasets that have been made publicly available within the last year. If your plans have changed, please explain.
• Final Project Report – List datasets that have been made publicly available. When listing data, provide a citation with a link back to the repository (e.g., via DOI); this will streamline compliance checking by your Program Director
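Listing each dataset with a resolvable DOI link is easy to get slightly wrong when DOIs are copied in different forms. The sketch below shows one possible helper for producing a consistent citation line; the function name, the citation layout, and the placeholder DOI `10.1234/example` are assumptions for illustration, not an NSF-prescribed format. The only external convention relied on is that `https://doi.org/` followed by a DOI resolves to the registered landing page.

```python
def dataset_citation(creators: str, year: int, title: str,
                     repository: str, doi: str) -> str:
    """Build a one-line dataset citation ending in a resolvable DOI link."""
    doi = doi.strip()
    # Accept a bare DOI, a "doi:" prefix, or a full resolver URL.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
    return f"{creators} ({year}). {title}. {repository}. https://doi.org/{doi}"


if __name__ == "__main__":
    # All values below are placeholders, not a real dataset.
    print(dataset_citation(
        creators="Doe, J.",
        year=2019,
        title="Calibrated sediment flux time series",
        repository="Zenodo",
        doi="doi:10.1234/example",
    ))
```

Normalizing the DOI first means the same line can be pasted into annual and final reports regardless of how the repository displayed the identifier.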

C. THINGS TO CONSIDER AFTER AWARD COMPLETION
Certain Division data policies allow a finite period of time between final data collection and subsequent data publication, which may extend beyond the award completion date. In such cases:
• Final Project Report – State plans for future data sharing. It may be possible to pre-populate the repository with a metadata entry before full data release.
• When dataset is published – Inform your managing Program Director
• In subsequent NSF proposals – Refer to published data products in

PRESENTATION OVERVIEW
I. Preparing the DMP for your proposal
II. Executing the DMP for your awarded project
III. Additional considerations for DMPs

III. ADDITIONAL CONSIDERATIONS FOR DMPS
• Think of the audience – Be explicit and keep it succinct
• Keep it organized – I suggest describing each dataset to be generated. DMP template tools can help (e.g., DMPTool, ezDMP)
• Get help – Data management experts at your institution’s library or at relevant data facilities can be great resources
https://dmptool.org/

QUESTIONS?
NOTE: All statements here are my own and do not necessarily reflect official NSF policy. Talk to your Program Director and examine the PAPPG (Proposal & Award Policies & Procedures Guide), Division data policies, and program solicitations for definitive statements on NSF policy.