Research Data Management From A Publisher’s Perspective Presentation
Research Data Management From A Publisher’s Perspective. Presentation for RDMI Meeting, Industry Panel, September 14, 2017. Anita de Waard, a.dewaard@elsevier.com, VP Research Data Management, RDS, Elsevier
Outline: 1. How has your work in data management enabled research and discovery? 2. What key areas of success has your organization achieved in delivering research data management solutions? 3. What are the greatest challenges you are facing in developing solutions that meet the needs of research data management?
10 Properties of Highly Effective Research Data: 1. Stored; 2. Preserved; 3. Accessible; 4. Discoverable; 5. Citable; 6. Comprehensible; 7. Trusted; 8. Reproducible; 9. Re-usable; 10. Integrate upstream and downstream: make metadata to serve use. [Figure: the properties are stacked in tiers labelled Save, Share, Use.]
10 Properties of Highly Effective Research Data, with tools: [Figure: the same ten properties and Save / Share / Use tiers, annotated with Hivebench Lab Notebook, Mendeley Data Repository, DataSearch, Research Data Guidelines for Journals, and Data Journals: Research Elements.]
Research Data Guidelines for Journals (https://www.elsevier.com/authors/author-services/research-data/data-guidelines):
Option A: Research data deposit and citation. You are encouraged to:
• Deposit your research data in a relevant data repository
• Cite this dataset in your article
Option B: Research data deposit, citation and linking (or Research Data Availability Statement). You are encouraged to:
• Deposit your research data in a relevant data repository
• Cite and link to this dataset in your article
• If this is not possible, make a statement explaining why research data cannot be shared
Option C: Research data deposit, citation and linking (or Research Data Availability Statement). You are required to:
• Deposit your research data in a relevant data repository
• Cite and link to this dataset in your article
• If this is not possible, make a statement explaining why research data cannot be shared
Option D: Research data deposit, citation and linking. You are required to:
• Deposit your research data in a relevant data repository
• Cite and link to this dataset in your article
Option E: Research data deposit, citation and linking (or Research Data Availability Statement). You are required to:
• Deposit your research data in a relevant data repository
• Cite and link to this dataset in your article
• If this is not possible, make a statement explaining why research data cannot be shared
• Peer reviewers are asked to review the data prior to publication
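The five options differ along only a few axes: encouraged versus required, citation only versus citation plus linking, whether an availability statement is an accepted fallback, and whether reviewers look at the data. A minimal Python sketch of that structure follows; the class name, field names, and the `submission_ok` helper are illustrative assumptions, not part of any Elsevier system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPolicy:
    """One journal data-guideline option (field names are hypothetical)."""
    option: str
    required: bool             # False = "encouraged", True = "required"
    linking: bool              # article must link to the dataset, not only cite it
    statement_allowed: bool    # availability statement accepted if data cannot be shared
    peer_review_of_data: bool  # reviewers are asked to review the data


# Encoded directly from the option descriptions on the slide.
POLICIES = {
    "A": DataPolicy("A", False, False, False, False),
    "B": DataPolicy("B", False, True, True, False),
    "C": DataPolicy("C", True, True, True, False),
    "D": DataPolicy("D", True, True, False, False),
    "E": DataPolicy("E", True, True, True, True),
}


def submission_ok(option, deposited, cited, linked, has_statement):
    """Return True if a submission meets the mandatory parts of an option."""
    policy = POLICIES[option]
    if not policy.required:
        return True  # options A and B only encourage, so nothing can fail
    shared = deposited and cited and (linked or not policy.linking)
    return shared or (policy.statement_allowed and has_statement)
```

Under this encoding, a submission with no deposit but an availability statement passes option C and fails option D, which is the only practical difference between the two.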
Hivebench: Store protocols in an Electronic Lab Notebook. Keep a collection of protocols online. Edit, export, share. https://www.hivebench.com/
Hivebench: Run experiments from this Lab Notebook. Base experiments on saved protocols. Save and export outputs. Edit, export, share. https://www.hivebench.com/
Mendeley Data: Export results to a trusted data repository. Describe how the experiment can be reproduced. Create a DOI for citation. Store up to 5 GB of data in many formats. Link back to protocols. Keep track of versions of the dataset. https://data.mendeley.com/
DataSearch: Search over a collection of repositories. https://datasearch.elsevier.com
Data Journals: E.g. MethodsX. Journal focuses on method reproduction. Link to protocols. Link to data. Fully OA. https://www.journals.elsevier.com/methodsx
Currently In Development: Mendeley Data Management Platform: Integration with Existing Standards/Systems at Institution
Underway: “Basket of Metrics” & Elsevier Tracking Solutions. Each goal is listed with its metric and how to measure it.
More data is saved:
1. Stored (i.e. safely available in a long-term repository). Metric: Nr of datasets stored in long-term storage. How to measure: MD, Pure; Plum indexes Figshare, Dryad, MD and is working on Dataverse.
2. Published (i.e. long-term preserved, accessible via web, have a GUID, citeable, with proper metadata). Metric: Nr of datasets published, in some form. How to measure: Scholix, ScienceDirect/Scopus.
3. Linked, to articles or other datasets. Metric: Nr of datasets linked to articles. How to measure: Scholix, Scopus.
4. Validated, by a reviewer/curated. Metric: Nr of datasets in curated databases/peer reviewed in data articles. How to measure: ScienceDirect, DataSearch (for curated Dbses).
More data is seen and used:
5. Discovered: found by users. Metric: Nr of datasets viewed in databases/websites/search engines. How to measure: DataSearch, metrics from other search engines/repositories.
6. Identified: resolved through a GUID broker. Metric: DOI is resolved. How to measure: DataCite has DOI resolution: made available?
7. Mentioned. Metric: social media and news mentions. How to measure: Plum and Newsflo.
8. Cited: formal citations of data. Metric: Nr of datasets cited in articles. How to measure: Scopus.
9. Downloaded: distinct downloads. Metric: downloaded from repositories. How to measure: downloads from MD, access data from Figshare/Dryad.
10. Reused: dataset is used for new research. Metric: mention of usage in article or other dataset. How to measure: SD, access to other data repositories.
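Most of the metrics above are simple counts over dataset records. The sketch below shows how a few of them could be tallied; it assumes a hypothetical record format (the keys `stored`, `published`, `linked`, `cited`, `downloads`) and invented sample records, and does not reflect how Scholix, Scopus, or Plum actually expose their data.

```python
from collections import Counter

# Hypothetical dataset records; real sources would supply these fields.
DATASETS = [
    {"id": "ds1", "stored": True, "published": True, "linked": True,
     "cited": 2, "downloads": 40},
    {"id": "ds2", "stored": True, "published": True, "linked": False,
     "cited": 0, "downloads": 3},
    {"id": "ds3", "stored": True, "published": False, "linked": False,
     "cited": 0, "downloads": 0},
]


def basket(datasets):
    """Count how many datasets satisfy each of five of the ten goals."""
    counts = Counter()
    for record in datasets:
        counts["stored"] += bool(record.get("stored"))
        counts["published"] += bool(record.get("published"))
        counts["linked"] += bool(record.get("linked"))
        counts["cited"] += record.get("cited", 0) > 0
        counts["downloaded"] += record.get("downloads", 0) > 0
    return dict(counts)


print(basket(DATASETS))
```

Counting datasets rather than events (one cited dataset, not two citations) matches the "Nr of datasets" wording used for most of the metrics.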
We need baselines! Example: University of Manchester. Data sharing = 19% (well above the average of 5.5%). 886 random articles checked, shown against the nine properties (1. Stored, 2. Preserved, 3. Accessible, 4. Discoverable, 5. Citable, 6. Comprehensible, 7. Trusted, 8. Reproducible, 9. Re-usable):
• 570 articles without any supplementary/associated data (64%)
• 151 articles with supplementary docs (but not data)
• 2 data journal articles (0.2%)
• 86 articles with associated data in repositories (9.7%), of which 81 articles linked to associated data in a repository (9.1%) and 5 articles with no link to a repository (0.6%)
• 79 articles with supplementary data (8.9%)
[Figure: bar chart, "Number of articles with linked data deposited in a data repository for 2015-2017", n=81, one bar per repository plus a total; legend: links found manually, links found through Scholix; total links 86 (9.8%) = 81 + 5. Individual repository labels did not survive extraction.]
Courtesy Sean Husen and Helena Cousijn (Elsevier)
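The percentages on this slide can be rechecked from the raw counts. The short Python sketch below does that; the dictionary labels are paraphrased from the slide, and the reading that the 19% headline is the sum of data journal articles, repository data, and supplementary data is an assumption rather than something the slide states.

```python
TOTAL = 886  # random articles checked

COUNTS = {
    "no supplementary or associated data": 570,
    "supplementary docs but not data": 151,
    "data journal articles": 2,
    "associated data in repositories": 86,
    "linked to data in a repository": 81,
    "no link to a repository": 5,
    "supplementary data": 79,
}


def share(count, total=TOTAL):
    """Percentage of the sample, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: share(n) for label, n in COUNTS.items()}

# Assumption: "data sharing" = data journal + repository + supplementary data.
data_sharing = share(2 + 86 + 79)

for label, pct in shares.items():
    print(f"{label}: {pct}%")
print(f"data sharing (assumed definition): {data_sharing}%")
```

With these counts the repository share comes out at 9.7%, which agrees with the bullet list rather than with the 9.8% printed on the chart.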
Open Data Report Reveals Some Challenges: Data sharing survey (with 1167 respondents):
• Although 69% of respondents found that sharing data was very important in their field
• And 73% wanted to have access to other people’s data,
• Only 37% believed there was credit in doing so,
• And only 25% felt they had adequate training to properly share their data with others.
The main barriers for sharing data were:
• privacy concerns,
• ethical issues,
• intellectual property rights issues.
Furthermore:
• Mandates from publishers or funding agencies were largely not seen as a driving force
=> Gap between desire and practice concerning data sharing.
https://data.mendeley.com/datasets/bwrnfb4bvh/1
Further Challenge: Who Do We Talk to At An Institution?
Further Challenge: How do you ‘Play Well With Others’ when there are so many others (e.g. 47 tools on NDS Labs Workbench) and they are mostly ‘academic’ (i.e. OS, constantly renewed, etc.)?
Summary:
1. How has your work in data management enabled research and discovery?
• Providing a suite of tools and standards that encourage open, integrated RDM solutions.
2. What key areas of success has your organization achieved in delivering research data management solutions?
• Tools are used (ergo: useful);
• Developing institutional solutions and data metrics with partners.
3. What are the greatest challenges you are facing in developing solutions that meet the needs of research data management?
• No great urgency for researchers, inadequate knowledge of possibilities;
• Distributed responsibility/decision-making processes for RDM;
• Plethora of tools to integrate with;
• Difficult to see what the market is (OS, completely? Academic/government?)
> How can a publisher play a role?
Feel free to email me with any questions! a.dewaard@elsevier.com