Open Data & Open Materials: Open Science Workshop (more information available on the last slide)

How to use these slides
• Use the whole slide deck as a 1.5-2 hour presentation
• Use parts of the slides as part of a longer workshop
  – Ideas for workshop concepts can be found in the README folder
• Remix them with your own or other slides, change the layout, refine the content… (everything is permitted by the CC-BY license)

Open Science in the research process
Steps of the research cycle: Formulate hypotheses & analysis plan; Collect data; Analyze data; Interpret & report results; Publish & distribute research output; Replicate results
Open science practices shown along the cycle: Preregistration; Power Analysis; Registered Report (1st phase); Open Lab Notebook; Open Data; Open Materials; Open Analysis Code; Registered Report (2nd phase); Open Access; Replication study
Get all material here: https://osf.io/zjrhu/

What is Data?
• “The recorded factual material commonly retained by and accepted in the scientific community as necessary to validate research findings.” (EPSRC, 2018)
• Anything and everything produced in the course of research. (Lynch, 2014)
Icon from flaticon.com by freepic

What is Data?
Pictures from en.wikipedia.org/w/index.php?curid=36808161; pixabay.com; pxhere.com/en/photo/1192476; pxhere.com/en/photo/914805; pxhere.com/en/photo/595225; commons.wikimedia.org/wiki/File:Gdp_real_growth_rate_2007_CIA_Factbook.PNG

Example: Psychology
https://psyarxiv.com/vhx89

What is Open Data?
“Open data should be available to everyone to access, use, and share.” (GO FAIR, 2018)
Icon from flaticon.com by freepic

Why Open Data?
• Win the trust of other researchers
• Others may derive new insights from your data that you did not think of (secondary use)
• Never again lose unpublished data (e.g., crashed hard drive)
• Comply with the guidelines of funding agencies (e.g., NIH, ERC, Wellcome Trust, DFG, Schweizerischer Nationalfonds); see, for example, the DFG Guidelines for Handling Research Data: www.dfg.de/en/research_funding/proposal_review_decision/applicants/research_data/index.html#anker62237206
Icon from flaticon.com by freepic
Arslan (2018)

How to make your data open
The FAIR principles: Findable, Accessible, Interoperable, Reusable
Good research data management
Icon from flaticon.com by freepic, mynamepong, monkik
FORCE11 (2011)

What is “findable”?
Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services.
https://www.go-fair.org/fair-principles/
Icon from flaticon.com by freepic

Make your data findable
Step 1: Find a home for your data
Data Repository
Pictures from pexels.com by Public Domain Pictures; pxhere.com/de/photo/1032400

Make your data findable
Step 1: Find a home for your data
Many data repositories specialize in a field of research or are bound to a geographic region or university.
Examples: Life Sciences & Medicine, Neuroimaging, Engineering, Archaeology

Exercise: Data Repositories
Use re3data.org to find a data repository from your field that allows you to upload your data.

Make your data findable
Step 1: Find a home for your data
What can you do if no specialized data repository exists for your research topic?
• University data repositories
• General data repositories

Make your data findable
Step 2: Give your data a DOI (or another persistent identifier)
A digital object identifier (DOI) is a unique alphanumeric string assigned by the International DOI Foundation to identify content and provide a persistent link to its location on the internet. (APA, 2018)
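A DOI turns into a persistent link when it is appended to the public resolver at https://doi.org/. The following is a minimal Python sketch of that step, not part of the original slides: the function name `doi_to_url` is hypothetical, and the example DOI is the one cited in the "Advocatus Diaboli" part of this deck.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # public DOI resolver


def doi_to_url(doi: str) -> str:
    """Turn a DOI string into a resolvable, persistent link (hypothetical helper)."""
    doi = doi.strip()
    # Accept a few common ways of writing a DOI.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI starts with the directory indicator "10." and has a prefix/suffix form.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode unusual characters, but keep the prefix/suffix slash.
    return DOI_RESOLVER + quote(doi, safe="/")


print(doi_to_url("doi:10.1056/NEJMe1516564"))
# prints https://doi.org/10.1056/NEJMe1516564
```

Citing the resolver link rather than the repository's own URL keeps the reference valid even if the repository later reorganizes its pages.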

Make your data findable
Step 3: Describe your data with metadata
Picture by Ltakemoto on commons.wikimedia.org/wiki/File:Sortcolumn.JPG

Make your data findable
Step 3: Describe your data with metadata
Metadata: “Data that provides information about other data” (https://www.merriam-webster.com/dictionary/metadata)

Make your data findable
Step 3: Describe your data with metadata
Codebook: questions it should answer
• Variable encoding?
• Dataset dimension?
• Who collected the data?
• Data processing?
• Data version?
• Time of data collection?
• Purpose of data collection?
• Place of data collection?
• License?
• Observation unit?
Icon from flaticon.com by Smashicons
DDI Alliance (2018)
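The questions above can be answered in a small machine-readable record stored next to the data. The sketch below is an illustration only: the key names and every value are hypothetical placeholders, not a metadata standard.

```python
import json

# Study-level metadata answering the questions on the slide.
# Every key name and value below is a hypothetical placeholder.
study_metadata = {
    "purpose": "Demonstration survey for an open science workshop",
    "collected_by": "Workshop participants",
    "time_of_collection": "2018-11",
    "place_of_collection": "online",
    "observation_unit": "individual participant",
    "dimensions": {"rows": 25, "columns": 8},
    "processing": "raw export, no exclusions",
    "version": "1.0",
    "license": "CC-BY-4.0",
    "codebook_file": "codebook.csv",
}

# JSON is plain text, so both humans and programs can read the record.
metadata_json = json.dumps(study_metadata, indent=2, ensure_ascii=False)
print(metadata_json)
```

For real projects, an established metadata standard from your field is preferable to ad hoc key names like these.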

Make your data findable
Step 3: Describe your data with metadata: Codebook
Variable encoding
• Variable name (e.g., GENDER)
• Label (short description, e.g., “Gender of the participant”)
• Variable type (e.g., continuous / discrete)
• Value labels (e.g., 1=male, 2=female, 3=other)
• Missing value code (e.g., 99=missing)
• Assessment method (e.g., question: “With which gender do you identify?”)
DDI Alliance (2018)
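The GENDER example above can be written down as one codebook row and saved as CSV. This is a minimal sketch using only the Python standard library; the column names in `FIELDS` and the helper names `codebook_to_csv` and `decode` are assumptions made for illustration.

```python
import csv
import io

# Column names for the codebook (hypothetical, chosen to mirror the slide).
FIELDS = ["name", "label", "type", "value_labels", "missing_code", "question"]

codebook = [
    {
        "name": "GENDER",
        "label": "Gender of the participant",
        "type": "discrete",
        "value_labels": "1=male; 2=female; 3=other",
        "missing_code": "99",
        "question": "With which gender do you identify?",
    },
]


def codebook_to_csv(entries):
    """Serialize codebook entries to CSV text (plain text, easy to archive)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def decode(entry, raw_value):
    """Translate a raw code into its label; return None for the missing code."""
    if raw_value == entry["missing_code"]:
        return None
    labels = dict(pair.split("=", 1) for pair in entry["value_labels"].split("; "))
    return labels.get(raw_value, raw_value)


codebook_csv = codebook_to_csv(codebook)
print(codebook_csv)
print(decode(codebook[0], "2"))  # prints female
```

Keeping the codebook in the same plain-text format as the data means that anyone who can open the dataset can also open its documentation.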

Make your data findable
Step 3: Describe your data with metadata
Consider existing metadata standards
• www.rd-alliance.org/
Picture from rd-alliance.org/system/files/documents/GCI-Shelter-Ontology.pdf
DDI Alliance (2018)

Make your data findable
Step 3: Describe your data with metadata: Open Lab Notebooks
• Keep track of the data generating process
• Document the experimental process
Picture from pexels.com by Messala Ciulla

Make your data findable
Step 3: Describe your data with metadata: Open Lab Notebooks
Example: food blog
• 1st try: 200 g sugar, 200 g butter, 300 g flour, 1 egg (too sweet!)
• 2nd try: 100 g sugar, 400 g butter, 300 g flour, 1 egg (dough too sticky!)
• 3rd try: 100 g sugar, 200 g butter, 300 g flour, 1 egg (Ugh, the egg was bad!)
• N-th try: 100 g sugar, 200 g butter, 300 g flour, 1 fresh egg (perfect!!!)
Picture from pexels.com by Skitterphoto

Make your data findable
Step 3: Describe your data with metadata: Open Lab Notebooks
What others know if you don't share your notes vs. what you know
• 1st try: 200 g sugar, 200 g butter, 300 g flour, 1 egg (too sweet!)
• 2nd try: 100 g sugar, 400 g butter, 300 g flour, 1 egg (dough too sticky!)
• 3rd try: 100 g sugar, 200 g butter, 300 g flour, 1 egg (Ugh, the egg was bad!)
• N-th try: 100 g sugar, 200 g butter, 300 g flour, 1 fresh egg (perfect!!!)
Picture from pexels.com by Skitterphoto

Make your data findable
Step 3: Describe your data with metadata: Open Lab Notebooks
Picture from test.scinote.net/product/tour/

How to make your data open
The FAIR principles: Findable, Accessible, Interoperable, Reusable
Good research data management
Icon from flaticon.com by freepic, mynamepong, monkik
FORCE11 (2011)

What is “accessible”?
Once the user finds the required data, she/he needs to know how they can be accessed, possibly including authentication and authorisation.
https://www.go-fair.org/fair-principles/
Icon from flaticon.com by freepic

Make your data accessible
“Open data should be available to everyone to access, use, and share.” (GO FAIR, 2018)
What about sensitive data?

Make your data accessible
Examples of sensitive data topics
• sexual life
• political opinion
• crimes
• health
• rare animals
• national security
For more information: https://www.datenschutz-bayern.de/datenschutzreform2018/ueberblick-3.html
Icon from flaticon.com by monkik, Business Dubai, Roundicons, mynamepong, Good Ware
Arslan (2018)

Make your data accessible
What can you do if your data is sensitive?
• Is all your data sensitive? Maybe you can openly share parts of your data.
• Restrict access to your data to a relevant group (e.g., to researchers), and be clear and transparent about why you restrict access and how people can gain access
• Publish only metadata
More information in our Privacy module on https://osf.io/zjrhu/
Arslan (2018), ANDS (2015)
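Sharing only parts of a dataset can be as simple as splitting the columns into a public file and a restricted file. The sketch below assumes hypothetical column names and a hypothetical helper `split_for_sharing`; removing columns does not by itself make data anonymous, so the Privacy module referenced above remains the place to check.

```python
# Hypothetical survey rows; the column names are illustrative only.
rows = [
    {"id": 1, "age_group": "20-29", "political_opinion": "A", "score": 4},
    {"id": 2, "age_group": "30-39", "political_opinion": "B", "score": 2},
]

# Columns judged sensitive for this (made-up) study.
SENSITIVE_COLUMNS = {"political_opinion"}


def split_for_sharing(rows, sensitive):
    """Return (public, restricted) views of the same rows.

    The public view drops the sensitive columns. The restricted view keeps
    only the id plus the sensitive columns, so that approved researchers
    can join the two parts again.
    """
    public = [{k: v for k, v in row.items() if k not in sensitive} for row in rows]
    restricted = [
        {k: v for k, v in row.items() if k == "id" or k in sensitive} for row in rows
    ]
    return public, restricted


public_rows, restricted_rows = split_for_sharing(rows, SENSITIVE_COLUMNS)
print(public_rows)
print(restricted_rows)
```

The public part can go to an open repository, while the restricted part stays behind an access procedure that is described in the metadata.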

Make your data accessible
A prerequisite of accessibility: Ask for consent
Consent form template on https://osf.io/wr2p7/
Open Science Committee @psych.LMU (2017)

How to make your data open
The FAIR principles: Findable, Accessible, Interoperable, Reusable
Good research data management
Icon from flaticon.com by freepic, mynamepong, monkik
FORCE11 (2011)

What is “interoperable”?
The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.
https://www.go-fair.org/fair-principles/
Icon from flaticon.com by mynamepong

Make your data interoperable
Interoperability: Do not use proprietary data formats or software
Format / Software: proprietary vs. open
• Text files. Proprietary: Word (.doc), Pages (.pages). Open: Open Office (.odt), .txt, LaTeX
• Spreadsheets. Proprietary: Excel (.xls), Numbers (.numbers). Open: Open Office (.ods), .csv
• Video. Proprietary: .avi, .wmv, .mov, .qtvr, .rv. Open: .mpg, .mp4
• Audio. Proprietary: .wma, .asf, .ra, .wav. Open: .mp3
• Presentations. Proprietary: PowerPoint (.ppt), Keynote (.key). Open: PDF, HTML
• Statistical Analyses. Proprietary: SPSS (.sav), Matlab (.m), SAS (.sas), Stata (.dta). Open: R, JASP (.jasp), Python
• Experimental Software / Questionnaires. Proprietary: E-Prime, SurveyMonkey, UniPark. Open: PsychoPy, Limesurvey, formr
Schwietering (2018)

Make your data interoperable
Interoperability: Use data formats that can still be read in, say, 50 years
• Ever tried to open a pre-2000 SPSS file?
• Human-readable data > cryptic format
• E.g., csv, XML, json: can in principle be understood by humans.
• Compressed binary Rdata files (although open source and non-proprietary) cannot be understood by humans, only after decoding with the respective software
• Plain text files are less efficient, but more future-proof
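The difference between human-readable and binary formats is easy to see by writing the same small dataset three ways. In this sketch the rows are made up, and Python's `pickle` stands in for a binary format such as Rdata; that substitution is an assumption made for illustration.

```python
import csv
import io
import json
import pickle

# A tiny made-up dataset.
rows = [
    {"participant": 1, "gender": 2, "score": 3.5},
    {"participant": 2, "gender": 1, "score": 4.0},
]

# Plain-text CSV: readable in any text editor.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["participant", "gender", "score"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

# Plain-text JSON: also readable, and keeps the field names with every value.
json_text = json.dumps(rows, indent=2)

# Binary serialization: compact for software, opaque to a human reader.
binary_blob = pickle.dumps(rows)

print(csv_text)
print(json_text)
print(binary_blob[:30])  # raw bytes, not meant to be read by people
```

The two text versions can be inspected without any particular software, which is the property the slide asks for when it talks about files that should still be readable in 50 years.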

How to make your data open
The FAIR principles: Findable, Accessible, Interoperable, Reusable
Good research data management
Icon from flaticon.com by freepic, mynamepong, monkik
FORCE11 (2011)

What is “reusable”?
The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.
https://www.go-fair.org/fair-principles/
Icon from flaticon.com by monkik

Make your data reusable
1. Choose a license
Regulate clearly what others are allowed to do with the data
More information on licenses is in our Open Access module on https://osf.io/ebx48/
Image from en.wikipedia.org/wiki/Creative_Commons_license

Make your data reusable
2. Make your analysis code reproducible
• !!! Always comment your code !!!
• Choose a coherent file / function naming system and coding style
• Consider version control
• Record the packages and software you used
• Write a README with details on the workflow if code fragments need to be combined
British Ecological Society (2017)
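Recording the packages and software you used can be automated at the end of an analysis script. A minimal sketch for Python, using the standard library's `importlib.metadata`; the helper name `environment_report` and the package names passed to it are placeholders for whatever your analysis imports.

```python
import platform
from importlib import metadata


def environment_report(packages):
    """Return one line per component: the interpreter, then each package."""
    lines = [f"python {platform.python_version()}"]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing, so the log is complete.
            version = "not installed"
        lines.append(f"{name} {version}")
    return lines


# Placeholder package names; list the ones your own analysis uses.
report = environment_report(["pip", "numpy"])
print("\n".join(report))
```

Saving this output next to the analysis code (for example in the README) tells later users which versions produced the published results.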

Exercise (1)
Let’s produce some open data!
https://osworkshop1.formr.org

Exercise (2)
Let us make the data open!
• In which data format should we download the data?
• Will the dataset be of use in its current form?
• Upload the data and the codebook to an online repository
• Give the data a DOI
  Note for OSF (http://help.osf.io/m/sharing/l/524208-create-dois):
  – DOIs are on the level of a project, not on the level of single files
  – DOIs for public projects point to the current version of the project. OSF does not support DOI versioning at this time ➙ That means the URL location is persistent, but not the content of the project!
  – The project must be public to create a DOI.
• Choose a license for the data (creativecommons.org/choose/?lang=en). Recommended: CC0 or CC-BY

Congratulations, you earned the Open Data badge!

Open Material
Do potential users of the data already have enough information about this question?
• What do you associate with this picture?
Picture from pxhere.com/de/photo/464180

Congratulations, you earned the Open Materials badge!

Advocatus Diaboli
But... I am afraid of legal consequences.
• Do you know of any lawsuits against researchers who published their data? (I don't)
• If you are working with human subjects, treat them kindly and always give them a consent form
• If you are collecting sensitive data, you can always restrict the access and only make your metadata openly available.
More information in our Privacy module on https://osf.io/zjrhu/

Advocatus Diaboli
But... people might get demotivated to collect their own data. Open data will lead to social loafing.
• Why should people collect new data if they can answer their research questions with already existing data? This would be a waste of resources.
• Remember that publishing your data is already a publication of your own, so other researchers will need to cite you in exchange for using your data. Often you might also be asked to be a coauthor.
For more information on this discussion see doi.org/10.1056/NEJMe1516564

Advocatus Diaboli
But... if people use the same datasets over and over again, this will lead to overfitting.
• Yes, if everyone uses the same dataset, this might be problematic. But the more datasets are published, the less any one specific dataset will be used.
• If someone finds exploratory results in a commonly used dataset, it would be good research practice to replicate the results in a confirmatory study with new data.

Advocatus Diaboli
But... people will misinterpret my data and use it for the wrong purpose.
• This becomes less likely if you publish detailed metadata alongside your data.

Advocatus Diaboli
But... if I publish my data alongside my study, people will find out if I made a mistake.
• If it is an honest mistake, there is no need to be afraid. Everybody makes mistakes, even researchers. As long as you did not engage in scientific misconduct (e.g., faking your data), your credibility will not suffer. Furthermore, self-correction is a vital function of science, and by opening up your data, you enable self-correction.

Advocatus Diaboli
But... it takes SO much time to make my data open!
• Yes, it definitely takes some effort to document the data well enough so that others can use them. But this ensures that you will be able to use them yourself after some years, too. Although sometimes neglected, good research data documentation should be a normal part of the research workflow.

Open Data: Who to ask?
• Ask Open Science Initiative (University of Bielefeld): https://ask-open-science.org/
• Your institutional ethics committee (for questions about sharing potentially sensitive data that arise prior to your study)
• The IT support of your university / lab (for questions about data encryption, secure data storage, etc.)

Open Data: 3 Easy Steps
How you can improve your OS record (almost) without effort
1. Upload (stimulus) material that you create to an open repository
2. Ask to see the data when you are reviewing a paper & recommend open sharing when possible
3. Hand out a standard consent form to your participants before you conduct a study; ask for consent to publish the data.

Open Data: What you learned
• What is (meta)data?
• What are Open Data and Open Materials?
• Why should you make your data open?
• The FAIR principles of open data
• How to find a suitable data repository
• How to assign a DOI to your data
• How to describe your data with a codebook
• What is an Open Lab Notebook?
• What are sensitive data?
• What are interoperable data formats?

Further Resources
• ANDS (2015). The FAIR data principles. Available on ands.org.au/working-with-data/fairdata
• Arslan (2018). Maintaining privacy with open data. Presentation slides available on osf.io/9j27d/
• British Ecological Society (2017). A guide to reproducible code in ecology and evolution. Report available on www.britishecologicalsociety.org/wp-content/uploads/2017/12/guide-to-reproducible-code.pdf
• DDI Alliance (2018). Create a Codebook. Available on ddialliance.org/training/getting-started-new-content/create-a-codebook; an example of a very detailed codebook can be found on goo.gl/e3QNCU
• Open Science Committee @psych.LMU (2017). Einverständniserklärung. Available on osf.io/wr2p7/

References
• APA (2018). What is a digital object identifier, or DOI? Available on apastyle.org/learn/faqs/what-is-doi.aspx
• ANDS (2015). The FAIR data principles. Available on ands.org.au/working-with-data/fairdata
• Arslan (2018). Maintaining privacy with open data. Presentation slides available on osf.io/9j27d/
• British Ecological Society (2017). A guide to reproducible code in ecology and evolution. Report available on www.britishecologicalsociety.org/wp-content/uploads/2017/12/guide-to-reproducible-code.pdf
• DDI Alliance (2018). Create a Codebook. Available on ddialliance.org/training/getting-started-new-content/create-a-codebook; an example of a very detailed codebook can be found on goo.gl/e3QNCU
• EPSRC (2018). EPSRC policy framework on research data: Scope and benefits. Available on epsrc.ukri.org/about/standards/researchdata/scope/
• FORCE11 (2011). FAIR data principles. Available on www.force11.org/group/fairprinciples
• GO FAIR (2018). What is the difference between “FAIR data” and “Open data” if there is any? Available on go-fair.org/faq/ask-question-difference-fair-data-open-data/
• Lynch, H. (2014). Because good research needs good data. Research Data Management for Researchers University of Aberdeen 7th October 2014 Jonathan Rans Digital Curation. Presentation slides available on slideplayer.com/slide/4646268/
• Open Science Committee @psych.LMU (2017). Einverständniserklärung. Available on osf.io/wr2p7/
• Schwietering, H. (2018). Wiki/Freie Standards. Available on wiki.ubuntuusers.de/Freie_Standards/

Credentials
The creation of this workshop material was partially funded by the Berkeley Initiative for Transparency in the Social Sciences (BITSS) Catalyst Program. For more information, please visit www.bitss.org, sign up for the BITSS blog, and follow BITSS on Twitter @UCBITSS. We also kindly thank the LMU GraduateCenter for their support.
These slides were created by Angelika Stefan, Julia Brandt, and Felix Schönbrodt. The work is licensed under a Creative Commons Attribution 4.0 International License. That means you can reuse these slides in your own workshops, remix them, or copy them, as long as you attribute the original creators.