Big Data to Knowledge Open Educational Resources Development

Big Data to Knowledge Open Educational Resources: Development and Dissemination Considerations
Bjorn Pederson 1, Nicole Vasilevsky 1,2, William Hersh 1, Shannon McWeeney 1, Melissa Haendel 1,2, Jackie Wirz 1,2
1 Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR
2 Ontology Development Group, Library, Oregon Health & Science University, Portland, OR
This work was supported by NIH Grant 1R25GM114820

Objective
• Develop open educational resources (OERs) for use in courses, programs, workshops, and related activities that build a greater understanding of Big Data.

The 4 V’s

Target audiences
• Beginning informatics graduate students
• Established investigators and senior trainees
• Advanced undergraduates exploring future career paths into data science
• Established professionals who need to apply BD2K concepts in their present jobs

Approach
• Modeled after successful Office of the National Coordinator for Health IT (ONC) curriculum materials
  • https://knowledge.amia.org/onc-ntdc
  • (Mohan et al., JAMIA, 2014)
• Value of using the ONC Health IT Curriculum approach includes an open format with both
  • “Out of the box” content
  • Source materials for that content

Modules
• Learning objectives
• Narrated and source slides
• Detailed references
• Data exercises

Modules

Introductory – Completed
1. Biomedical Big Data Science
2. Introduction To Big Data In Biology And Medicine
3. Ethical Issues In Use Of Big Data
4. Clinical Data Standards Related To Big Data
5. Basic Research Data Standards
6. Public Health And Big Data
7. Team Science
8. Secondary Use (Reuse) Of Clinical Data
9. Publication And Peer Review
10. Information Retrieval
11. Data Annotation and Curation
12. Ontologies 101
13. Data Metadata And Provenance
14. Semantic Data Interoperability
15. Choice Of Algorithms And Algorithm Dynamics
16. Visualization And Interpretation
17. Replication, Validation And The Spectrum Of Reproducibility
18. Regulatory Issues In Big Data For Genomics And Health
19. Hosting Data Dissemination And Data Stewardship Workshops
20. Guidelines For Reporting, Publications, And Data Sharing

Introductory – Nearly Completed
21. Version control and identifiers
22. Data and tools landscape

Advanced – In Development
23. Terminology of Biomedical, Clinical, and Translational Research
24. Computing Concepts for Big Data
25. Data modeling
26. Semantic Web data
27. Context-based selection of data
28. Translating the Question
29. Implications of Provenance and Preprocessing
30. Data tells a story
31. Statistical Significance, P-hacking and Multiple-testing
32. Displaying Confidence and Uncertainty

http://skynet.ohsu.edu/bd2k

Challenges
• Scope
  • How to scope generic curricula for different levels of users
• Style
  • How to translate diverse teaching styles into general materials
• Images
  • How to incorporate images and other copyrighted materials into open resources
• Dissemination
  • How to maximize dissemination while protecting intellectual property

Current stats and future directions
• 20 modules completed; 2 nearing completion; 9 needing additional development
• Mapped to Clinical & Translational Science Awards (CTSA) Program biomedical informatics competencies
  • http://skynet.ohsu.edu/bd2k/mapping.html
• Showing a practical use of the material
• Developing evaluation form to solicit feedback
• Will seek IRB approval so we can publish results