Dataverse at Scholars Portal
Alan Darnell
Director, Scholars Portal

Ontario Council of University Libraries (OCUL)
21 libraries
450,000 FTE
Scholars Portal is the technology support service of OCUL
… Tackle problems that are too big for any one institution

Numeric Data
Published data, highly curated

Geospatial Data
Published data, highly curated

SP Research Data Repository
dataverse.scholarsportal.info
Thank you, IQSS!

Dataverse 3
• Open to any researcher
  – 77 published datasets
  – 472 studies
  – 6,357 files
• Slow but growing uptake from libraries
  – 12 institutional dataverses
• Wide range of file formats
  – WARC files, Twitter feeds, spreadsheets, documents, historical census data, survey files, image files, weather data, etc.

Dataverse 4
Stronger institutional focus
DataCite DOIs
Shibboleth
• Canadian Access Federation
Internationalization (coming soon)
September 2016

A wish list

Ontario Library Research Cloud
Cost-effective long-term storage for digital assets
Utilize existing network and data center facilities in Ontario universities to build a PB-scale distributed storage network using OpenStack object storage (Swift) and commodity storage hardware
5 nodes / 370 TB
Ottawa, Queen's, York, Toronto, Guelph

Wish 1: Big Data
• Support for in-place ingestion of files stored in the cloud
• Storage model that supports block and object services
  – OpenStack Swift & S3
  – ownCloud and Dropbox
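One way to picture in-place ingestion: rather than copying bytes into the repository, the repository records a reference to an object that already lives in cloud storage. The sketch below is purely illustrative. `RemoteFileRef`, `parse_storage_uri`, and the `swift://` / `s3://` URI shapes are assumptions made for this example; they are not part of Dataverse or of any storage service's API.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical set of storage schemes a repository might accept by reference.
SUPPORTED_SCHEMES = {"swift", "s3"}


@dataclass(frozen=True)
class RemoteFileRef:
    """A pointer to a file that stays where it is (hypothetical record)."""
    scheme: str      # "swift" or "s3"
    container: str   # Swift container or S3 bucket
    key: str         # object name within the container
    size_bytes: int  # size reported by the depositor
    checksum: str    # checksum supplied by the depositor, e.g. "md5:<hex digest>"


def parse_storage_uri(uri: str, size_bytes: int, checksum: str) -> RemoteFileRef:
    """Build a RemoteFileRef from a URI without reading the file itself."""
    parts = urlparse(uri)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported storage scheme: {parts.scheme!r}")
    key = parts.path.lstrip("/")
    if not parts.netloc or not key:
        raise ValueError(f"expected scheme://container/object, got {uri!r}")
    return RemoteFileRef(parts.scheme, parts.netloc, key, size_bytes, checksum)
```

A real implementation would also need to verify the checksum against the stored object and handle credentials; both are left out here.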

Dataverse > Archivematica > Swift
[Diagram labels: Dashboard, Storage Service, OLRC]
https://wiki.archivematica.org/Dataverse
Image credit: Julie Allinson, University of York

Wish 2: Digital Preservation
• PREMIS
  – Standard vocabulary to record preservation actions like ingest, transformation
• PRONOM
  – Enhanced file identification
  – DROID, Siegfried, FIDO
• METS
  – Structural representation of complex digital objects
• Native XML Export
  – Concern about JSON as a preservation format

Wish 3: Plugin Architecture
• Allow domain specialists to extend file support through a plugin architecture
  – Encourage and enable community contributions
My New File Format
Methods
  – Describe
  – Thumbnail
  – View
  – Download
  – Explore
  – Transform
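The plugin idea might look something like the sketch below: a base class whose methods mirror the slide's list, plus a registry keyed by file extension. Everything here is hypothetical (`FileFormatPlugin`, `register`, `plugin_for`, and the `.mnf` extension are invented for illustration) and does not describe an existing Dataverse interface. Only three of the six methods are shown; View, Download, and Explore would follow the same pattern.

```python
import os
from abc import ABC, abstractmethod

_REGISTRY = {}  # lower-cased file extension -> plugin instance


class FileFormatPlugin(ABC):
    """Hypothetical base class; method names follow the slide's list."""
    extensions = ()  # e.g. (".csv",)

    @abstractmethod
    def describe(self, path):
        """Return descriptive metadata for the file as a dict."""

    # Optional capabilities: a plugin overrides only what it supports.
    def thumbnail(self, path):
        raise NotImplementedError("thumbnail not supported for this format")

    def transform(self, path, target_format):
        raise NotImplementedError("transform not supported for this format")


def register(plugin_cls):
    """Class decorator: instantiate the plugin and index it by extension."""
    plugin = plugin_cls()
    for ext in plugin_cls.extensions:
        _REGISTRY[ext.lower()] = plugin
    return plugin_cls


def plugin_for(path):
    """Return the plugin that claims this file's extension, or None."""
    ext = os.path.splitext(path)[1].lower()
    return _REGISTRY.get(ext)


@register
class MyNewFileFormat(FileFormatPlugin):
    extensions = (".mnf",)  # made-up extension for the slide's "My New File Format"

    def describe(self, path):
        return {"format": "My New File Format", "file": os.path.basename(path)}
```

With this shape, a community contributor adds support for a format by shipping one decorated class; the core application only ever calls `plugin_for(path)` and the methods on whatever it returns.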

Wish 4: Tools for Analysis
• Jupyter and Zeppelin are interactive web-based tools used for analysis of a wide range of data formats
• Use of Apache Spark as a processing engine for big data
http://jupyter.org
https://zeppelin.apache.org