Emulation as-a-Service Infrastructure: Euan Cochrane, Digital Preservation Manager

Emulation as-a-Service Infrastructure
Euan Cochrane, Digital Preservation Manager, Yale University Library
9/18/18 PresQT Meeting

Our Team

○ Euan Cochrane: Principal Investigator
○ Seth Anderson: Program Manager
○ Klaus Rechert & Oleg Stobbe (OpenSLX): Technical Architecture and Development
○ YUL Library IT: UI Development
○ Jessica Meyerson (Educopia/SPN): Communications/Outreach
○ Kat Thornton (Data Current/WikiDP): Semantic Architect

Partners
○ Funded by the Andrew W. Mellon Foundation and the Alfred P. Sloan Foundation ($1,000,000 each over 30 months)
○ Hosted by Yale University Library
○ Collaborating with the Software Preservation Network (SPN)
○ Using and contributing metadata to Wikidata
○ Software development by OpenSLX (EaaS developers) and Yale Library
○ Initial node-host partners include:
  ○ Carnegie Mellon University (Eric Kaltman, Node Lead)
  ○ Notre Dame University (Don Brower, Node Lead)
  ○ Stanford University (Michael Olson, Node Lead)
  ○ University of California - San Diego (Sibyl Schafer, Node Lead)
  ○ University of Virginia (Robert German, Node Lead)

EaaSI Advisors
○ Stakeholder organizations (potential future node-hosts):
  1. The Harvard/Smithsonian Center for Astrophysics
  2. Digital Preservation Coalition
  3. MetaArchive
  4. BitCurator Consortium
  5. National Library of France (BnF)
  6. Open Preservation Foundation
  7. Texas Digital Library
  8. Library of Congress
○ Group is growing and open for new members.
○ Quarterly virtual meetings

EaaSI points of difference for PresQT
○ EaaSI is establishing basic/fundamental, shared infrastructure targeted at many generic use-cases
○ We’re aiming to enable other services to be built on top of EaaSI
○ EaaSI is focused on the (very) long term
  ○ From the late twentieth century through to future generations

Emulation as a Service (EaaS)

EaaS simplifies access to, and provides a generic API for, various emulators and KVM
(Slide image label: Universal Amiga Emulator)

EaaS provides web-based access to emulators and KVM preconfigured to run numerous operating systems
Customization of emulator/KVM parameters is also available.

EaaS: Features of note
○ Seamlessly move environments/containers to emulated hardware from virtualized or physical hardware as technology ages
○ Sophisticated virtual hard disk management
  ○ Dynamically translates between HDD image formats (vdi, vmdk, raw, qcow2, etc.)
  ○ Supports linked disk images/environments that are dependent on parent images
○ Can print to PDF from any environment with a PostScript printer driver (universal PDF conversion)
○ Handles/persistent identifiers available for configured environments
○ Internet access within configured environments is an optional feature
○ Environments can be paused and resumed on demand
○ Generic Application Programming Interface
  ○ Interact with emulation session and disk images
  ○ Upload/attach content
  ○ Download changed files
○ Networked emulated environments and environment isolation and access (e.g. via proxy)
○ Documentation at: http://emulation.solutions/
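
The two disk-management features (format translation and linked images that depend on a parent) can be illustrated outside EaaS with the stock qemu-img tool. This is a sketch only: the slides do not say which tool EaaS uses internally, and the file names and helper function names below are hypothetical. The helpers only build the command lines, so they can be inspected before anything is run.

```python
"""Sketch: build qemu-img command lines for format translation and linked images.

Assumption: qemu-img is used purely as an illustration; the slides do not state
how EaaS implements these features. File names are hypothetical.
"""
import shlex

# qemu-img format names for the formats listed on the slide.
FORMATS = {"vdi", "vmdk", "raw", "qcow2"}


def convert_cmd(src, src_fmt, dst, dst_fmt):
    """Command that translates a disk image from one format to another."""
    for fmt in (src_fmt, dst_fmt):
        if fmt not in FORMATS:
            raise ValueError(f"unsupported format: {fmt}")
    return ["qemu-img", "convert", "-f", src_fmt, "-O", dst_fmt, src, dst]


def linked_image_cmd(parent, child, parent_fmt="qcow2"):
    """Command that creates a qcow2 image storing only changes relative to parent."""
    if parent_fmt not in FORMATS:
        raise ValueError(f"unsupported format: {parent_fmt}")
    return ["qemu-img", "create", "-f", "qcow2",
            "-b", parent, "-F", parent_fmt, child]


if __name__ == "__main__":
    # Print the commands instead of running them.
    print(shlex.join(convert_cmd("winxp.vmdk", "vmdk", "winxp.qcow2", "qcow2")))
    print(shlex.join(linked_image_cmd("winxp.qcow2", "winxp-sas.qcow2")))
```

The child image created by the second command is unusable without its parent, which is the dependency the slide refers to.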

A note on Disk Images and Derivatives
As David Rosenthal always says: digital preservation is primarily an economic problem

Disk Image derivatives provide economies of scale and save a lot of money over the long term
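
A back-of-the-envelope calculation shows where the economies of scale come from. The numbers are hypothetical and do not come from the slides: 100 environments built on one 10 GB base image, each adding 0.5 GB of changes.

```python
"""Sketch: storage needed for full image copies vs. derivatives of one base image.

All sizes and counts are hypothetical; none come from the slides.
"""


def full_copy_storage_gb(n_envs, base_gb, delta_gb):
    """Every environment is stored as a complete, independent disk image."""
    return n_envs * (base_gb + delta_gb)


def derivative_storage_gb(n_envs, base_gb, delta_gb):
    """One shared base image plus a small derivative (delta) per environment."""
    return base_gb + n_envs * delta_gb


if __name__ == "__main__":
    n, base, delta = 100, 10.0, 0.5  # hypothetical values
    print(full_copy_storage_gb(n, base, delta))   # 1050.0
    print(derivative_storage_gb(n, base, delta))  # 60.0
```

Under these assumptions the derivative approach stores 60 GB instead of 1050 GB, and the gap widens as more environments share the same base.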

Derivatives: example

EaaSI Program Goals
1. Establishing a network of nodes running Emulation as a Service
2. Enabling secure sharing of pre-configured software environments (disk images + metadata) and their derivatives between nodes in the network
3. Seeding the network with 3000+ pre-configured software environments running legacy software applications, e.g.
  ○ AutoCAD
  ○ SPSS
  ○ STATA
  ○ Ubuntu + Docker V.x
  ○ SUSE + ReproServer/ReproZIP
  ○ Windows XP + SAS V.x
4. Creating extensive metadata and sharing much of it via Wikidata.org
5. Creating a plan for the long-term sustainability of the EaaSI network and EaaSI program outputs

EaaSI Program Goals
6. Building an API driven by that metadata to enable programmatic access to the pre-configured environments: a “Universal Virtual Interactor”
  ○ E.g. submit a file and get it back in your browser for interaction in the “original” software
7. Building interfaces on top of that infrastructure for various use-cases including:
  a. Sharing pre-configured environments that include published CD-ROMs for use by libraries that own them
  b. Virtual Reading Room functionality for sharing custom environments containing restricted data with patrons for limited time periods
  c. Scientific Software and Reproducibility?
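
The “submit a file, get the original software” idea in goal 6 amounts to a metadata lookup from a file's format to an environment that can open it. The sketch below is a hypothetical illustration, not the EaaSI API: the registry contents, the environment identifiers and the function name are all invented, and a real service would presumably identify formats more robustly than by file extension.

```python
"""Sketch: choose a pre-configured environment for a submitted file.

Assumptions: the registry, the environment identifiers and the use of file
extensions for format identification are hypothetical, not taken from EaaSI.
"""
from pathlib import PurePath

# Hypothetical metadata: file extension -> pre-configured environment identifier.
REGISTRY = {
    ".sav": "env-spss",
    ".dta": "env-stata",
    ".dwg": "env-autocad",
    ".sas7bdat": "env-winxp-sas",
}


def pick_environment(filename, registry=REGISTRY):
    """Return the environment identifier for a file, or None if none is known."""
    ext = PurePath(filename).suffix.lower()
    return registry.get(ext)


if __name__ == "__main__":
    for name in ("survey.SAV", "plan.dwg", "notes.txt"):
        print(name, "->", pick_environment(name))
```

Keeping the mapping in data rather than in code reflects the slide's point that the API is driven by metadata: adding support for a new format means adding a record, not changing the service.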

Scientific Software and Reproducibility
Known options:
○ EaaSI Custom Interface:
  ○ A “view” on configured environments relevant to scientific use-cases, e.g. scientific software in an emulated computer
  ○ Workflows customized for working with scientific software environments
○ CiTAR (University of Freiburg, EaaS-based)
  ○ Long-term container preservation via normalization to Open Container Format
  ○ Data input/output APIs
  ○ Packaging containers, input and output data
  ○ Persistent IDs
  ○ Networked environments
Unknown:
○ New services built on the EaaSI network?

Demos (?) available

A Very Special Thanks to our Funders...

Thank you!
You can find me at
○ https://twitter.com/euanc
○ euan.cochrane@yale.edu
Learn more at
○ www.softwarepreservationnetwork.org/eaasi
○ @SoftPresNetwork
○ #joinSPN