Virginia Tech Libraries Next Gen Digital Libraries Platform

- Slides: 29

Virginia Tech Libraries’ Next Gen Digital Libraries Platform
Yinlin Chen and James Tuttle
{ylchen, james.tuttle}@vt.edu
Virginia Tech Libraries

Agenda
• Problem Space
• DLD Projects
• Cloud-native
• Serverless & Microservices
• Virginia Tech Digital Library Platform (VTDLP)
• Architecture Overview
• Outcome
• Next Steps

Problem Space
• Numerous web applications with similar stacks, stretching resources
• Limited in-house capacity to address performance, resilience, and scaling
• Library-specific software requires training or competing for the few experienced library developers

DLD Projects
Projects: FishTraits database, ETDplus, VTechData, GeoData, CollabVT, Fedora, VTDLP, IAWA, ……
Diagram labels: On-premises, Cloud-native (AWS), Servers (VMs, instances), Serverless

Cloud Native
• Entire infrastructure is deployed in the cloud (AWS)
• Platform is composed of a suite of microservices and managed services
• Focus on the business logic and workflow
• Utilize the advantages provided by the cloud: fault tolerance, auto-scaling, update/rollback without downtime, etc.
• Facilitate the development process
• Optimize resource utilization
• Optimize and reduce cost

Resource Usage Optimization and Automation
• Consume only the resources the applications require
• Scale up and down automatically
• Service and function oriented, not server oriented
• Utilize cloud services to help understand applications (CloudWatch, Auto Scaling, Trusted Advisor, etc.)

Serverless
• Does not mean “There are no servers at all”.
• Does mean “Use fully managed services”.
• Focus on application development, not server maintenance

Microservice
• Small applications that do one thing well
• Messaging enabled: communicate with messages
• Decentralized
  – Autonomously developed
  – Independently deployable
  – Can change independently of the other services
  – Scale individually by load
• Built and released with automated processes
• More complex architecture
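The "messaging enabled" and "independently deployable" points can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the two service functions are hypothetical, and the in-process `queue.Queue` only stands in for a managed message queue that a real deployment would use.

```python
import itertools
import json
import queue

# Hypothetical stand-in for a managed message queue between services.
bus = queue.Queue()
_sequence = itertools.count(1)


def minting_service(title: str) -> None:
    """Service A: does one thing (assigns an identifier), then publishes a message."""
    message = {"id": f"demo-{next(_sequence):04d}", "title": title}
    bus.put(json.dumps(message))


def indexing_service(index: dict) -> int:
    """Service B: consumes messages and updates its own index.

    It knows only the message format, not how Service A is implemented,
    so either side can change or scale without touching the other.
    """
    handled = 0
    while not bus.empty():
        message = json.loads(bus.get())
        index[message["id"]] = message["title"].lower()
        handled += 1
    return handled
```

The only contract between the two functions is the JSON message, which is what lets each service be developed and deployed on its own schedule.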

Shop.LEGO.com serverless on AWS
Images from the LEGO presentation at AWS re:Invent 2019

Continuous Integration and Delivery (CI/CD)
Diagram labels: AWS CodePipeline, Source Stage, Build Stage, AWS CodeBuild, Test Stage, Deploy Stage, AWS Elastic Beanstalk, Amazon S3, Amazon EC2
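The stage sequence on this slide (source, build, test, deploy) can be sketched as a tiny runner that executes stages in order and stops at the first failure. This is a simplified illustration of the staged-pipeline idea, not how AWS CodePipeline is implemented; the `run_pipeline` function and the stage callables are hypothetical.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failing stage.

    Returns a log with one entry per stage that was actually executed,
    so later stages (for example Deploy) never run after a failed Test.
    """
    log = []
    for name, action in stages:
        succeeded = action()
        log.append(f"{name}: {'ok' if succeeded else 'failed'}")
        if not succeeded:
            break
    return log
```

For example, passing four stages where the third returns `False` yields log entries for the first three stages only, which is the property that keeps a broken build from being deployed.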

Virginia Tech Digital Library Platform (VTDLP)
Preservation, Data Modeling, Presentation
• New services to Digital Library Platform: ID Minting service, Access Service, Metadata service, …
• Migrating legacy services to Digital Library Platform: IAWA, VTechWork, …
• A Multi-Tenancy Cloud-Native Digital Library Platform (OR 2019)

VTDLP Overview
Diagram labels: Presentation, Preservation, staging, VTechWork, ETDs, IAWA, Images, Serialization Service, Resolution Service, IAWA, BeyondVT, ID Minting Service, Metadata Service, SW Virginia, Others, Batch Metadata Service, Storage, Others, Other Services, Amazon S3, ..., APTrust

AWS Cloud
Diagram labels: Amazon S3, Amazon Elasticsearch Service, Web App, Amazon Route 53, Amazon CloudFront, Amazon API Gateway, AWS Certificate Manager, AWS Lambda, Amazon DynamoDB, Amazon Cognito

Presentation - Multi-Tenant Architecture
Diagram labels: App 1, App 2, App N, Application Hub, DB, Search
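One common way to serve several sites from a shared database and search layer is to resolve the tenant from the request host and filter shared records by a tenant key. The sketch below assumes that approach; the slides do not say how VTDLP resolves tenants, and the hostnames, `Tenant` class, `site_id` field, and sample items are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tenant:
    site_id: str
    title: str


# Hypothetical tenant registry keyed by hostname.
TENANTS = {
    "iawa.example.org": Tenant("iawa", "International Archive of Women in Architecture"),
    "swva.example.org": Tenant("swva", "SW Virginia"),
}

# Hypothetical shared item store: every record carries its tenant key.
ITEMS = [
    {"site_id": "iawa", "title": "Drawing 1"},
    {"site_id": "swva", "title": "Photograph 7"},
    {"site_id": "iawa", "title": "Drawing 2"},
]


def resolve_tenant(host: str) -> Optional[Tenant]:
    """Map a request host (optionally with a port) to a tenant, if known."""
    return TENANTS.get(host.lower().split(":")[0])


def list_items(host: str) -> List[str]:
    """Return only the titles that belong to the tenant behind this host."""
    tenant = resolve_tenant(host)
    if tenant is None:
        return []
    return [item["title"] for item in ITEMS if item["site_id"] == tenant.site_id]
```

With this shape, adding a site means adding a registry entry and tagged records rather than deploying another application stack.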

CI/CD with AWS
Diagram labels (steps 1 to 7): Developers, AWS Amplify, AWS CodeBuild, Amazon S3, AWS Lambda, AWS CloudFormation, Amazon API Gateway

Automatic CI/CD Pipeline

A New Version for each Pull Request

The International Archive of Women in Architecture
• A level 0 compliant image server using Amazon S3 and Amazon CloudFront
• Tile images, manifest JSON files, etc.
• Terabytes of scanned images to be processed
• Scaling IIIF image tiling in the cloud, Code4Lib Journal (to be published)
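A level 0 image service answers only requests for files that were generated ahead of time, which is why plain object storage plus a CDN can serve it. The sketch below enumerates the tile paths that would need to exist for one image. It assumes IIIF Image API 2.x style paths (`region/size/rotation/quality.format` with a `w,` size), a 512-pixel tile size, and the given scale factors; the function name is hypothetical, and the exact paths a tile generator or viewer expects should be checked against the API version in use.

```python
import math
from typing import List, Sequence


def static_tile_paths(identifier: str, width: int, height: int,
                      tile_size: int = 512,
                      scale_factors: Sequence[int] = (1, 2, 4)) -> List[str]:
    """List the pre-generated tile paths for one image at each scale factor."""
    paths = []
    for scale in scale_factors:
        # One tile edge covers this many pixels of the full-resolution image.
        span = tile_size * scale
        for y in range(0, height, span):
            for x in range(0, width, span):
                w = min(span, width - x)
                h = min(span, height - y)
                # A region covering the whole image is written as "full".
                whole = x == 0 and y == 0 and w == width and h == height
                region = "full" if whole else f"{x},{y},{w},{h}"
                out_width = math.ceil(w / scale)
                paths.append(f"{identifier}/{region}/{out_width},/0/default.jpg")
    return paths
```

Each returned path corresponds to one object key in storage, so the number of paths gives a rough sense of how many files a single scan produces.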

Image processing workflow
Diagram labels: AWS Batch, Amazon S3 (raw images), Batch jobs for image sets 1, 2, 3, ..., N, Amazon EC2, Amazon CloudWatch, Rule, AWS Lambda, Amazon Elastic File System, Amazon S3 (tiles and manifest)

Batch job - IIIF_S3
Diagram labels: Docker, AWS Batch (Command, Parameters, Environment variables, vCPUs, Memory), IIIF, Amazon S3 (tiles and manifest), Amazon Elastic File System
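The job settings named on this slide (command, environment variables, vCPUs, memory) can be sketched as one request per image set. The dictionary layout is modeled on the AWS Batch SubmitJob request shape and should be verified against current AWS documentation before use. The queue name, job definition name, script name, and environment variable names are hypothetical, and the sketch only builds the requests without calling AWS.

```python
from typing import Dict, List


def build_batch_job_requests(image_set_prefixes: List[str],
                             job_queue: str = "iiif-tiling-queue",
                             job_definition: str = "iiif-s3-tiling",
                             vcpus: int = 2,
                             memory_mib: int = 4096) -> List[Dict]:
    """Build one job request per image set so the sets can be tiled in parallel."""
    requests = []
    for number, prefix in enumerate(image_set_prefixes, start=1):
        requests.append({
            "jobName": f"iiif-tiles-set-{number}",
            "jobQueue": job_queue,            # hypothetical queue name
            "jobDefinition": job_definition,  # hypothetical job definition name
            "containerOverrides": {
                # Hypothetical entry point inside the Docker image.
                "command": ["python", "create_tiles.py", prefix],
                "environment": [
                    {"name": "SRC_PREFIX", "value": prefix},
                    {"name": "TILE_SIZE", "value": "512"},
                ],
                "resourceRequirements": [
                    {"type": "VCPU", "value": str(vcpus)},
                    {"type": "MEMORY", "value": str(memory_mib)},
                ],
            },
        })
    return requests
```

Keeping one job per image set means a failed set can be resubmitted on its own, and the compute environment can scale with the number of queued jobs.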

Automatic Data Process Pipeline

Microservice – Using AWS Lambda
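As an illustration of a single-purpose service running as a function, here is a minimal sketch of an ID minting handler. It assumes the function sits behind Amazon API Gateway with a proxy-style event that carries `httpMethod`, and that identifiers are short random hex strings; the slides do not describe the platform's actual minting scheme or event format, so both are assumptions.

```python
import json
import uuid


def lambda_handler(event, context):
    """Mint a new identifier for POST requests and reject everything else."""
    if event.get("httpMethod") != "POST":
        return {
            "statusCode": 405,
            "body": json.dumps({"error": "only POST is supported"}),
        }
    # Assumption: an 8-character random hex string is an acceptable identifier.
    minted_id = uuid.uuid4().hex[:8]
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"id": minted_id}),
    }
```

A production service would also persist each minted identifier, for example in a managed database, so that it can guarantee uniqueness and support later resolution.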

Metadata Transformation Using AWS Lambda
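A metadata transformation of this kind might look like the following sketch, which turns CSV rows into JSON-ready records. The column names, the target field names in `FIELD_MAP`, the `||` separator for multiple creators, and the `event["csv"]` input shape are all hypothetical; the slide shows only that the transformation runs in AWS Lambda.

```python
import csv
import io
from typing import Dict, List

# Hypothetical mapping from spreadsheet columns to target metadata fields.
FIELD_MAP = {"Title": "title", "Creator": "creator", "Date": "date"}


def transform_csv(csv_text: str) -> List[Dict]:
    """Convert CSV rows into records, dropping empty fields."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = {
            target: row[source].strip()
            for source, target in FIELD_MAP.items()
            if row.get(source)
        }
        # Assumption: multiple creators are separated by "||" in one cell.
        if "creator" in record:
            record["creator"] = [name.strip() for name in record["creator"].split("||")]
        records.append(record)
    return records


def lambda_handler(event, context):
    """Hypothetical entry point: expects the CSV text under event['csv']."""
    records = transform_csv(event["csv"])
    return {"count": len(records), "records": records}
```

Keeping the transformation in a pure function separate from the handler makes it easy to test locally before deploying.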

Outcomes
• Developer/DevOps candidate pool much larger
• Automated compliance with digital preservation best practices
• Benefits of tiered storage for long-term data archiving
• Performance improvements even without optimization

Performance improvement before optimization

Site performance: Collection page, Search page

Demo
https://iawa-dev.cloud.lib.vt.edu/

Next Steps
• Docker and Kubernetes for reproducible builds and orchestration between cloud and local
• Exploring local infrastructure changes, e.g. Ceph storage
• Benchmarking and cost optimization of cloud services
• Refactoring of legacy applications
• AWS CloudFormation or Terraform for everything

Q&A Thank You!