METS at UC Berkeley: Generating METS Objects
Background
• Kinds of materials:
  – Primarily imaged content & TEI-encoded content
    • Archival materials: manuscripts and pictorial collections
    • Oral histories
• Kinds of metadata:
  – Structural metadata: physical structure
  – Descriptive metadata
  – Basic technical metadata about digital files and how they were produced
Tools for Producing METS Objects
• GenDB
  – Gathers structural, descriptive and technical metadata
• GenX
  – Generates METS objects from GenDB
GenDB
• Consists of:
  – Relational database (currently SQL Server)
  – Locally developed software for gathering metadata and facilitating digital processing
GenDB Database: Structural Metadata
Structural Md Table
  Object 1
    Div 1 (root)
    Div 2 (parent = div 1)
    Div 3 (parent = div 1)
    …
  Object 2
    Div 1 (root)
    Div 2 (parent = div 1)
    Div 3 (parent = div 2)
    Div 4 (parent = div 2)
Resulting hierarchy for Object 2
  Div 1
    Div 2
      Div 3
      Div 4
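The parent pointers in the structural metadata table are enough to rebuild each object's hierarchy. The following is a minimal sketch of that idea, not the GenDB implementation: the row layout `(object_id, div_id, parent_div_id)` and the function name `build_tree` are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical rows mirroring "Object 2" on the slide:
# (object_id, div_id, parent_div_id); None marks the root div.
ROWS = [
    ("Object 2", 1, None),
    ("Object 2", 2, 1),
    ("Object 2", 3, 2),
    ("Object 2", 4, 2),
]


def build_tree(rows, object_id):
    """Rebuild one object's nested div tree from flat parent-pointer rows."""
    children = defaultdict(list)
    root = None
    for obj, div_id, parent_id in rows:
        if obj != object_id:
            continue
        if parent_id is None:
            root = div_id
        else:
            children[parent_id].append(div_id)
    if root is None:
        raise ValueError(f"no root div found for {object_id}")

    def expand(div_id):
        # Recurse into the children recorded for this div (empty list for leaves).
        return {"div": div_id, "children": [expand(c) for c in children[div_id]]}

    return expand(root)


tree = build_tree(ROWS, "Object 2")
```

Storing only the parent id keeps the table flat while still allowing divs to nest to any depth.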
GenDB Database Structure: Descriptive Metadata
Structural Md Table
  Object 1
    Div 1: Core Desc Md
    Div 2: Core Desc Md
    Div 3: Core Desc Md
  Object 2
    Div 1, Div 2, Div 3, Div 4
Name Table
  Name 1, Name 2, Name 3
Note Tables
  Note 1, Note 2, Note 3
(Diagram: Structural Md Table divs with their Core Desc Md, shown alongside the Name Table and Note Tables.)
GenDB Database Structure: Content File/Technical Md
Structural Md Table
  Object 1
    Div 2
    Div 3
Master Image Table
  Mstr 1: Technical Md
  Mstr 2: Technical Md
Derivative Image Table
  Drv 1: Technical Md
  Drv 2: Technical Md
  Drv 3: Technical Md
  Drv 4: Technical Md
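One way to picture the three tables on this slide is as a small relational schema. The sketch below uses SQLite in place of SQL Server so that it runs anywhere. All table names, column names, and sample rows are hypothetical, and the choice to link each derivative to its master (rather than directly to a div) is an assumption the slides do not confirm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE struct_div (
    div_id     INTEGER PRIMARY KEY,
    object_id  INTEGER NOT NULL,
    parent_id  INTEGER REFERENCES struct_div(div_id)  -- NULL for the root div
);
CREATE TABLE master_image (
    master_id  INTEGER PRIMARY KEY,
    div_id     INTEGER NOT NULL REFERENCES struct_div(div_id),
    file_name  TEXT NOT NULL,
    width_px   INTEGER,          -- example technical metadata columns
    height_px  INTEGER
);
CREATE TABLE derivative_image (
    derivative_id INTEGER PRIMARY KEY,
    master_id     INTEGER NOT NULL REFERENCES master_image(master_id),
    file_name     TEXT NOT NULL,
    width_px      INTEGER,
    height_px     INTEGER
);
""")

# Hypothetical sample data: one object, two content divs, one master per div.
conn.executemany("INSERT INTO struct_div VALUES (?, ?, ?)",
                 [(1, 1, None), (2, 1, 1), (3, 1, 1)])
conn.executemany("INSERT INTO master_image VALUES (?, ?, ?, ?, ?)",
                 [(1, 2, "m1.tif", 4000, 6000), (2, 3, "m2.tif", 4000, 6000)])
conn.executemany("INSERT INTO derivative_image VALUES (?, ?, ?, ?, ?)",
                 [(1, 1, "d1.jpg", 800, 1200), (2, 1, "d2.gif", 100, 150),
                  (3, 2, "d3.jpg", 800, 1200), (4, 2, "d4.gif", 100, 150)])


def files_for_div(connection, div_id):
    """Return (master file, derivative file) pairs attached to one div."""
    cursor = connection.execute(
        """SELECT m.file_name, d.file_name
             FROM master_image AS m
             JOIN derivative_image AS d ON d.master_id = m.master_id
            WHERE m.div_id = ?
            ORDER BY d.derivative_id""",
        (div_id,),
    )
    return cursor.fetchall()
```

For example, `files_for_div(conn, 2)` walks from a div to its master image and on to that master's derivatives.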
Populating the Database Tables
• Web interface: manual input of structural and descriptive metadata
• Digitization Management modules
  – Generate work orders to guide the digitization process
  – Import content file information and technical metadata coming out of the digitization process
• Batch loader: batch input based on TEI encodings, legacy metadata
Web Interface: WebGenDB
(Diagram components: Web Interface, Java Servlet, Java Server, XML Config Files, SQL Server Database; connections labeled RMI and JDBC.)
Digitization Management Modules
(Diagram components: Vendor, Web Interface, Imaging/Transcription Work Orders, Technical MD Spreadsheets, Java Servlet, Java Server, SQL Server Database.)
Batch Loader
(Diagram components: TEI Docs, XSLT, XML Batch Load File, Java Batch Loader, Web Interface, Java Servlet, Java Server, SQL Server Database.)
WebGenDB: The concepts that drove the design
• Shielding user from METS complexity
• Highly configurable
• Unicode support
• Access driven by login privileges
• Use of Open Source software and components
• Distributed approach
XML Configuration Files
• Three levels
  – Elements common to all projects
  – Elements common to all screens in a project
  – Elements specific to a screen in a project
• Define fields common to all projects
• Define fields used in a specific project
• Define screens by project & object type
Relation among XML files
AlProjects.xml
  Proj1.xml
    ObjectType1.xml
    ObjectType2.xml
  Proj2.xml
    ObjectType1.xml
    ObjectType2.xml
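The three configuration levels can be read as layers, from the all-projects file through the project file to the object-type (screen) file. The sketch below shows one plausible way to resolve them. The rule that a more specific level overrides a more general one is an assumption, as are the `<Config>` wrapper element, the sample field values, and the function names.

```python
import xml.etree.ElementTree as ET

# Hypothetical file contents for the three levels.
ALL_PROJECTS = """<Config>
  <Field><name>Title</name><type>text</type><label>Title</label><size>60</size></Field>
</Config>"""
PROJECT = """<Config>
  <Field><name>Image</name><type>checkbox</type><label>Image</label><size>1</size></Field>
</Config>"""
SCREEN = """<Config>
  <Field><name>Title</name><type>text</type><label>Work order title</label><size>40</size></Field>
</Config>"""


def read_fields(xml_text):
    """Map each field name to a dict of its child element values."""
    root = ET.fromstring(xml_text)
    fields = {}
    for field in root.iter("Field"):
        props = {child.tag: (child.text or "").strip() for child in field}
        fields[props["name"]] = props
    return fields


def resolve(*levels):
    """Merge levels in order; later (more specific) levels win on name clashes."""
    merged = {}
    for xml_text in levels:
        merged.update(read_fields(xml_text))
    return merged


fields = resolve(ALL_PROJECTS, PROJECT, SCREEN)
```

Here the screen-level `Title` definition replaces the all-projects one, while `Image` is inherited from the project level unchanged.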
Project XML file example
<ObjectType>
  <name>workorder</name>
  <fileLocation>
    /data/_w/GenDB/WEB-INF/classes/edu/berkeley/library/propertyFiles/CalCultureWorkOrde
  </fileLocation>
</ObjectType>
<Field>
  <name>Image</name><type>checkbox</type><label>Image</label><size>1</size>
</Field>
<Field>
  <name>Text</name><type>checkbox</type><label>Text</label><size>1</size>
</Field>
<Field>
  <name>Title</name><type>text</type><label>Title</label><size>60</size>
</Field>
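Field definitions like the ones above are what make the input screens configurable: each `<Field>` carries enough information to draw one form control. The following sketch illustrates that idea in Python rather than in the Java servlet code the system actually uses. The `<Screen>` wrapper element, the generated HTML, and the function names are assumptions.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical wrapper around two of the field definitions from the example.
SCREEN_XML = """<Screen>
  <Field><name>Image</name><type>checkbox</type><label>Image</label><size>1</size></Field>
  <Field><name>Title</name><type>text</type><label>Title</label><size>60</size></Field>
</Screen>"""


def render_field(field):
    """Turn one <Field> element into a labeled HTML form control."""
    name = escape(field.findtext("name", "").strip(), quote=True)
    kind = field.findtext("type", "text").strip()
    label = escape(field.findtext("label", "").strip())
    size = escape(field.findtext("size", "").strip(), quote=True)
    if kind == "checkbox":
        control = f'<input type="checkbox" name="{name}">'
    else:
        # Any other type is treated as a plain text box in this sketch.
        control = f'<input type="text" name="{name}" size="{size}">'
    return f"<label>{label} {control}</label>"


def render_screen(xml_text):
    """Render every field of a screen definition, in document order."""
    return [render_field(f) for f in ET.fromstring(xml_text).iter("Field")]


html_rows = render_screen(SCREEN_XML)
```

Adding a field to a screen then means editing the XML file, not the rendering code.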
Software used
• MSSQL running on NT
• Tomcat 4.1.2 implementing servlets 2.3
• Jsdk 1.4
• Xalan 2.4
• Xerces 1.0.3
• FOP 0.12.1
• JDOM beta 8
• Opta 2000
Relationship of GenDB to METS
• Metadata not directly stored in METS, MODS or MIX schema formats.
  – Much of the database structure was developed before these standards emerged
  – Database structure and content adjusted to be compatible with all these formats
GenX: From GenDB to METS
• Allows Digital Publishing Group staff to select the objects in the GenDB database that are ready for export and to export them as METS objects.
GenX Architecture
(Diagram components: App Interface, Java Application, GenDB, METS XML Repository; connection labeled JDBC.)
GenX Output
• METS output corresponding to version 1.3
• Descriptive metadata exported to METS descMD in MODS 2.0 format
• Technical metadata exported to METS techMD in MIX format
• Planned:
  – Text technical md to METS descMD in NYU TextMD
  – Rights to METS rightsMD in ODRL subset
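To give a feel for the export step, the sketch below serializes a div hierarchy as a METS `structMap`. It is an illustration only: GenX itself is a Java application, the input tree and function names are hypothetical, and the descriptive and technical metadata sections and the file pointers are left out. The METS namespace and the `OBJID`, `TYPE`, `LABEL`, and `ORDER` attributes come from the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


# Hypothetical div tree, e.g. as rebuilt from the structural metadata table.
TREE = {"label": "Letter", "children": [
    {"label": "Page 1", "children": []},
    {"label": "Page 2", "children": []},
]}


def add_div(parent, node, order):
    """Append one <mets:div> for the node and recurse into its children."""
    div = ET.SubElement(parent, q("div"),
                        {"LABEL": node["label"], "ORDER": str(order)})
    for position, child in enumerate(node["children"], start=1):
        add_div(div, child, position)
    return div


def build_mets(tree, object_id):
    """Build a skeletal METS document holding only a physical structMap."""
    mets = ET.Element(q("mets"), {"OBJID": object_id})
    struct_map = ET.SubElement(mets, q("structMap"), {"TYPE": "physical"})
    add_div(struct_map, tree, 1)
    return mets


mets = build_mets(TREE, "object-2")
xml_text = ET.tostring(mets, encoding="unicode")
```

A fuller export would also wrap the MODS and MIX records in their metadata sections and point each div at its content files.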
Links
• GenDB Web Interface Demo
  – http://sunsite2.berkeley.edu/GenD
  – login: demo
  – password: demo
• Developers:
  – rbeaubie@library.berkeley.edu
  – ghill@library.berkeley.edu
  – jhassan@library.berkeley.edu