Management of XML Documents in ObjectRelational Databases Thomas
Management of XML Documents in Object-Relational Databases Thomas Kudrass Matthias Conrad HTWK Leipzig EDBT-Workshop XML-Based Data Management Prague, 24 March 2002
© T. Kudrass, HTWK Leipzig Overview l l l l Motivation Object-Relational Database Concepts Parsing XML Documents XML-to-ORDB Mapping Meta-Data Special Issues Conclusions
© T. Kudrass, HTWK Leipzig Motivation l l l Storing of XML documents in DBMS Use existing database technology Dealing with complex objects: – – – XML documents = complex objects avoid any decomposition object-relational database technology good choice to represent complex objects
© T. Kudrass, HTWK Leipzig User-Defined Types in ORDB l Complex Data Types – – l l Object Type Collection Type Object References Object Views
© T. Kudrass, HTWK Leipzig Example: Object Types CREATE TYPE Type_Professor AS OBJECT ( PName VARCHAR(80), Subject VARCHAR(120) ); object-valued attribute object table CREATE TYPE Type_Course AS OBJECT ( CREATE TABLE Tab. Professor OF Name VARCHAR(100), Type_Professor; Professor Type_Professor );
© T. Kudrass, HTWK Leipzig Example: Collection Types CREATE TYPE Type_Professor AS OBJECT ( PName VARCHAR(80), Subject VARCHAR(120) ); Array Nested Table CREATE TYPE Type. Va_ Professor AS CREATE TYPE Type_Tab. Professor AS VARRAY(5) OF Type_Professor; TABLE OF Type_Professor; CREATE TABLE Tab. Dept ( DName VARCHAR(80), Professor Type_Tab. Professor ) NESTED TABLE Professor STORE AS Tab. Professor_List;
© T. Kudrass, HTWK Leipzig Example: Object References CREATE TYPE Type_Professor AS OBJECT ( PName VARCHAR(80), Dept VARCHAR(120) ); CREATE TABLE Tab. Professor OF Type_Professor; CREATE TYPE Type_Course AS OBJECT ( Name VARCHAR(200), Prof_Ref REF Type_Professor ); CREATE TABLE Tab. Course OF Type_Course; Reference to objects of object table Tab. Professor
© T. Kudrass, HTWK Leipzig Parsing DTD and XML Document Well-Formedness Validity Check -------------------------------------------------------------------------------- DTD Syntax Check XML V 2 Parser DTD Parser XML DOM Tree DTD DOM Tree Schema Definition XML 2 Oracle JDBC / ODBC DBMS Oracle
© T. Kudrass, HTWK Leipzig 1 <!ELEMENT 2 <!ELEMENT 3 <!ATTLIST 4 <!ELEMENT 5 <!ELEMENT 6 <!ENTITY 7 <!ELEMENT 8 <!ELEMENT 9 <!ELEMENT 10 <!ELEMENT 11 <!ELEMENT 12 <!ELEMENT 13 <!ELEMENT 14 <!ELEMENT University (Study. Course, Student*)> Student (LName, FName, Course*)> Student Stud. Nr CDATA #REQUIRED> Course (Name, Professor*, Credit. Pts? )> Professor (PName, Subject+, Dept)> cs “Computer Science“> LName (#PCDATA)> FName (#PCDATA)> Credit. Pts (#PCDATA)> PName (#PCDATA)> Subject (#PCDATA)> Dept (#PCDATA)> Study. Course (#PCDATA)>
© T. Kudrass, HTWK Leipzig Object–Based–Mapping DTD <!ELEMENT A C D B (B, C)> (D)> (#PCDATA)> Classes CLASS A { STRING b; C c; CLASS C { STRING d; } Tables CREATE TABLE A ( a_pk INTEGER NOT b VARCHAR(30) CREATE TABLE C ( c_pk INTEGER NOT a_fk INTEGER NOT d VARCHAR(10) NULL, NOT NULL); • Modification of the Mapping Algorithm [Bourret] No class definitions Use objects of the DTD tree
© T. Kudrass, HTWK Leipzig Step 1 1 <!ELEMENT 2 <!ELEMENT 3 <!ATTLIST 4 <!ELEMENT 5 <!ELEMENT 6 <!ENTITY 7 <!ELEMENT 8 <!ELEMENT 9 <!ELEMENT 10 <!ELEMENT 11 <!ELEMENT 12 <!ELEMENT 13 <!ELEMENT 14 <!ELEMENT University (Study. Course, Student*)> Student (LName, FName, Course*)> Student Stud. Nr CDATA #REQUIRED> Course (Name, Professor*, Credit. Pts? )> Professor (PName, Subject+, Dept)> cs “Computer Science“> LName (#PCDATA)> FName (#PCDATA)> Credit. Pts (#PCDATA)> PName (#PCDATA)> Subject (#PCDATA)> Dept (#PCDATA)> Study. Course (#PCDATA)> • Each Complex Element Table • Each Set-Valued Element Table • Primary Key in each Table 1 <!ELEMENT University (Study. Course, Student*)> CREATE TABLE Tab. University ( IDUniversity 2 <!ELEMENT Student (LName, FName, Course*)> CREATE TABLE Tab. Student ( IDStudent 4 <!ELEMENT Course (Name, Professor*, Credit. Pts? )> CREATE TABLE Tab. Course ( IDCourse 5 <!ELEMENT Professor (PName, Subject+, Dept)> CREATE TABLE Tab. Professor ( IDProfessor CREATE TABLE Tab. Subject ( IDSubject
© T. Kudrass, HTWK Leipzig Step 2 1 <!ELEMENT 2 <!ELEMENT 3 <!ATTLIST 4 <!ELEMENT 5 <!ELEMENT 6 <!ENTITY 7 <!ELEMENT 8 <!ELEMENT 9 <!ELEMENT 10 <!ELEMENT 11 <!ELEMENT 12 <!ELEMENT 13 <!ELEMENT 14 <!ELEMENT University (Study. Course, Student*)> Student (LName, FName, Course*)> Student Stud. Nr CDATA #REQUIRED> Course (Name, Professor*, Credit. Pts? )> Professor (PName, Subject+, Dept)> cs “Computer Science“> LName (#PCDATA)> FName (#PCDATA)> Credit. Pts (#PCDATA)> PName (#PCDATA)> Subject (#PCDATA)> Dept (#PCDATA)> Study. Course (#PCDATA)> Other Elements & Attributes Table Columns CREATE TABLE Tab. University ( IDUniversity, attr. Study. Course, CREATE TABLE Tab. Student ( IDStudent, attr. Stud. Nr, attr. LName, attr. FName, CREATE TABLE Tbl. Matrikel. Nr ( IDMatrikel. Nr, attr. MNummer, CREATE TABLE Tab. Course ( IDCourse, attr. Name, attr. Credit. Pts, CREATE TABLE Tab. Professor ( IDProfessor, attr. PName, attr. Dept, CREATE TABLE Tab. Subject ( IDSubject, attr. Subject,
© T. Kudrass, HTWK Leipzig Step 3 Relationships between Elements Foreign Keys CREATE TABLE Tab. University ( IDUniversity INTEGER NOT NULL, attr. Study. Course VARCHAR(4000) NOT NULL, PRIMARY KEY (IDUniversity)); CREATE TABLE Tab. Student ( IDStudent INTEGER NOT NULL, IDUniversity INTEGER NOT NULL, attr. Stud. Nr VARCHAR(4000) NOT NULL, attr. LName VARCHAR(4000) NOT NULL, attr. FName VARCHAR(4000) NOT NULL, PRIMARY KEY (IDStudent), CONSTRAINT con. Matrikel FOREIGN KEY (IDUniversity) REFERENCES Tab. University (IDUniversity)); . . .
© T. Kudrass, HTWK Leipzig ORDBS Oracle and XML l Basic Idea: – – l Generate an object-relational schema from the DTD Natural representation of an XML document by combining user-defined types Different Mapping Rules: – – Simple elements Complex elements Set-valued elements Complex set-valued elements
© T. Kudrass, HTWK Leipzig XML Attributes & Simple Elements l Elements of #PCDATA type and XML attributes Attributes of the object type l Domain of Simple Elements: – No type information in the DTD: l l – l numeric vs. alphanumeric? length? Restrictions of the DBMS (e. g. VARCHAR [Oracle] 4000 characters) Mapping of an XML attribute of a simple element Definition of an object type for both attribute and element
© T. Kudrass, HTWK Leipzig XML Attributes & Simple Elements <!ELEMENT Professor (PName, Subject, Dept)> <!ATTLIST Professor PAddress CDATA #REQUIRED> <!ELEMENT PName (#PCDATA)> <!ELEMENT Subject (#PCDATA)> <!ELEMENT Dept (#PCDATA)> <!ATTLIST Dept DAddress CDATA #REQUIRED> CREATE TABLE Tab. Professor OF Type_Professor; CREATE TYPE Type_Professor AS OBJECT ( attr PAddress VARCHAR(4000), attr. PName VARCHAR(4000), attr. Subject VARCHAR(4000), attr. Dept Type_Dept); CREATE TYPE Type_Dept AS OBJECT ( attr. Dept VARCHAR(4000), attr. DAddress VARCHAR(4000));
© T. Kudrass, HTWK Leipzig Complex Elements Nesting of elements by composite DB object types CREATE TABLE Tab. University ( INSERT INTO Tab. University attr. Study. Course VARCHAR(4000), VALUES ( ‘Computer Science' , attr. Student Type_Matrikel ); Type_Student('23374', 'Conrad', 'Matthias', Type_Course(‘Databases II‘, CREATE TYPE Type_Student AS OBJECT ( Type_Professor(‘Kudrass‘ , attr. Stud. Nr VARCHAR(4000), ‘Database Systems‘', attr. LName VARCHAR(4000), ‘Computer Science‘), '4'))); attr. FName VARCHAR(4000), attr. Course Type_Vorlesung ); SELECT u. attr. Student. attr. Lname CREATE TYPE Type_Course AS OBJECT ( FROM Tab. University u attr. Name VARCHAR(4000), WHERE u. attr. Student. attr. Course. attr. Professor. attr. PName attr. Professor. Type_Professor, = ‘Kudrass'; attr. Credit. Pts VARCHAR(4000)); CREATE TYPE Type_Professor AS OBJECT ( attr. PName VARCHAR(4000), attr. Subject VARCHAR(4000), attr. Dept VARCHAR(4000));
© T. Kudrass, HTWK Leipzig Set-Valued Elements l l Multiple Occurrence (in DTD): marked by + or * DBMS Restrictions – – collection type applicable to set-valued elements with textvalued subelements, e. g. ARRAY OF VARCHAR collection type not applicable to set-valued elements with complex subelements l l subelements may be set-valued again Solutions – – use newer DBMS releases (e. g. Oracle 9 i) model relationships with object references
© T. Kudrass, HTWK Leipzig Set-Valued Elements <!ELEMENT University (Study. Course, Student*)> CREATE TYPE Type_Student AS OBJECT ( attr. Jahrgang VARCHAR(4000) , attr. University REF Type_University ); CREATE TABLE Tab. Student OF Type_Student; CREATE TYPE Type_University AS OBJECT( attr. Study. Course VARCHAR(4000)); CREATE TABLE Tab. University OF Type_University; Reference to University Objects • Set-valued element Student • Modeling in object type Type_Student with a reference to objects of the table Tab. University
© T. Kudrass, HTWK Leipzig Set-Valued Elements CREATE TYPE Type. VA_Course AS VARRAY(100) OF Type_Course; CREATE TYPE Type. VA_Professor AS VARRAY(100) OF Type_Professor; CREATE TYPE Type. VA_Subject AS VARRAY(100) OF VARCHAR(4000); CREATE TABLE Tab. University ( attr. Study. Course VARCHAR(4000), attr. Student Type_Matrikel ); CREATE TYPE Type_Student AS OBJECT ( attr. Stud. Nr VARCHAR(4000), attr. LName VARCHAR(4000), attr. FName VARCHAR(4000), attr. Course Type_Vorlesung ); CREATE TYPE Type_Course AS OBJECT ( attr. Name VARCHAR(4000), attr. Professor. Type_Professor, attr. Credit. Pts VARCHAR(4000)); CREATE TYPE Type_Professor AS OBJECT ( attr. PName VARCHAR(4000), attr. Subject VARCHAR(4000), attr. Dept VARCHAR(4000));
© T. Kudrass, HTWK Leipzig Set-Valued Elements Example INSERT INTO Tab. University VALUES ( ‘Computer Science' , Type. VA_Student ( Type_Student('23374', 'Conrad', 'Matthias', Type. VA_Course ( Type_Course(‘Databases II‘, Type. VA_Professor ( Type_Professor(‘Kudrass‘ , Type. VA_Subject ( ‘Database Systems, ‘Operating Systems‘), ‘Computer Science‘)), ‘ 4‘), Type_Course(‘CAD Intro‘, Type. VA_Professor ( Type_Professor(‘Jaeger‘ , Type. VA_Subject ( ‘CAD‘, ‘CAE‘), ‘Computer Science‘)), ‘ 4‘), . . . )), Type_Student(‘ 00011', ‘Meier', ‘Ralf', … ). . . );
© T. Kudrass, HTWK Leipzig Dealing with Null Values l Restrictions with NOT NULL constraints in object-relational DB schema – – l Object-valued attributes: – l NOT NULL constraints in table - not in object type! NOT NULL constraints not applicable to collection types use CHECK constraints for NOT NULL Loss of DTD semantics DTD in the database
© T. Kudrass, HTWK Leipzig Dealing with CHECK Constraints <!ELEMENT Course (CName, Address? )> <!ELEMENT Addresse (Street, City? )> CREATE TYPE Type_Address AS OBJECT ( attr. Street VARCHAR(4000), attr. City VARCHAR(4000)); CREATE TYPE Type_Course AS OBJECT ( attr. Name VARCHAR(4000), attr. Address Type_Address); CREATE TABLE Tab. Course OF Type_Course ( attr. Name NOT NULL, CHECK (attr. Adresse. attr. Strasse IS NOT NULL)); // ORA-02290: Desired error message 1. INSERT INTO Tab. Course ( VALUES (‘CAD Intro’, Type_Address (NULL, ’Leipzig’); // ORA-02290: Undesired error message 2. INSERT INTO Tab. Course ( VALUES ('RN', NULL)
© T. Kudrass, HTWK Leipzig Meta-Data about XML Documents l l l Unique Document. ID for each Document Prolog Information Document Location (URL) Name Space Element vs. Attribute
© T. Kudrass, HTWK Leipzig Naming Conventions for DB Objects l Rules: – – – l l l Tab. Elementname Table Name Type_Elementname Object Type Name Type. Va_Elementname Array Name No Conflicts with Keywords Introduction of a Schema ID Naming Rule: Schema. ID + Naming Convention + Name CREATE TYPE DTD 01_Type_University CREATE TYPE DTD 02_Type_University AS OBJECT ( AS OBJECT ( attr. Study. Course VARCHAR(4000) ); attr. Register VARCHAR(4000) );
© T. Kudrass, HTWK Leipzig Conclusions: Advantages l Non-atomic domains possible – – l l Natural representation of XML Documents Nesting of any complexity possible Simple queries by using dot notation Using object references to represent relationships (OIDs)
© T. Kudrass, HTWK Leipzig Conclusions: Drawbacks l Mapping Deficiencies – – l Loss of Information – – – l Prolog, Comments, Processing Instructions, Prolog Entity References Attribute vs. Element ? Schema Evolution – l Possible restrictions of element types in collections No adequate mapping of NOT NULL constraints Modification of DTD Modification of DB Type Information – Target type: VARCHAR - not sufficient!
© T. Kudrass, HTWK Leipzig Outlook l l Graph-based creation of a schema Source: XML Schema Use CLOB datatype Enhance Meta-Schema – – Comments, Processing Instructions and their position in document Entity references and their substitution text
- Slides: 28