Conceptual Modeling for XML Data Tok Wang Ling

Conceptual Modeling for XML Data
Tok Wang Ling
National University of Singapore
DASFAA 2003 Panel Discussion

Outline
• Why do we need conceptual modeling?
• What important semantic information needs to be captured?
• Uses of a conceptual model for some XML research topics

Motivation: Why do we need a conceptual model to represent XML data?

    <department number="cs">
      <name>computer science</name>
      <course number="cs4221">
        <name>Database</name>
        <student number="1234">
          <name>B. Y. Smith</name>
          <grade>70</grade>
        </student>
        <student number="1235">
          <name>C. U. Brown</name>
          <grade>60</grade>
        </student>
      </course>
    </department>

(a) XML document

    <!ELEMENT department (name, course+)>
    <!ATTLIST department number ID #REQUIRED>
    <!ELEMENT course (name, student*)>
    <!ATTLIST course number ID #REQUIRED>
    <!ELEMENT student (name, grade?)>
    <!ATTLIST student number CDATA #REQUIRED>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT grade (#PCDATA)>

(b) An XML DTD for (a)

Motivation (cont.)

(a) OEM diagram
[Figure omitted. The object graph of the sample document: nodes carry object identifiers (&1, &2, ...), edges carry the labels department, number, name, course, student, and grade, and the leaves hold the values cs, computer science, cs4221, Database, 1234, B. Y. Smith, 70, 1235, C. U. Brown, and 60.]

(b) Dataguide

    ▼♦ department
        ♦ number
        ♦ name
        ▼♦ course
            ♦ number
            ♦ name
            ▼♦ student
                ♦ number
                ♦ name
                ♦ grade

Figure 1: Sample instance demonstrating OEM and dataguide
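The dataguide in Figure 1 can be approximated by collecting the distinct label paths of the sample document. The following is a minimal sketch, not the original dataguide construction (which is defined over OEM graphs): it treats XML attributes and child elements alike as path steps, and the names `DOC`, `label_paths`, and `dataguide` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# The sample document from the Motivation slide.
DOC = """
<department number="cs">
  <name>computer science</name>
  <course number="cs4221">
    <name>Database</name>
    <student number="1234">
      <name>B. Y. Smith</name>
      <grade>70</grade>
    </student>
    <student number="1235">
      <name>C. U. Brown</name>
      <grade>60</grade>
    </student>
  </course>
</department>
"""


def label_paths(elem, prefix=()):
    """Yield the label path of elem, of its attributes, and of all descendants."""
    path = prefix + (elem.tag,)
    yield path
    for attr in elem.attrib:
        yield path + (attr,)
    for child in elem:
        yield from label_paths(child, path)


def dataguide(xml_text):
    """Return the distinct label paths in first-seen order (a simplified dataguide)."""
    root = ET.fromstring(xml_text)
    # dict.fromkeys drops repeated paths while keeping insertion order.
    return list(dict.fromkeys(label_paths(root)))


for path in dataguide(DOC):
    print("/".join(path))
```

The summary records which paths occur, but nothing in it says whether `grade` describes a student or the enrolment of a student in a course, which is the kind of question the following slides raise.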

Motivation (cont.)
Q: What important semantic information and constraints cannot be captured by the DTD and the dataguide?
• What are the object classes? department, course, student?
• What are the attributes of the object classes?
• What are the identifiers of the object classes?
• What relationship types are defined among the object classes, e.g. among department, course, and student?
• What is "grade"? An object class? An attribute of student?
• Are there redundancies?

Semantic Information to be captured by an XML conceptual model
• Object class
  – attributes of object class
  – ordering on object class
• Relationship Type
  – represent hierarchical structure
  – degree of n-ary relationship type
  – participation constraints of object classes in relationship type
  – attributes of relationship type
  – disjunctive relationship type
  – recursive relationship type
• Reference

Semantic Information to be captured by an XML conceptual model (cont.)
• Attribute
  – key attribute / identifier
  – composite attribute
  – disjunctive attribute
  – attributes with unknown structure
  – fixed and default values of attribute
  – derived attribute
• Functional dependencies and other constraints
• Inheritance hierarchy (class hierarchy)
• Semi-structured data instance representation

Q: What semantic information cannot be represented by Dataguide, DTD, or XML Schema?
• Attribute or object class
• Degree of relationship type
• Attribute of object class or relationship type
• Class hierarchy
• Functional dependency
• …

A solution: ORA-SS, an Object-Relationship-Attribute model for Semi-Structured data.

Figure 2: ORA-SS instance diagram
[Figure omitted. department (Number: CS, Name: Computer Science) contains course (Number: CS4221, Name: Database), which contains two student objects: (Number: 1234, Name: B. Y. Smith, Grade: 70) and (Number: 1235, Name: C. U. Brown, Grade: 60).]

Figure 3: ORA-SS schema diagram
[Figure omitted. Object classes department (number, name), course (number, name), and student (number, name, grade); recoverable edge labels include "cs" and "2, 1:n, 1:1".]

The data model of ORA-SS: Relationship Type
– attributes of relationship type
– degree of n-ary relationship type
– participation constraints of objects in relationship type
– disjunctive relationship type
– recursive relationship type

(a) ORA-SS schema diagram
[Figure omitted. project (id, name), member (name, job title), publication (number, title); the edge from project to member is labeled "2, +, +" and the edge from member to publication is labeled "2, *, +".]

(b) Instance relationship
[Figure omitted. Recoverable instance labels: p1, p3, m1, m2, pub1, pub2, pub3.]

(c) Dataguide

    ▼♦ project
        ♦ id
        ♦ name
        ▼♦ member
            ♦ name
            ♦ job title
            ▼♦ publication
                ♦ number
                ♦ title

Figure 5: Representing binary relationship type

The data model of ORA-SS: Relationship Type (cont.)

(a) ORA-SS schema diagram
[Figure omitted. project (id, name), member (name, job title), publication (number, title); the edge from project to member is labeled "2, +, +" and the edge from member to publication is labeled "3, *, +".]

(b) Instance relationship
[Figure omitted. Recoverable instance labels: p1, p2, p3, m1, m2, pub1, pub2, pub3.]

(c) Dataguide

    ▼♦ project
        ♦ id
        ♦ name
        ▼♦ member
            ♦ name
            ♦ job title
            ▼♦ publication
                ♦ number
                ♦ title

Figure 6: Representing ternary relationship type
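Figures 5 and 6 share the same dataguide, yet the degree of the relationship type changes what the nested data means. The sketch below illustrates this with hypothetical instance data: the identifiers reuse the figure labels, but which member belongs to which project is an assumption, since the original diagrams are not recoverable.

```python
# Hypothetical instances; only the identifiers come from the figures.
project_member = {("p1", "m1"), ("p2", "m1"), ("p3", "m2")}

# Binary reading (Figure 5): a publication depends on the member alone.
member_publication = {("m1", "pub1"), ("m1", "pub2"), ("m2", "pub3")}

# Ternary reading (Figure 6): a publication depends on project and member together.
project_member_publication = {
    ("p1", "m1", "pub1"),
    ("p2", "m1", "pub2"),
    ("p3", "m2", "pub3"),
}


def nest_binary(pm, mp):
    """Nest two binary relationships as project > member > publication."""
    tree = {}
    for project, member in sorted(pm):
        pubs = sorted(pub for m, pub in mp if m == member)
        tree.setdefault(project, {})[member] = pubs
    return tree


def nest_ternary(pmp):
    """Nest one ternary relationship as project > member > publication."""
    tree = {}
    for project, member, pub in sorted(pmp):
        tree.setdefault(project, {}).setdefault(member, []).append(pub)
    return tree


binary_tree = nest_binary(project_member, member_publication)
ternary_tree = nest_ternary(project_member_publication)
print(binary_tree)
print(ternary_tree)
```

In the binary reading every publication of m1 is repeated under each project m1 works on, whereas in the ternary reading a publication appears only under the project it was written for. Both trees have the same shape, so a dataguide or DTD alone cannot tell the two readings apart.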

The data model of ORA-SS: Attribute
– key attribute
– composite attribute
– disjunctive attribute
– attribute with unknown structure
– fixed and default values of attribute
– derived attribute

[Figure omitted. Recoverable labels: course (code, title); student; relationship label "cs, 2, 4:n, 3:8"; attribute labels dept prefix, number, name (first, last), mark, grade, hobby; the markers "*", "ANY", and "D: comp".]

Figure 7: Object classes with relationship type and attributes in an ORA-SS schema diagram

Uses of the Conceptual Model for XML Research
• Normal form XML schema
  – remove redundant data
  – resolve multiple inheritance conflicts
• Storage structure for XML databases
  – use Object Relational Model
• XML Views
  – derived information from references and class hierarchy
  – defining views
  – materialized view maintenance
  – view updates
• Integration of XML documents
• Evaluating XML queries on XML databases

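Research Topics using ORA-SS Model: Normal Form XML Schema
• A schema may lead to a lot of redundant data
• Redundant data causes update anomalies
• A normal form schema is needed

[Figure omitted. A nested schema with recoverable labels professor (staff#, name), course (C.Code, title), and textbook (ISBN, author, title); the edge markers "2", "2", and "+" also appear.]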

Research Topics using ORA-SS Model: NF XML Schema (cont.)
• Some better solutions
• Redundancies are removed; the schemas are in normal form

[Figure omitted. Two alternative schemas that use references. Recoverable labels: professor (staff#, name), course (C.Code, title), textbook (ISBN, author, title), and the reference labels course-Ref, textbook-Ref, professor-Ref, course.R, textbook.R, and professor.R.]
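The update anomaly that motivates the normal form can be sketched with plain Python data. All values below are hypothetical, and the dictionary layout is only an analogy for the nested and reference-based schemas in the figures, not the ORA-SS normalization procedure itself.

```python
import copy

# Hypothetical data: two professors teach the same course, which uses one textbook.
TEXTBOOK = {"ISBN": "111", "author": "A. Author", "title": "DB Systems"}
COURSE = {"code": "cs4221", "title": "Database", "textbooks": [TEXTBOOK]}

# Nested form (professor > course > textbook): every professor carries a full copy.
nested = [
    {"staff#": "s1", "name": "Lee", "courses": [copy.deepcopy(COURSE)]},
    {"staff#": "s2", "name": "Tan", "courses": [copy.deepcopy(COURSE)]},
]

# Reference-based form: each course and textbook is stored once and referenced by key.
normalized = {
    "professors": [
        {"staff#": "s1", "name": "Lee", "course_refs": ["cs4221"]},
        {"staff#": "s2", "name": "Tan", "course_refs": ["cs4221"]},
    ],
    "courses": {"cs4221": {"title": "Database", "textbook_refs": ["111"]}},
    "textbooks": {"111": {"author": "A. Author", "title": "DB Systems"}},
}


def retitle_nested(professors, code, new_title):
    """Retitle a course in the nested form; return how many copies were touched."""
    touched = 0
    for prof in professors:
        for course in prof["courses"]:
            if course["code"] == code:
                course["title"] = new_title
                touched += 1
    return touched


def retitle_normalized(db, code, new_title):
    """Retitle a course in the reference-based form; exactly one record changes."""
    db["courses"][code]["title"] = new_title
    return 1


nested_updates = retitle_nested(nested, "cs4221", "Database Design")
normalized_updates = retitle_normalized(normalized, "cs4221", "Database Design")
print(nested_updates, normalized_updates)
```

In the nested form a single logical change must find and update every copy, and missing one leaves the data inconsistent. With references the course title lives in one place.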

Research Topics using ORA-SS Model: Storage Structure for XML Databases
• Main rules
  – Each object class together with its attributes forms a nested relation (object relation)
  – Each relationship type together with its attributes forms a nested relation (relationship relation)
• Nested relations can be handled by the object-relational model, e.g. Oracle 8i.

Research Topics using ORA-SS Model: Storage Structure for XML Databases

[Figure omitted. ORA-SS schema with object classes Supplier (S#, Name, City), Part (P#, Name, Color), and Project (J#, Name, Loc); relationship type SP of degree 2 with attribute Price, and relationship type SPJ of degree 3 with attribute Qty.]

Object relations
    Supplier (S#, Name, (City))
    Part (P#, Name, Color)
    Project (J#, Name, Loc)

Relationship relations
    SP (S#, P#, Price)
    SPJ (S#, P#, J#, Qty)

Constraint: SPJ[S#, P#] ⊆ SP[S#, P#]
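The relations and the inclusion constraint above can be tried out with Python's built-in `sqlite3` module. This is only a sketch under stated assumptions: SQLite has no nested relations, so the multi-valued City is flattened into a separate `supplier_city` table, and all table names, column names, and sample rows are illustrative. The constraint SPJ[S#, P#] ⊆ SP[S#, P#] is expressed as a composite foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
-- Object relations (the nested City attribute is flattened; an assumption)
CREATE TABLE supplier (s_no TEXT PRIMARY KEY, name TEXT);
CREATE TABLE supplier_city (
    s_no TEXT REFERENCES supplier(s_no),
    city TEXT,
    PRIMARY KEY (s_no, city)
);
CREATE TABLE part (p_no TEXT PRIMARY KEY, name TEXT, color TEXT);
CREATE TABLE project (j_no TEXT PRIMARY KEY, name TEXT, loc TEXT);

-- Relationship relations
CREATE TABLE sp (
    s_no TEXT REFERENCES supplier(s_no),
    p_no TEXT REFERENCES part(p_no),
    price REAL,
    PRIMARY KEY (s_no, p_no)
);
CREATE TABLE spj (
    s_no TEXT,
    p_no TEXT,
    j_no TEXT REFERENCES project(j_no),
    qty INTEGER,
    PRIMARY KEY (s_no, p_no, j_no),
    -- SPJ[S#, P#] must be a subset of SP[S#, P#]
    FOREIGN KEY (s_no, p_no) REFERENCES sp(s_no, p_no)
);
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO supplier VALUES ('s1', 'Acme')")
conn.execute("INSERT INTO part VALUES ('p1', 'bolt', 'red')")
conn.execute("INSERT INTO part VALUES ('p2', 'nut', 'blue')")
conn.execute("INSERT INTO project VALUES ('j1', 'bridge', 'Kyoto')")
conn.execute("INSERT INTO sp VALUES ('s1', 'p1', 10.0)")


def try_insert_spj(connection, s_no, p_no, j_no, qty):
    """Insert one SPJ row; return False if a key or inclusion constraint rejects it."""
    try:
        connection.execute("INSERT INTO spj VALUES (?, ?, ?, ?)", (s_no, p_no, j_no, qty))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert_spj(conn, "s1", "p1", "j1", 5)  # (s1, p1) exists in SP
rejected = try_insert_spj(conn, "s1", "p2", "j1", 3)  # (s1, p2) is not in SP
print(accepted, rejected)
```

A supplier can therefore supply a part to a project only if the supplier-part pair is already recorded in SP, which is what the constraint on the slide requires.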

Research Topics using ORA-SS Model: XML Views
• What information can be directly derived from references and the class hierarchy?

[Figure omitted: "Referencing an object class in an ORA-SS schema diagram". Recoverable labels: course (code, title), faculty (name), student (number), student.R, history, grade, hostel (name), home (number, street); relationship labels "cs, 2, 4:n, 3:8" and "fs, 2, 1:n"; disjunction marker "|".]

Research Topics using ORA-SS Model: XML Views (cont.)
• Valid views of an ORA-SS schema
• Operations: selection, projection, join, up/down

[Figure omitted. Three views (View 1, View 2, View 3) over the object classes Supplier, Part, and Project, with relationship degrees 2 and 3 and the attributes price, Qty, and total_qty.]

Conclusion
• A good conceptual model is needed for XML database applications:
  – normal form schema
  – storage structure
  – view design and view updates
  – …