CSE 636 Data Integration XML Schema


XML Schemas

• W3C Recommendation: http://www.w3.org/XML/Schema
• Generalizes DTDs
• Uses XML syntax
• Two documents: structure and datatypes
  – http://www.w3.org/TR/xmlschema-1
  – http://www.w3.org/TR/xmlschema-2
• XML Schema is very complex
  – often criticized
  – some alternative proposals

XML Schemas

  <?xml version="1.0"?>
  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="paper" type="papertype"/>
    <xsd:complexType name="papertype">
      <xsd:sequence>
        <xsd:element name="title" type="xsd:string"/>
        <xsd:element name="author" minOccurs="0"/>
        <xsd:choice>
          <xsd:element name="journal"/>
          <xsd:element name="conference"/>
        </xsd:choice>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:schema>

DTD: <!ELEMENT paper (title, author?, (journal|conference))>
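
To make the content model concrete, here is a sketch of one instance document that the paper schema on this slide should accept. The text values are invented placeholders. Elements declared without a type attribute (author, journal, conference) default to xsd:anyType, so plain text is acceptable there.

```xml
<?xml version="1.0"?>
<!-- title is required, author is optional (minOccurs="0"),
     and exactly one of journal or conference must follow -->
<paper>
  <title>Hypothetical paper title</title>
  <author>Hypothetical Author</author>
  <journal>Hypothetical Journal</journal>
</paper>
```

A document that omitted title, or that contained both journal and conference, would be rejected, just as it would be under the DTD content model (title, author?, (journal|conference)).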

XML Namespaces

• http://www.w3.org/TR/REC-xml-names (1/99)
• Solve the problem of tag name conflicts
• name ::= [prefix:]localpart

  <book xmlns:isbn="www.isbn-org.org/def">
    <title> … </title>
    <number> 15 </number>
    <isbn:number> … </isbn:number>
  </book>

XML Namespaces

• Syntactic: <number>, <isbn:number>
• Semantic: provide URL for schema

  <tag xmlns:mystyle="http://…">       … defined here
    <mystyle:title> … </mystyle:title>
    <mystyle:number> …
  </tag>
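
As a small illustration of how prefixes resolve, the sketch below binds a default namespace and a prefixed one on the same element. Both URIs and the ISBN value are hypothetical placeholders, not taken from the slides.

```xml
<?xml version="1.0"?>
<!-- xmlns="..." sets the default namespace for unprefixed element names;
     xmlns:isbn="..." binds the prefix isbn to a second namespace -->
<book xmlns="http://example.org/book"
      xmlns:isbn="http://example.org/isbn">
  <title>Hypothetical title</title>
  <number>15</number>
  <isbn:number>0-00-000000-0</isbn:number>
</book>
```

Here number expands to the pair (http://example.org/book, number) and isbn:number to (http://example.org/isbn, number), so the two tags no longer conflict. The prefix itself is only a local abbreviation; the namespace URI is what identifies the name.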

Elements vs. Types in XML Schema

  <xsd:element name="person">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="name" type="xsd:string"/>
        <xsd:element name="address" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <xsd:element name="person" type="pType"/>
  <xsd:complexType name="pType">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="address" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

DTD: <!ELEMENT person (name, address)>

Elements vs. Types in XML Schema

• Types:
  – Simple types (integers, strings, ...)
  – Complex types (regular expressions, like in DTDs)
• Element-type-element alternation:
  – Root element has a complex type
  – That type is a regular expression of elements
  – Those elements have their complex types
  – ...
  – On the leaves we have simple types

Local vs. Global Types in XML Schema

• Local type:

  <xsd:element name="person">
    [define locally the person's type]
  </xsd:element>

• Global type:

  <xsd:element name="person" type="pType"/>
  <xsd:complexType name="pType">
    [define here the type pType]
  </xsd:complexType>

Global types: can be reused in other elements
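
A minimal sketch of that reuse: two element declarations share one global type. The employee element is a hypothetical addition for illustration; the body of pType follows the person example from the Elements vs. Types slide.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Both elements refer to the same named (global) type -->
  <xsd:element name="person" type="pType"/>
  <xsd:element name="employee" type="pType"/>  <!-- hypothetical second use -->

  <xsd:complexType name="pType">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="address" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```

With a local (anonymous) type, the same content model would have to be written out twice, once inside each element declaration.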

Local vs. Global Elements in XML Schema

• Local element:

  <xsd:complexType name="pType">
    <xsd:sequence>
      <xsd:element name="address" type="..."/>
      ...
    </xsd:sequence>
  </xsd:complexType>

• Global element:

  <xsd:element name="address" type="..."/>
  <xsd:complexType name="pType">
    <xsd:sequence>
      <xsd:element ref="address"/>
      ...
    </xsd:sequence>
  </xsd:complexType>

Global elements: like in DTDs

Regular Expressions in XML Schema

Recall the element-type-element alternation:

  <xsd:complexType name="..">
    [regular expression on elements]
  </xsd:complexType>

Regular expressions:

  <xsd:sequence> A B C </...>                                = A B C
  <xsd:choice> A B C </...>                                  = A | B | C
  <xsd:group> A B C </...>                                   = (A B C)
  <xsd:… minOccurs="0" maxOccurs="unbounded"> … </…>         = (...)*
  <xsd:… minOccurs="0" maxOccurs="1"> … </…>                 = (...)?
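
As a worked example of this correspondence, the sketch below encodes the hypothetical DTD content model (title, author*, (journal|conference)?) in XML Schema. The element names echo the earlier paper example, but this particular content model and the xsd:string leaf types are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="paper" type="papertype"/>
  <xsd:complexType name="papertype">
    <xsd:sequence>                                   <!-- A B C -->
      <xsd:element name="title" type="xsd:string"/>
      <!-- author* : zero or more occurrences -->
      <xsd:element name="author" type="xsd:string"
                   minOccurs="0" maxOccurs="unbounded"/>
      <!-- (journal|conference)? : optional choice -->
      <xsd:choice minOccurs="0" maxOccurs="1">
        <xsd:element name="journal" type="xsd:string"/>
        <xsd:element name="conference" type="xsd:string"/>
      </xsd:choice>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```

The occurrence attributes can be placed on a single element or on a whole sequence or choice, which is how the star and question mark apply to parenthesized groups.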

Local Names in XML Schema

name has different meanings in person and in product

  <xsd:element name="person">
    <xsd:complexType>
      …
      <xsd:element name="name">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="firstname" type="xsd:string"/>
            <xsd:element name="lastname" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      …
    </xsd:complexType>
  </xsd:element>

  <xsd:element name="product">
    <xsd:complexType>
      …
      <xsd:element name="name" type="xsd:string"/>
    </xsd:complexType>
  </xsd:element>

Subtle Use of Local Names

  <xsd:element name="A" type="oneB"/>

  <xsd:complexType name="onlyAs">
    <xsd:choice>
      <xsd:sequence>
        <xsd:element name="A" type="onlyAs"/>
      </xsd:sequence>
      <xsd:element name="A" type="xsd:string"/>
    </xsd:choice>
  </xsd:complexType>

  <xsd:complexType name="oneB">
    <xsd:choice>
      <xsd:element name="B" type="xsd:string"/>
      <xsd:sequence>
        <xsd:element name="A" type="onlyAs"/>
        <xsd:element name="A" type="oneB"/>
      </xsd:sequence>
      <xsd:sequence>
        <xsd:element name="A" type="oneB"/>
        <xsd:element name="A" type="onlyAs"/>
      </xsd:sequence>
    </xsd:choice>
  </xsd:complexType>

Arbitrarily deep binary tree with A elements, and a single B element

Attributes in XML Schema

  <xsd:element name="paper" type="papertype"/>
  <xsd:complexType name="papertype">
    <xsd:sequence>
      <xsd:element name="title" type="xsd:string"/>
      …
    </xsd:sequence>
    <xsd:attribute name="language" type="xsd:NMTOKEN" fixed="English"/>
  </xsd:complexType>

• Attributes are associated with the type, not with the element
• Only complex types can carry attributes
• More trouble if we want to add attributes to simple types
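
A sketch of an instance for the papertype declaration on this slide, assuming the part of the sequence elided by "…" requires no further children. The title text is a placeholder.

```xml
<?xml version="1.0"?>
<!-- language is declared with fixed="English", so if the attribute
     appears at all, its value must be exactly English -->
<paper language="English">
  <title>Hypothetical title</title>
</paper>
```

Writing language="French" would be a validation error. Because the attribute is not declared with use="required", it may also be left out, in which case a validating processor supplies the fixed value.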

Adding Attributes to Simple Types

  <xsd:element name="B">
    <xsd:complexType>
      <xsd:simpleContent>
        <xsd:extension base="xsd:string">
          <xsd:attribute name="testAttr" type="xsd:string"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>
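
For illustration, an instance of the B element declared on this slide could look like the sketch below; both the attribute value and the text content are placeholders.

```xml
<?xml version="1.0"?>
<!-- Content is a plain xsd:string (simpleContent); the extension
     only adds the testAttr attribute on top of it -->
<B testAttr="hypothetical value">some text</B>
```

The element still has no child elements. The complexType wrapper is needed only because, in XML Schema, attributes can be attached to complex types but not to simple ones.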

"Mixed" Content, "Any" Type

  <xsd:complexType mixed="true"> …

• Better than in DTDs: can still enforce the type, but now may have text between any elements

  <xsd:element name="anything" type="xsd:anyType"/> …

• Means anything is permitted there
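
The slide elides the body of the mixed type, so the following is only a sketch with hypothetical element names (letter, name, orderid) showing what mixed="true" permits.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="letter">
    <!-- mixed="true": text may appear between the child elements,
         but the children must still match the sequence below -->
    <xsd:complexType mixed="true">
      <xsd:sequence>
        <xsd:element name="name" type="xsd:string"/>
        <xsd:element name="orderid" type="xsd:positiveInteger"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Under this sketch, <letter>Dear <name>Ann</name>, your order <orderid>1032</orderid> has shipped.</letter> would validate, while swapping the order of name and orderid would not. A DTD mixed-content model cannot constrain the order or number of child elements in this way.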

"All" Group

  <xsd:complexType name="PurchaseOrderType">
    <xsd:all>
      <xsd:element name="shipTo" type="USAddress"/>
      <xsd:element name="billTo" type="USAddress"/>
      <xsd:element ref="comment" minOccurs="0"/>
      <xsd:element name="items" type="Items"/>
    </xsd:all>
    <xsd:attribute name="orderDate" type="xsd:date"/>
  </xsd:complexType>

• Restrictions:
  – Only at top level
  – Has only elements
  – Each element occurs at most once
• E.g. "comment" occurs 0 or 1 times
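
The practical effect of xsd:all is that the children may appear in any order. The sketch below assumes a hypothetical element purchaseOrder of type PurchaseOrderType; the contents of the children are left as comments because the slide does not define USAddress or Items.

```xml
<?xml version="1.0"?>
<!-- Children appear in a different order than in the declaration,
     and the optional comment element is omitted -->
<purchaseOrder orderDate="1999-10-20">
  <items><!-- content per the Items type, not shown on the slide --></items>
  <billTo><!-- content per the USAddress type --></billTo>
  <shipTo><!-- content per the USAddress type --></shipTo>
</purchaseOrder>
```

With xsd:sequence in place of xsd:all, only the declared order shipTo, billTo, comment, items would be accepted.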

Derived Types by Extensions

  <complexType name="Address">
    <sequence>
      <element name="street" type="string"/>
      <element name="city" type="string"/>
    </sequence>
  </complexType>

  <complexType name="USAddress">
    <complexContent>
      <extension base="ipo:Address">
        <sequence>
          <element name="state" type="ipo:USState"/>
          <element name="zip" type="positiveInteger"/>
        </sequence>
      </extension>
    </complexContent>
  </complexType>

• Corresponds to inheritance
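
Extension appends the new content model to the base one, so an element of type USAddress contains street and city followed by state and zip. The sketch below assumes a hypothetical element shipTo of that type, ignores namespace qualification, and assumes NY is a value permitted by ipo:USState, which the slide does not define.

```xml
<?xml version="1.0"?>
<shipTo>
  <!-- inherited from the base type Address -->
  <street>123 Hypothetical Street</street>
  <city>Buffalo</city>
  <!-- added by the extension in USAddress -->
  <state>NY</state>
  <zip>14260</zip>
</shipTo>
```

The inherited elements always come first; an extension cannot insert new elements in the middle of the base type's content.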

Derived Types by Restrictions

  <complexContent>
    <restriction base="ipo:Items">
      … [rewrite the entire content, with restrictions] …
    </restriction>
  </complexContent>

• may restrict cardinalities, e.g. (0, ∞) to (1, 1)
• may restrict choices
• other restrictions…
• Corresponds to set inclusion
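
A minimal sketch of a cardinality restriction. The Items type here is a simplified, hypothetical stand-in for ipo:Items (whose definition is not on the slide), and SingleItem is an invented name.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Base type: zero or more item elements, i.e. (0, unbounded) -->
  <xsd:complexType name="Items">
    <xsd:sequence>
      <xsd:element name="item" type="xsd:string"
                   minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Restricted type: the whole content model is restated,
       with the cardinality narrowed to exactly one, i.e. (1, 1) -->
  <xsd:complexType name="SingleItem">
    <xsd:complexContent>
      <xsd:restriction base="Items">
        <xsd:sequence>
          <xsd:element name="item" type="xsd:string"
                       minOccurs="1" maxOccurs="1"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
```

Every instance valid for SingleItem is also valid for Items, which is the set-inclusion reading of restriction.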

Simple Types

• string
• token
• byte
• unsignedByte
• integer
• positiveInteger
• int
  – 32-bit range; derived from integer by restriction
• unsignedInt
• long
• short
• ...
• time
• dateTime
• duration
• date
• ID
• IDREFS

Facets of Simple Types

• Facets = additional properties restricting a simple type
• 15 facets defined by XML Schema

Examples
• length
• minLength
• maxLength
• pattern
• enumeration
• whiteSpace
• maxInclusive
• maxExclusive
• minInclusive
• minExclusive
• totalDigits
• fractionDigits

Facets of Simple Types

• Can further restrict a simple type by changing some facets
• Restriction = subset
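
A sketch of facet-based restriction. The type names and bounds are illustrative assumptions: myInteger is one plausible definition of a range-limited integer, SKU shows the pattern facet, and smallRange is an invented name that restricts myInteger further.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Integers between 10000 and 99999 inclusive -->
  <xsd:simpleType name="myInteger">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="10000"/>
      <xsd:maxInclusive value="99999"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Strings of three digits, a hyphen, and two uppercase letters -->
  <xsd:simpleType name="SKU">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{3}-[A-Z]{2}"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Restricting an already restricted type: tightening maxInclusive
       yields a subset of myInteger (10000 to 19999) -->
  <xsd:simpleType name="smallRange">
    <xsd:restriction base="myInteger">
      <xsd:maxInclusive value="19999"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

A derived type may only tighten a facet relative to its base, never loosen it, so the value space of smallRange is a subset of that of myInteger.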

Not so Simple Types

• List types:

  <xsd:simpleType name="listOfMyIntType">
    <xsd:list itemType="myInteger"/>
  </xsd:simpleType>

  <listOfMyInt>20003 15037 95977 95945</listOfMyInt>

• Union types
• Restriction types
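
The slide gives no example of a union type, so the following is only a sketch with hypothetical names (sizeType, sizeName). A union accepts any value that is valid for at least one of its member types.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Member type: a small enumeration of size names -->
  <xsd:simpleType name="sizeName">
    <xsd:restriction base="xsd:token">
      <xsd:enumeration value="small"/>
      <xsd:enumeration value="medium"/>
      <xsd:enumeration value="large"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Union: either a positive integer or one of the size names -->
  <xsd:simpleType name="sizeType">
    <xsd:union memberTypes="xsd:positiveInteger sizeName"/>
  </xsd:simpleType>

  <xsd:element name="size" type="sizeType"/>
</xsd:schema>
```

Under this sketch both <size>42</size> and <size>medium</size> would validate, while <size>huge</size> would not.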

Summary of XML Schema

• Formal Expressive Power:
  – Can express precisely the regular tree languages (over unranked trees)
• Lots of other stuff
  – Some form of inheritance
  – A "null" value
  – Large collection of data types
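
The "null" value presumably refers to XML Schema's nil mechanism. A minimal sketch, using a hypothetical shipDate element:

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- nillable="true" lets an instance mark the element as having
       no value instead of supplying a date -->
  <xsd:element name="shipDate" type="xsd:date" nillable="true"/>
</xsd:schema>
```

An instance signals the null with the xsi:nil attribute from the namespace http://www.w3.org/2001/XMLSchema-instance, for example <shipDate xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>. The element must then be empty, even though an empty string is not a valid xsd:date.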

References

• Lecture Slides
  – Dan Suciu
  – http://www.cs.washington.edu/homes/suciu/COURSES/590DS/12xmlschema.htm
• BRICS XML Tutorial
  – A. Moeller, M. Schwartzbach
  – http://www.brics.dk/~amoeller/XML/index.html
• W3C's XML Schema homepage
  – http://www.w3.org/XML/Schema
• XML School
  – http://www.w3schools.com