XML & XML Query
Ling Wang, Luping Ding
CS 561

Introduction
• The Web opens new challenges in:
  - information technology
  - database frameworks
• Why?
  - Data sources on the Web typically do NOT conform to any well-known structure.
  - Traditional database technology is not adequate for rich data, e.g. audio, video, nested data structures.

Features of Web Data
Web data is called semistructured. Its characteristics:
• Object-like: a collection of complex objects, as in CODM.
• Schema-less: it typically does not conform to any traditional type structure.
• Self-describing: the meaning of the data is carried along with the data itself.
So we need new database technologies to support Web-based applications.

What is XML?
• XML: Extensible Markup Language
  - A markup language for documents containing structured information.
  - A universal format for structured documents and data on the Web.
  - An HTML-like language.
• The XML specification defines a standard way to add markup to documents.
• Note the key terms: structured information, markup language.

What is XML: example
An XML example for customer information:
    <customer-details id="AcPharm39156">
      <name>Acme Pharmaceuticals Co.</name>
      <address country="US">
        <street>7301 Smokey Boulevard</street>
        <city>Smallville</city>
        <state>Indiana</state>
        <postal>94571</postal>
      </address>
    </customer-details>

XML vs. HTML
XML:
• Extensible
  - Does NOT specify semantics or a tag set.
  - Just a facility for defining tags.
  - Defined by the W3C (World Wide Web Consortium).
• An XML document must be well formed:
  - It has a root element.
  - Every opening tag is followed by a matching closing tag.
  - Elements are properly nested.
HTML:
• Not extensible
  - Fixed tag semantics and tag set.
• Well-formedness is not strictly required:
  - Tags are not required to be closed.
  - Browsers will forgive errors, etc.
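The well-formedness rules above can be checked mechanically. The sketch below is not part of the original slides; it assumes Python's standard xml.etree.ElementTree parser, and the helper name is_well_formed and the three sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Hypothetical sample documents, one per rule.
good = "<address><city>Smallville</city></address>"
unclosed = "<address><city>Smallville</address>"              # <city> is never closed
two_roots = "<city>Smallville</city><state>Indiana</state>"   # no single root element

results = [is_well_formed(doc) for doc in (good, unclosed, two_roots)]
print(results)
```

An HTML browser would typically render all three inputs; an XML parser rejects the last two.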

Overview of XML
• Mechanisms for specifying document structure, i.e. a set of rules for structuring an XML document:
  - DTD: Document Type Definition language (part of the XML standard)
  - XML Schema: a more recent specification
• Query languages for XML: XPath, XSLT, XQuery

Basic concepts in XML: elements & attributes
• XML element: any properly nested piece of text of the form <sometag>…</sometag>.
  e.g. <street>7301 Smokey Boulevard</street>
  Here "street" is the element name and "7301 Smokey Boulevard" is the content.
• XML attribute: another tool for data representation.
  e.g. <customer-details id="AcPharm39156"> </customer-details>
  Here "id" is the attribute name and "AcPharm39156" is the attribute value.
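To make the distinction between element content and attribute values concrete, here is a minimal sketch that reads a cut-down copy of the customer example. It assumes Python's standard xml.etree.ElementTree module; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the customer example from the slides.
doc = """
<customer-details id="AcPharm39156">
  <name>Acme Pharmaceuticals Co.</name>
  <address country="US">
    <street>7301 Smokey Boulevard</street>
    <city>Smallville</city>
  </address>
</customer-details>
"""
root = ET.fromstring(doc)

element_name = root.tag                        # name of the root element
customer_id = root.attrib["id"]                # attribute value on the root
country = root.find("address").get("country")  # attribute on a child element
street = root.findtext("address/street")       # text content of a nested element

print(element_name, customer_id, country, street)
```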

Basic concepts in XML: namespaces
• Why? Element names in XML are not fixed, so names chosen by different authors can conflict.
• How? Different authors use different namespace identifiers for different domains.
  - General structure of a name: "namespace:local-name"
  - Namespace: a URI (uniform resource identifier), i.e. a URL (uniform resource locator) or a URN (uniform resource name).
  - Local name: same form as a regular XML tag, with no ":" in it.

Basic concepts in XML: namespaces
• An example of namespaces:
    <item xmlns="http://www.acmeinc.com/jp#supplies"
          xmlns:toy="http://www.acmeinc.com/jp#toys">
      <name>African Coffee Table</name>
      <feature>
        <toy:item>
          <toy:name>cyberpet</toy:name>
        </toy:item>
      </feature>
    </item>
  The xmlns attribute without a prefix declares the default namespace.
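A short sketch of how a namespace-aware parser sees the example above. It assumes Python's standard xml.etree.ElementTree, which expands every name to the form {namespace-URI}local-name. The prefixes "s" and "t" in the lookup table are arbitrary choices made here, which illustrates that the URI identifies the namespace and the prefix does not.

```python
import xml.etree.ElementTree as ET

doc = """
<item xmlns="http://www.acmeinc.com/jp#supplies"
      xmlns:toy="http://www.acmeinc.com/jp#toys">
  <name>African Coffee Table</name>
  <feature>
    <toy:item>
      <toy:name>cyberpet</toy:name>
    </toy:item>
  </feature>
</item>
"""
root = ET.fromstring(doc)

# Prefixes used for querying are local to this table; only the URIs must match.
ns = {"s": "http://www.acmeinc.com/jp#supplies",
      "t": "http://www.acmeinc.com/jp#toys"}

outer_name = root.find("s:name", ns)                   # <name> in the default namespace
inner_name = root.find("s:feature/t:item/t:name", ns)  # <toy:name> in the toys namespace

print(root.tag)
print(outer_name.text, "|", inner_name.text)
```

The two elements called "name" and the two called "item" do not clash because their expanded names differ.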

DTD: Document Type Definitions
• Why DTD?
  - An XML file can carry a description of its own format with it.
  - Independent groups of people can agree on a common DTD for interchanging data.
  - An application can verify data received from the outside world.
  - An application can also verify its own data.
• How?
  - The DTD is included in your XML source file:
      <!DOCTYPE root-element [element-declarations]>
  - The DTD is external to your XML source file:
      <!DOCTYPE root-element SYSTEM "filename">

DTD: example
Example XML document with an internal DTD:
    <?xml version="1.0"?>
    <!DOCTYPE note [
      <!ELEMENT note (to, from, heading, body)>
      <!ELEMENT to (#PCDATA)>
      <!ELEMENT from (#PCDATA)>
      <!ELEMENT heading (#PCDATA)>
      <!ELEMENT body (#PCDATA)>
    ]>
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend</body>
    </note>

DTD: example
XML document with an external DTD:
    <?xml version="1.0"?>
    <!DOCTYPE note SYSTEM "note.dtd">
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>
The file "note.dtd" containing the DTD:
    <!ELEMENT note (to, from, heading, body)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT body (#PCDATA)>

DTD: inadequacy
• Not designed with namespaces in mind.
• Uses a syntax quite different from that of XML documents.
• Offers a very limited set of basic types.
• Provides only limited means for expressing data consistency constraints:
  - No keys.
  - Referential integrity is weak: attributes can have type ID, IDREF or IDREFS, but there is no equivalent for elements.

DTD: inadequacy (cont.)
• No way of enforcing referential integrity for elements.
• Alternatives must be used to state that the order of elements is immaterial, which becomes terrible as the number of attributes grows.
• Element definitions are global to the entire document.

XML Schemas
An attempt to solve all those problems in DTD:
• Powerful data typing
• Range checking
• Namespace-aware validation based on namespace URIs rather than on prefixes
• Extensibility and scalability

XML Schema: example
• Here is a simple example of an XML Schema:
    <?xml version="1.0"?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="SONG" type="SongType"/>
      <xsd:complexType name="SongType">
        <xsd:sequence>
          <xsd:element name="TITLE" type="xsd:string"/>
          <xsd:element name="COMPOSER" type="xsd:string"/>
          <xsd:element name="PRODUCER" type="xsd:string"/>
          <xsd:element name="PUBLISHER" type="xsd:string"/>
          <xsd:element name="LENGTH" type="xsd:string"/>
          <xsd:element name="YEAR" type="xsd:string"/>
          <xsd:element name="ARTIST" type="xsd:string"/>
          <xsd:element name="PRICE" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>

XML Schema: example
• The root element is "schema".
• The schema namespace is http://www.w3.org/2001/XMLSchema, conventionally bound to the prefix xsd or xs.
• Elements are declared with xsd:element and are divided into simple types and complex types.
• A simple type element can only contain text. It has no attributes and cannot contain any child elements.
  Syntax: <xs:element name="name" type="type"/>
  Example: <xs:element name="to" type="xs:string"/>

XML Schema: example
A complex type defines a new type which can have attributes and can have child elements. This is very flexible.
Syntax:
    <xs:element name="name">
      <xs:complexType>
        element content
      </xs:complexType>
    </xs:element>
Example:
    <xs:element name="note">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="to" type="xs:string"/>
          <xs:element name="from" type="xs:string"/>
          <xs:element name="heading" type="xs:string"/>
          <xs:element name="body" type="xs:string"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

XML Schema: features
• Simple types
  - 44 built-in simple types in the W3C XML Schema language.
  - Divided into seven groups:
    numeric types, time types, XML types, string types, the boolean type, the URI reference type, the binary types.

XML Schema: features
• Deriving simple types
  - You are not limited to the 44 built-in simple types.
  - Create new data types by deriving from the existing types, i.e. restrict a type to a subset of its normal values.
  e.g. a schema that derives a Str255 data type from xsd:string:
    <xsd:simpleType name="Str255">
      <xsd:restriction base="xsd:string">
        <xsd:minLength value="1"/>
        <xsd:maxLength value="255"/>
      </xsd:restriction>
    </xsd:simpleType>

XML Schema: features
• Create enumerated types. Example:
    <xsd:simpleType name="PublisherType">
      <xsd:restriction base="xsd:string">
        <xsd:enumeration value="Warner-Elektra-Atlantic"/>
        <xsd:enumeration value="Universal Music Group"/>
        <xsd:enumeration value="Sony Music Entertainment, Inc."/>
        <xsd:enumeration value="Capitol Records, Inc."/>
        <xsd:enumeration value="BMG Music"/>
      </xsd:restriction>
    </xsd:simpleType>

XML Schema: features
• Create new types by joining existing types through a union. Example:
    <xsd:simpleType name="MoneyOrDecimal">
      <xsd:union>
        <xsd:simpleType>
          <xsd:restriction base="xsd:decimal">
          </xsd:restriction>
        </xsd:simpleType>
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:pattern value="\p{Sc}\p{Nd}+(\.\p{Nd})?"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:union>
    </xsd:simpleType>

XML Schema: features
• Namespaces
  - http://www.w3.org/2001/XMLSchema: the namespace that identifies the names of tags and attributes used in a schema. These names are understood by all schema-aware XML processors.
  - http://www.w3.org/2001/XMLSchema-instance: a small number of special names used in instance documents, not in schemas.
  - Target namespace: the set of names defined by a particular schema document, i.e. the user-defined names that are to be used in instance documents.

XML Schema: features
• Grouping
  - Does order really matter?
  - How?
    xsd:all group: each element in the group must occur at most once, but the order is not important.
    xsd:choice group: any one element from the group should appear.
    xsd:sequence group: each element in the group appears exactly once (unless minOccurs or maxOccurs say otherwise), in the specified order.

XML Schema: features
Example for the xsd:all group:
    <xsd:complexType name="PersonType">
      <xsd:sequence>
        <xsd:element name="NAME">
          <xsd:complexType>
            <xsd:all>
              <xsd:element name="GIVEN" type="xsd:string" minOccurs="1" maxOccurs="1"/>
              <xsd:element name="FAMILY" type="xsd:string" minOccurs="1" maxOccurs="1"/>
            </xsd:all>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>

XML Schema: features
Example for the xsd:choice group:
    <xsd:complexType name="SongType">
      <xsd:sequence>
        <xsd:element name="TITLE" type="xsd:string"/>
        <xsd:choice>
          <xsd:element name="COMPOSER" type="PersonType"/>
          <xsd:element name="PRODUCER" type="PersonType"/>
        </xsd:choice>
        <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
        <xsd:element name="LENGTH" type="xsd:string"/>
        <xsd:element name="YEAR" type="xsd:string"/>
        <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
        <xsd:element name="PRICE" type="xsd:string" minOccurs="0"/>
      </xsd:sequence>
    </xsd:complexType>

XML Schema: features
• Schemas address the limitations of DTDs:
  - a strange, non-XML syntax
  - namespace incompatibility
  - lack of data typing
  - limited extensibility and scalability
• XML Schemas offer:
  - Powerful data typing
  - Range checking
  - Namespace-aware validation based on namespace URIs rather than on prefixes
  - Extensibility and scalability

XML Constraints: DTD
• A DTD has no keys, and its referential integrity is weak.
• Attributes can have the types ID, IDREF and IDREFS:
  - ID: a unique value
  - IDREF: a valid ID declared in the same document
  - IDREFS: a space-separated list of valid IDs
  But these are all based on the type string.
• Elements: there is no corresponding mechanism.

XML Constraints: Schema
• XML keys: similar to SQL keys, but more complicated, because of:
  - complex structures
  - a key might be composed of a sequence of values
  - the values may be located at different depths inside an element
• Two ways to declare them:
  - tag unique: UNIQUE constraint
  - tag key: PRIMARY KEY, not null
  e.g.
    <key name="PrimaryKeyForClass">
      <selector xpath="Classes/Class"/>
      <field xpath="CrsCode"/>
      <field xpath="Semester"/>
    </key>

XML Constraints: Schema
• Foreign keys, e.g.
    <complexType>
      ……
      <keyref name="NoBogusTranscripts" refer="adm:PrimaryKeyForClass">
        <selector xpath="Students/Student/CrsTaken"/>
        <field xpath="@CrsCode"/>
        <field xpath="@Semester"/>
      </keyref>
      ……
    </complexType>
• Powerful?
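What the key and keyref declarations above ask a validator to enforce can be sketched by hand. This is only an illustration in Python with the standard xml.etree.ElementTree module, not how a schema processor is implemented. The instance document, including its root element Report and the course data, is hypothetical; only the selector and field paths come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document shaped to match the selector/field paths.
doc = """
<Report>
  <Classes>
    <Class><CrsCode>CS561</CrsCode><Semester>S2002</Semester></Class>
    <Class><CrsCode>CS503</CrsCode><Semester>S2002</Semester></Class>
  </Classes>
  <Students>
    <Student><CrsTaken CrsCode="CS561" Semester="S2002"/></Student>
    <Student><CrsTaken CrsCode="CS999" Semester="S2002"/></Student>
  </Students>
</Report>
"""
root = ET.fromstring(doc)

# key: selector Classes/Class, fields CrsCode and Semester (child elements).
keys = [(c.findtext("CrsCode"), c.findtext("Semester"))
        for c in root.findall("Classes/Class")]
no_nulls = all(field is not None for key in keys for field in key)
key_ok = no_nulls and len(keys) == len(set(keys))  # present and unique

# keyref: selector Students/Student/CrsTaken, fields @CrsCode and @Semester.
refs = [(t.get("CrsCode"), t.get("Semester"))
        for t in root.findall("Students/Student/CrsTaken")]
dangling = [r for r in refs if r not in set(keys)]  # references with no matching key

print(key_ok, dangling)
```

Note that the key is a pair of values and that the key fields are elements while the referencing fields are attributes, which is the flexibility the slide refers to.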

Question
• Is the XML data model relational or object-relational?
• Is XML a database?

Part II
• XML Query Language
• Counterpart of SQL in the XML world

XML Query Language
• Desired characteristics for an XML query language (also its requirements)
• Good candidate: the XQuery language
• Use cases for the XQuery language

Desired Characteristics
• XML output
• Declarative: what has to be done?
• Query operations
• No schema required
• Preserve order and association
• Mutually embedding with XML
• Support for new datatypes
• Suitable for metadata
• Ability to add update capabilities in future versions

Details
• XML output
  - define derived databases (virtual views)
  - provide transparency to applications (why?)
• The XML query language MUST be declarative, like SQL
  - it specifies what has to be done
  - it MUST NOT enforce a particular evaluation strategy

Details (cont.)
• Query operations
  - Projection, selection, join, and restructuring should all be possible in a single XML query.
  - (why?) For optimization reasons.

Query Operations
XML Query     | Details                                                           | Relational Algebra
Projection    | Extract particular sub-elements or attributes of an element      | Projection
Selection     | Select values that satisfy some predicate                         | Selection
Join          | Join values from one or more documents                            | Join
Restructuring | Construct a new set of element instances to hold queried data    | Create view

Example - Sample Data
    <bib>
      <book year="1999" isbn="1-55860-622-X">
        <title>Data on the Web</title>
        <author>Abiteboul</author>
        <author>Buneman</author>
        <author>Suciu</author>
      </book>
      <book year="2001" isbn="1-XXXXX-YYY-Z">
        <title>XML Query</title>
        <author>Fernandez</author>
        <author>Suciu</author>
      </book>
    </bib>

Example - XML Schema
    <xs:group name="Bib">
      <xs:element name="bib">
        <xs:complexType>
          <xs:group ref="Book" minOccurs="0" maxOccurs="unbounded"/>
        </xs:complexType>
      </xs:element>
    </xs:group>

Example - XML Schema (cont.)
    <xs:group name="Book">
      <xs:element name="book">
        <xs:complexType>
          <xs:attribute name="year" type="xs:integer"/>
          <xs:attribute name="isbn" type="xs:string"/>
          <xs:element name="title" type="xs:string"/>
          <xs:element name="author" type="xs:string" maxOccurs="unbounded"/>
        </xs:complexType>
      </xs:element>
    </xs:group>

Variable Binding
    LET $bib0 :=
    <bib>
      <book year="1999" isbn="1-55860-622-X">
        <title>Data on the Web</title>
        <author>Abiteboul</author>
        <author>Buneman</author>
        <author>Suciu</author>
      </book>
      <book year="2001" isbn="1-XXXXX-YYY-Z">
        <title>XML Query</title>
        <author>Fernandez</author>
        <author>Suciu</author>
      </book>
    </bib>

Projection
    $bib0/book/author
    ==> <author>Abiteboul</author>,
        <author>Buneman</author>,
        <author>Suciu</author>,
        <author>Fernandez</author>,
        <author>Suciu</author>
Note: the document order of the author elements is preserved.
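The projection above can be reproduced outside XQuery as a quick sanity check. The sketch assumes Python's standard xml.etree.ElementTree, whose findall accepts a small subset of XPath and returns matches in document order; the variable name bib0 simply mirrors $bib0.

```python
import xml.etree.ElementTree as ET

bib0 = ET.fromstring("""
<bib>
  <book year="1999" isbn="1-55860-622-X">
    <title>Data on the Web</title>
    <author>Abiteboul</author>
    <author>Buneman</author>
    <author>Suciu</author>
  </book>
  <book year="2001" isbn="1-XXXXX-YYY-Z">
    <title>XML Query</title>
    <author>Fernandez</author>
    <author>Suciu</author>
  </book>
</bib>
""")

# Counterpart of $bib0/book/author: every author of every book, in document order.
authors = [a.text for a in bib0.findall("book/author")]
print(authors)
```

As in the XQuery result, "Suciu" appears twice because projection does not remove duplicates.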

Selection
    FOR $b IN $bib0/book
    WHERE $b/@year/data() <= 2000
    RETURN $b
    ==> <book year="1999" isbn="1-55860-622-X">
          <title>Data on the Web</title>
          <author>Abiteboul</author>
          <author>Buneman</author>
          <author>Suciu</author>
        </book>
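The FOR/WHERE/RETURN selection can be mirrored with an ordinary loop and filter. This is a sketch in Python with the standard xml.etree.ElementTree module, assuming the same sample bibliography; it is meant to show the semantics, not how an XQuery engine evaluates the query.

```python
import xml.etree.ElementTree as ET

bib0 = ET.fromstring("""
<bib>
  <book year="1999" isbn="1-55860-622-X">
    <title>Data on the Web</title>
    <author>Abiteboul</author>
    <author>Buneman</author>
    <author>Suciu</author>
  </book>
  <book year="2001" isbn="1-XXXXX-YYY-Z">
    <title>XML Query</title>
    <author>Fernandez</author>
    <author>Suciu</author>
  </book>
</bib>
""")

# FOR $b IN $bib0/book WHERE $b/@year/data() <= 2000 RETURN $b
selected = [b for b in bib0.findall("book") if int(b.get("year")) <= 2000]
titles = [b.findtext("title") for b in selected]
print(titles)
```

The whole book element is returned, so its attributes and children come along unchanged.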

Join - Sample Data
    LET $review0 :=
    <reviews>
      <book>
        <title>XML Query</title>
        <review>A darn fine book.</review>
      </book>
      <book>
        <title>Data on the Web</title>
        <review>This is great!</review>
      </book>
    </reviews> : Reviews

Join
    FOR $b IN $bib0/book, $r IN $review0/book
    WHERE $b/title/data() = $r/title/data()
    RETURN <book>{ $b/title, $b/author, $r/review }</book>
    ==> <book>
          <title>Data on the Web</title>
          <author>Abiteboul</author>
          <author>Buneman</author>
          <author>Suciu</author>
          <review>This is great!</review>
        </book>,
        <book>
          <title>XML Query</title>
          <author>Fernandez</author>
          <author>Suciu</author>
          <review>A darn fine book.</review>
        </book>
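The join pairs each book with the review that carries the same title. A nested-loop sketch of the same logic, assuming Python's standard xml.etree.ElementTree and the two sample documents from the slides (variable names mirror $bib0 and $review0):

```python
import xml.etree.ElementTree as ET

bib0 = ET.fromstring("""
<bib>
  <book year="1999"><title>Data on the Web</title>
    <author>Abiteboul</author><author>Buneman</author><author>Suciu</author></book>
  <book year="2001"><title>XML Query</title>
    <author>Fernandez</author><author>Suciu</author></book>
</bib>
""")
review0 = ET.fromstring("""
<reviews>
  <book><title>XML Query</title><review>A darn fine book.</review></book>
  <book><title>Data on the Web</title><review>This is great!</review></book>
</reviews>
""")

joined = []
for b in bib0.findall("book"):            # FOR $b IN $bib0/book,
    for r in review0.findall("book"):     #     $r IN $review0/book
        if b.findtext("title") == r.findtext("title"):   # WHERE titles are equal
            out = ET.Element("book")      # RETURN <book>{ title, authors, review }</book>
            out.append(b.find("title"))
            out.extend(b.findall("author"))
            out.append(r.find("review"))
            joined.append(out)

pairs = [(j.findtext("title"), j.findtext("review")) for j in joined]
print(pairs)
```

The output follows the order of the outer loop over the bibliography, so "Data on the Web" comes first even though its review is listed second.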

Restructuring
    FOR $a IN distinct-value($bib0/book/author/data())
    RETURN
      <biblio>
        <author>{ $a }</author>
        { FOR $b IN $bib0/book, $a2 IN $b/author/data()
          WHERE $a = $a2
          RETURN $b/title }
      </biblio>

Restructuring (cont.)
    ==> <biblio>
          <author>Abiteboul</author>
          <title>Data on the Web</title>
        </biblio>,
        <biblio>
          <author>Buneman</author>
          <title>Data on the Web</title>
        </biblio>,
        <biblio>
          <author>Suciu</author>
          <title>Data on the Web</title>
          <title>XML Query</title>
        </biblio>,
        <biblio>
          <author>Fernandez</author>
          <title>XML Query</title>
        </biblio>
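Restructuring inverts the hierarchy: books grouped under authors instead of authors under books. The sketch below reproduces that grouping with Python's standard xml.etree.ElementTree. Keeping distinct authors in first-seen order is an assumption made here so that the output matches the order shown on the slide.

```python
import xml.etree.ElementTree as ET

bib0 = ET.fromstring("""
<bib>
  <book year="1999"><title>Data on the Web</title>
    <author>Abiteboul</author><author>Buneman</author><author>Suciu</author></book>
  <book year="2001"><title>XML Query</title>
    <author>Fernandez</author><author>Suciu</author></book>
</bib>
""")

# Distinct author values, kept in first-seen (document) order.
authors = []
for a in bib0.findall("book/author"):
    if a.text not in authors:
        authors.append(a.text)

# For each author, collect the titles of the books that list that author.
biblio = {
    a: [b.findtext("title") for b in bib0.findall("book")
        if a in [x.text for x in b.findall("author")]]
    for a in authors
}
print(biblio)
```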

Details (cont.)
• No schema required
  - An XML query should be usable on XML data when no schema (DTD or XML Schema) is known in advance.
  - But it should be able to exploit the schema if one is available.
• Preserve order and association
  - An XML query should preserve the order and association of elements in XML data (why?)

Details (cont.)
• Mutually embedding with XML
  - An XML query should be able to contain arbitrary XML data, and an XML document should be able to hold arbitrary XML queries.
• Support for new datatypes
  - The XML query language should have an extension mechanism for conditions and operations specific to a particular datatype (e.g. multimedia data).

Details (cont.)
• Suitable for metadata
  - XML queries should be useful as a part of metadata descriptions (how?)
  - Question: how about metadata in a relational database?
• The current version MUST NOT preclude the ability to add update capabilities in future versions.

Your Idea?
• Any other characteristics you desire?

XQuery Language
• Overview
• XPath
• XQuery 1.0 semantics
• Future work for XQuery

Overview
• Combines the best features of XPath and SQL with ideas borrowed from object query languages.

XPath
• A language for navigating within tree-structured documents
• XPath data model
  - An XML document is viewed as a tree.
  - Elements, attributes, text and comments are nodes of the tree.

Navigation in XPath
• Operators
  - Root: /
  - Parent: ..
  - Child (descendant): / or //
  - Attribute value: @
  - Comment: comment() function
  - Text: text() function
  - Element: <element name>
• Wildcards
  - *: all element children of a node irrespective of type, not including text nodes
  - @*: all attributes
  - //: all descendants of the current node

XPath expression
• A combination of XPath operators
• Input: a document tree
• Output: a set of nodes
• Absolute path expression: starts from the root node
• Relative path expression: starts from the current node

XPath query
• Selection conditions
• Built-in functions
• Aggregate functions

Example - XML file
    <students>
      <student studid="996341111">
        <name><first>John</first><last>Doe</last></name>
        <status>U2</status>
        <crstaken crscode="CS503" semester="S2002"/>
        <crstaken crscode="CS561" semester="S2002"/>
      </student>
      <student studid="996342222">
        <name><first>Bart</first><last>Simpson</last></name>
        <status>U4</status>
        <crstaken crscode="CS504" semester="S2002"/>
      </student>
    </students>

XPath Document Tree
[Figure: document tree for the students example. The root has comment nodes and the students element; each student node has studid, name (first, last), status and crstaken (crscode, semester) nodes, with text leaves such as John, Doe and U2.]

Example - XPath Query
• //student[status="U2" and starts-with(.//last, "D") and not(.//last = .//first)]
• //student[count(crstaken) >= 5]
• //student[crstaken/@crscode="CS561"][crstaken/@semester="S2002"]
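To see what the three queries select on the sample data, their predicates can be spelled out by hand. The sketch uses Python's standard xml.etree.ElementTree, which supports only a subset of XPath (no starts-with() or count()), so those functions are replaced here by plain Python; q1, q2 and q3 are names chosen for this illustration.

```python
import xml.etree.ElementTree as ET

students = ET.fromstring("""
<students>
  <student studid="996341111">
    <name><first>John</first><last>Doe</last></name>
    <status>U2</status>
    <crstaken crscode="CS503" semester="S2002"/>
    <crstaken crscode="CS561" semester="S2002"/>
  </student>
  <student studid="996342222">
    <name><first>Bart</first><last>Simpson</last></name>
    <status>U4</status>
    <crstaken crscode="CS504" semester="S2002"/>
  </student>
</students>
""")

# //student[status="U2" and starts-with(.//last, "D") and not(.//last = .//first)]
q1 = [s.get("studid") for s in students.iter("student")
      if s.findtext("status") == "U2"
      and s.findtext(".//last").startswith("D")
      and s.findtext(".//last") != s.findtext(".//first")]

# //student[count(crstaken) >= 5]
q2 = [s.get("studid") for s in students.iter("student")
      if len(s.findall("crstaken")) >= 5]

# //student[crstaken/@crscode="CS561"][crstaken/@semester="S2002"]
q3 = [s.get("studid") for s in students.iter("student")
      if any(c.get("crscode") == "CS561" for c in s.findall("crstaken"))
      and any(c.get("semester") == "S2002" for c in s.findall("crstaken"))]

print(q1, q2, q3)
```

In the third query the two predicates are tested independently, so they may be satisfied by different crstaken children of the same student.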

Why is XPath not satisfying?
• It is meant just for navigation and can only support limited queries:
  - Cannot express joins
  - Cannot work on multiple XML documents
  - Cannot filter unwanted elements
  - Does not support user-defined functions
  - Does not support importing and using the types defined in various XML schemas
• Any other limitations you can think of?

A better candidate for XML Query?
• The XQuery language incorporates all the above characteristics.
• Any other characteristics you can think of?
• XQuery engine: Kweelt
  - http://kweelt.sourceforge.net/

XQuery expressions
• Path expressions
• FLWR expressions
• Element constructors
• Expressions involving operators and functions
• Conditional expressions
• Quantified expressions
• List constructors
• Expressions that test or modify datatypes

XQuery FLWR Expressions
• A FLWR expression binds some expressions, applies a predicate, and constructs a new result.
  - FOR var IN expr … and LET var := expr …: the FOR and LET clauses generate a list of tuples of bound expressions, preserving document order.
  - WHERE expr …: the WHERE clause applies a predicate, eliminating some of the tuples.
  - RETURN expr …: the RETURN clause is executed for each surviving tuple, generating an ordered list of outputs.

Example - DTD
    <!ELEMENT reviews (entry*)>
    <!ELEMENT entry (title, price, review)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT price (#PCDATA)>
    <!ELEMENT review (#PCDATA)>

Example - Sample Data
http://www.amazon.com/reviews.xml
    <reviews>
      <entry>
        <title>Data on the Web</title>
        <price>34.95</price>
        <review>a good discussion of database systems and XML.</review>
      </entry>
      <entry>
        <title>Advanced Unix Programming</title>
        <price>65.95</price>
        <review>a good discussion of UNIX programming.</review>
      </entry>
    </reviews>

Example - Request
• For each book found at both www.bn.com and www.amazon.com, list the title of the book and its price from each source.

Example - Query
    <books-with-prices>
    {
      for $b in document("www.bn.com/bib.xml")//book,
          $a in document("www.amazon.com/reviews.xml")//entry
      where $b/title = $a/title
      return
        <book-with-prices>
          { $b/title }
          <price-amazon>{ $a/price/data() }</price-amazon>
          <price-bn>{ $b/price/data() }</price-bn>
        </book-with-prices>
    }
    </books-with-prices>

Example - Result
    <books-with-prices>
      <book-with-prices>
        <title>Advanced Unix Programming</title>
        <price-amazon>65.95</price-amazon>
        <price-bn>65.95</price-bn>
      </book-with-prices>
      <book-with-prices>
        <title>Data on the Web</title>
        <price-amazon>34.95</price-amazon>
        <price-bn>39.95</price-bn>
      </book-with-prices>
    </books-with-prices>

Use Cases
• Use Case 1: Queries that preserve hierarchy
• Use Case 2: Access to relational data

Use Case 1: Queries that preserve hierarchy
• An XML document has a flexible structure:
  - Text is mixed with elements.
  - Many elements are optional.
  - There is wide variation in structure from one document to another.
• The ways in which elements are ordered and nested are quite important. (Can you give me an example?)

Use Case 1 - DTD
    <!DOCTYPE book [
      <!ELEMENT book (title, author+, section+)>
      <!ELEMENT title (#PCDATA)>
      <!ELEMENT author (#PCDATA)>
      <!ELEMENT section (title, (p | figure | section)* )>
      <!ATTLIST section
          id ID #IMPLIED
          difficulty CDATA #IMPLIED>
      <!ELEMENT p (#PCDATA)>
      <!ELEMENT figure (title, image)>
      <!ATTLIST figure
          width CDATA #REQUIRED
          height CDATA #REQUIRED >
      <!ELEMENT image EMPTY>
      <!ATTLIST image
          source CDATA #REQUIRED >
    ]>

    <book>
      <title>Data on the Web</title>
      <author>Serge Abiteboul</author>
      <author>Peter Buneman</author>
      <section id="intro" difficulty="easy">
        <title>Introduction</title>
        <p>Text...</p>
        <section>
          <title>Audience</title>
          <p>Text...</p>
        </section>
        <section>
          <title>Web Data and the Two Cultures</title>
          <p>Text...</p>
          <figure height="400" width="400">
            <title>Traditional client/server architecture</title>
            <image source="csarch.gif"/>
          </figure>
          <p>Text...</p>
        </section>
      </section>
      <section id="syntax" difficulty="medium">
        ...
      </section>
    </book>

Use Case 1 - Request
• List all the sections and their titles. Preserve the original attributes of each <section> element, if any.
• Questions
  - Do we need all the elements?
  - How could we eliminate unwanted elements?
  - How could we preserve the original attributes?

Use Case 1 - Solution
    <toc>
    {
      LET $b := document("book1.xml")
      RETURN filter($b//section | $b//section/title/data())
    }
    </toc>

Use Case 1 - Result
    <toc>
      <section id="intro" difficulty="easy">
        <title>Introduction</title>
        <section>
          <title>Audience</title>
        </section>
        <section>
          <title>Web Data and the Two Cultures</title>
        </section>
      </section>
      <section id="syntax" difficulty="medium">
        ...
      </section>
    </toc>

Use Case 2 - Access to Relational Data
Questions
- How do we represent relational tables as XML documents?
- Do we need multiple XML documents?
- How does XQuery work on multiple XML documents?

Use Case 2 - Access to Relational Data
• Represent each database table as an XML document
  - Document element <-> table
  - Tuple <-> nested element
  - Column <-> nested element inside the tuple element
• Columns that allow null values are represented by optional elements, and a missing element denotes a null value
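The mapping above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper `table_to_xml` is hypothetical, rows are plain tuples, and `None` stands in for SQL NULL. The second sample row (U09) is made up purely to show a null rating.

```python
import xml.etree.ElementTree as ET


def table_to_xml(table_name, tuple_name, columns, rows):
    """Build one XML document per table: table -> root, tuple -> child, column -> grandchild."""
    root = ET.Element(table_name)
    for row in rows:
        tuple_el = ET.SubElement(root, tuple_name)
        for column, value in zip(columns, row):
            if value is None:
                continue  # a null value is represented by leaving the element out
            ET.SubElement(tuple_el, column).text = str(value)
    return root


users = table_to_xml(
    "users",
    "user_tuple",
    ["userid", "name", "rating"],
    [
        ("U01", "Tom Jones", "B"),
        ("U09", "No Rating", None),  # hypothetical row with a null rating
    ],
)
print(ET.tostring(users, encoding="unicode"))
```

The second <user_tuple> has no <rating> child at all, which is why the corresponding DTD has to declare that element as optional.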

Use Case 2 - Online Auction
Tables
• USERS (userid, name, rating)
  - Contains info on registered users
• ITEMS (itemno, description, offered_by, start_date, end_date, reserve_price)
  - Lists items currently or recently for sale
• BIDS (userid, itemno, bid, bid_date)
  - Contains all bids on record

Simplified E-R Diagram
[Diagram: USERS (key userid) and ITEMS (key itemno) are connected through BIDS, which carries both userid and itemno.]

Use Case 2 - DTD
<!DOCTYPE users [
  <!ELEMENT users (user_tuple*)>
  <!ELEMENT user_tuple (userid, name, rating?)>
  <!ELEMENT userid (#PCDATA)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT rating (#PCDATA)>
]>

Use Case 2 - DTD
<!DOCTYPE items [
  <!ELEMENT items (item_tuple*)>
  <!ELEMENT item_tuple (itemno, description, offered_by, start_date?, end_date?, reserve_price?)>
  <!ELEMENT itemno (#PCDATA)>
  <!ELEMENT description (#PCDATA)>
  <!ELEMENT offered_by (#PCDATA)>
  <!ELEMENT start_date (#PCDATA)>
  <!ELEMENT end_date (#PCDATA)>
  <!ELEMENT reserve_price (#PCDATA)>
]>

Use Case 2 - DTD
<!DOCTYPE bids [
  <!ELEMENT bids (bid_tuple*)>
  <!ELEMENT bid_tuple (userid, itemno, bid, bid_date)>
  <!ELEMENT userid (#PCDATA)>
  <!ELEMENT itemno (#PCDATA)>
  <!ELEMENT bid (#PCDATA)>
  <!ELEMENT bid_date (#PCDATA)>
]>

Use Case 2 - Sample Data
USERS
USERID  NAME         RATING
U01     Tom Jones    B
U02     Mary Doe     A
U04     Roger Smith  C
U05     Rip Sprat    B

Use Case 2 - Sample Data
ITEMS
ITEMNO  DESCRIPTION  OFFERED_BY  START_DATE  END_DATE  RESERVE_PRICE
1001    Red Bicycle  U01         01-01-05    01-01-20  40
1002    Motorcycle   U02         01-02-11    01-03-15  500
1003    Old Bicycle  U02         01-01-10    01-02-20  25

Use Case 2 - Sample Data
BIDS
ITEMNO  BID   BID_DATE
1001    35    01-01-07
1001    40    01-01-08
1001    45    01-01-11
1001    55    01-01-15
1002    400   01-02-14
1002    600   01-02-16
1002    1000  01-02-25
1002    1200  01-03-02
1003    15    01-01-22
1003    20    01-02-03
USERID (as extracted, row alignment lost): U02, U04, U01, U02, U04, U05

Use Case 2 - Request
• For all bicycles, list the item number, description, and highest bid (if any), ordered by item number.

Use Case 2 - Solution
<result>
{
  for $i in document("items.xml")//item_tuple
  let $b := document("bids.xml")//bid_tuple[itemno = $i/itemno]
  where contains($i/description, "Bicycle")
  return
    <item_tuple>
      { $i/itemno }
      { $i/description }
      <high_bid>{ max($b/bid) }</high_bid>
    </item_tuple>
  sortby(itemno)
}
</result>
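For readers without an XQuery processor, the same join across two documents can be approximated in Python. This is a rough sketch under assumptions, not the slide's implementation: the two documents are inlined as strings and trimmed to the fields the query touches, only a subset of the sample bids is included, and the function name `highest_bids` is hypothetical. The sketch writes the maximum as plain text inside <high_bid> and leaves <high_bid> empty when an item has no bids.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-ins for items.xml and bids.xml (assumed, subset of the sample data).
ITEMS = """<items>
  <item_tuple><itemno>1003</itemno><description>Old Bicycle</description></item_tuple>
  <item_tuple><itemno>1001</itemno><description>Red Bicycle</description></item_tuple>
  <item_tuple><itemno>1002</itemno><description>Motorcycle</description></item_tuple>
</items>"""

BIDS = """<bids>
  <bid_tuple><itemno>1001</itemno><bid>35</bid></bid_tuple>
  <bid_tuple><itemno>1001</itemno><bid>55</bid></bid_tuple>
  <bid_tuple><itemno>1002</itemno><bid>1200</bid></bid_tuple>
  <bid_tuple><itemno>1003</itemno><bid>15</bid></bid_tuple>
  <bid_tuple><itemno>1003</itemno><bid>20</bid></bid_tuple>
</bids>"""


def highest_bids(items_root, bids_root, keyword="Bicycle"):
    """For items whose description contains keyword, report the highest bid, ordered by itemno."""
    result = ET.Element("result")
    # Mirrors sortby(itemno); item numbers are assumed to be numeric.
    items = sorted(items_root.iter("item_tuple"), key=lambda i: int(i.findtext("itemno")))
    for item in items:
        description = item.findtext("description", default="")
        if keyword not in description:  # mirrors the where/contains clause
            continue
        itemno = item.findtext("itemno")
        # Mirrors the let clause: all bids whose itemno matches this item.
        amounts = [
            int(b.findtext("bid"))
            for b in bids_root.iter("bid_tuple")
            if b.findtext("itemno") == itemno
        ]
        out = ET.SubElement(result, "item_tuple")
        ET.SubElement(out, "itemno").text = itemno
        ET.SubElement(out, "description").text = description
        high = ET.SubElement(out, "high_bid")
        if amounts:
            high.text = str(max(amounts))
    return result


result = highest_bids(ET.fromstring(ITEMS), ET.fromstring(BIDS))
print(ET.tostring(result, encoding="unicode"))
```

The nested scan over all bids for each item is quadratic; it is kept here because it follows the for/let structure of the query line by line.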

Use Case 2 - Result (Bingo!)
<result>
  <item_tuple>
    <itemno>1001</itemno>
    <description>Red Bicycle</description>
    <high_bid> <bid>55</bid> </high_bid>
  </item_tuple>
  <item_tuple>
    <itemno>1003</itemno>
    <description>Old Bicycle</description>
    <high_bid> <bid>20</bid> </high_bid>
  </item_tuple>
</result>

Future Work about XQuery
• Add support for new desired characteristics
  - What are they?
• Any other future work?

Bibliography
• Chapter 17, XML and Web Data
• XML Query Requirements
  - http://www.w3.org/TR/2001/WD-xmlquery-req-20010215
• XML Query Use Cases, W3C Working Draft 20 December 2001
  - http://www.w3.org/TR/2001/WD-xmlquery-use-cases-20011220
• Database Desiderata for an XML Query Language, David Maier, Oregon Graduate Institute