XML-QL: A Query Language for XML, Charuta Nakhe

















XML-QL: A Query Language for XML
Charuta Nakhe
charuta@cse.iitb.ernet.in
Querying XML documents
• What is a query language?
• Why not adapt SQL or OQL to query XML data?
• What is an XML query?
  • What is the database? XML documents
  • What is the input to the query? An XML document
  • What is the output of the query? An XML document
Requirements of an XML query language
• Query operations:
  • Selection: e.g., find books with "S. Sudarshan" as author
  • Extraction: e.g., extract the publisher field of those books
  • Restructuring: restructuring of elements
  • Combination: queries over more than one document
• Must be able to transform and create XML structures
• Must be able to query even in the absence of a schema
The XML-QL language
• XML-QL is designed with the following features:
  • It is declarative, like SQL.
  • It is relationally complete; e.g., it can express joins.
  • It can be implemented with known database techniques.
  • It can extract data from existing XML documents and construct new XML documents.
• XML-QL is implemented as a prototype and is freely available in a Java version.
Example XML document
    <bib>
      <book year="1997">
        <title>Inside COM</title>
        <author>Dale Rogerson</author>
        <publisher><name>Microsoft</name></publisher>
      </book>
      <book year="1998">
        <title>Database system concepts</title>
        <author>S. Sudarshan</author>
        <author>H. Korth</author>
        <publisher>
Matching data using patterns
• Find those authors who have published books for McGraw Hill:
    WHERE <bib><book>
            <publisher><name>McGraw Hill</></>
            <title>$t</>
            <author>$a</>
          </book></bib> IN "bib.xml"
    CONSTRUCT <result><title>$t</><author>$a</></>
• $t and $a are variables that pick out element contents.
• The output is a collection of <result> elements, one for each matching title and author pair.
Result XML document
    <result>
      <title>Database system concepts</title>
      <author>S. Sudarshan</author>
    </result>
    <result>
      <title>Database system concepts</title>
      <author>H. Korth</author>
    </result>
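The pattern match above can be imitated in ordinary code. The following is a minimal Python sketch using the standard-library `xml.etree.ElementTree`, not the XML-QL prototype itself. The sample data is an assumption: the second book's publisher is taken to be McGraw Hill, as the result slide suggests, and the function name `match_books` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed sample data modelled on the example document; the second
# book's publisher (McGraw Hill) is an assumption made for this sketch.
BIB_XML = """
<bib>
  <book year="1997">
    <title>Inside COM</title>
    <author>Dale Rogerson</author>
    <publisher><name>Microsoft</name></publisher>
  </book>
  <book year="1998">
    <title>Database system concepts</title>
    <author>S. Sudarshan</author>
    <author>H. Korth</author>
    <publisher><name>McGraw Hill</name></publisher>
  </book>
</bib>
"""


def match_books(bib_root, publisher_name):
    """Imitate WHERE <bib><book>...</book></bib> CONSTRUCT <result>...</>."""
    results = []
    for book in bib_root.findall("book"):
        names = [n.text for n in book.findall("publisher/name")]
        if publisher_name not in names:
            continue  # the publisher pattern does not match this book
        # Every (title, author) combination is one binding of ($t, $a).
        for title in book.findall("title"):
            for author in book.findall("author"):
                result = ET.Element("result")
                ET.SubElement(result, "title").text = title.text
                ET.SubElement(result, "author").text = author.text
                results.append(result)
    return results


results = match_books(ET.fromstring(BIB_XML), "McGraw Hill")
for r in results:
    print(ET.tostring(r, encoding="unicode"))
```

Each binding yields its own `<result>` element, which is why a book with two authors appears twice.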
Grouping with nested queries
• Group results by book title:
    WHERE <bib.book>$p</> IN "bib.xml",
          <title>$t</>
          <publisher><name>McGraw Hill</></> IN $p
    CONSTRUCT <result>
                <title>$t</>
                WHERE <author>$a</> IN $p
                CONSTRUCT <author>$a</>
              </>
• Produces one result per title; each result contains a list of all the authors of that book.
Result XML document
    <result>
      <title>Database system concepts</title>
      <author>S. Sudarshan</author>
      <author>H. Korth</author>
    </result>
    ...
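The grouping behaviour of the nested query can be sketched the same way. This is a minimal Python illustration under stated assumptions, not the XML-QL evaluator: the sample data and the function name `group_by_title` are hypothetical, and the McGraw Hill publisher is assumed.

```python
import xml.etree.ElementTree as ET

# Assumed sample data; the publisher of the second book is an assumption.
BIB_XML = """
<bib>
  <book year="1997">
    <title>Inside COM</title>
    <author>Dale Rogerson</author>
    <publisher><name>Microsoft</name></publisher>
  </book>
  <book year="1998">
    <title>Database system concepts</title>
    <author>S. Sudarshan</author>
    <author>H. Korth</author>
    <publisher><name>McGraw Hill</name></publisher>
  </book>
</bib>
"""


def group_by_title(bib_root, publisher_name):
    """Build one <result> per title, holding all authors of that book."""
    results = []
    for book in bib_root.findall("book"):            # plays the role of $p
        names = [n.text for n in book.findall("publisher/name")]
        if publisher_name not in names:
            continue
        for title in book.findall("title"):          # plays the role of $t
            result = ET.Element("result")
            ET.SubElement(result, "title").text = title.text
            # The nested query: iterate over authors inside the same $p.
            for author in book.findall("author"):    # plays the role of $a
                ET.SubElement(result, "author").text = author.text
            results.append(result)
    return results


grouped = group_by_title(ET.fromstring(BIB_XML), "McGraw Hill")
for r in grouped:
    print(ET.tostring(r, encoding="unicode"))
```

The inner loop runs inside the element being constructed, so authors accumulate under a single `<result>` instead of producing one result each.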
Constructing XML data
• Results of a query can be wrapped in XML:
    WHERE <bib.book>
            <publisher><name>McGraw Hill</></>
            <title>$t</>
            <author>$a</>
          </> IN "bib.xml"
    CONSTRUCT <result><author>$a</><title>$t</></>
• Results are grouped in <result> elements.
• The pattern matches once for each author, so a book with several authors appears in several results.
Joining elements by value
• Find all articles that have at least one author who has also written a book since 1995:
    WHERE <bib.article>
            <author>$n</>
          </> CONTENT_AS $a IN "bib.xml",
          <book year=$y>
            <author>$n</>
          </> IN "bib.xml",
          $y > 1995
    CONSTRUCT <article>$a</>
• CONTENT_AS $a following a pattern binds the content of the matching element to the variable $a.
• Using the same variable $n in both patterns joins articles and books on the author's name.
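A value join of this kind can be approximated in Python as follows. This is only a sketch: the sample data (including the article entries) and the function name `articles_by_recent_book_authors` are assumptions, and unlike a literal evaluation of the query, which would produce one result per matching binding, the sketch emits each qualifying article once.

```python
import copy
import xml.etree.ElementTree as ET

# Assumed sample data; the articles and the 1993 book are invented
# purely to exercise the join.
BIB_XML = """
<bib>
  <book year="1998">
    <title>Database system concepts</title>
    <author>S. Sudarshan</author>
    <author>H. Korth</author>
  </book>
  <book year="1993">
    <title>Old Book</title>
    <author>A. Writer</author>
  </book>
  <article><title>Article One</title><author>H. Korth</author></article>
  <article><title>Article Two</title><author>A. Writer</author></article>
</bib>
"""


def articles_by_recent_book_authors(bib_root, min_year):
    """Return copies of articles sharing an author with a book newer than min_year."""
    # Collect the author names ($n) of all books with year > min_year.
    recent_authors = {
        author.text
        for book in bib_root.findall("book")
        if int(book.get("year", "0")) > min_year
        for author in book.findall("author")
    }
    matches = []
    for article in bib_root.findall("article"):
        if any(a.text in recent_authors for a in article.findall("author")):
            # CONTENT_AS $a: copy the article's content into a new element.
            new_article = ET.Element("article")
            new_article.text = article.text
            new_article.extend([copy.deepcopy(child) for child in article])
            matches.append(new_article)
    return matches


articles = articles_by_recent_book_authors(ET.fromstring(BIB_XML), 1995)
for a in articles:
    print(ET.tostring(a, encoding="unicode"))
```

Building the set of author names first corresponds to evaluating the book pattern once and then probing it for each article, much as a hash join would.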
Tag variables
• Find all publications in 1995 where Smith is either an author or an editor:
    WHERE <bib.$p>
            <title>$t</>
            <year>1995</>
            <$e>Smith</>
          </> IN "bib.xml",
          $e IN {author, editor}
    CONSTRUCT <$p><title>$t</><$e>Smith</></>
• $p matches book and article.
• $e matches author and editor.
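Tag variables range over element names rather than contents. A minimal Python sketch of the idea, assuming invented sample data in which the year is a child element (as in the query) and using the hypothetical function name `publications_by`:

```python
import xml.etree.ElementTree as ET

# Assumed sample data, invented for this sketch.
BIB_XML = """
<bib>
  <book><title>Book A</title><year>1995</year><author>Smith</author></book>
  <article><title>Article B</title><year>1995</year><editor>Smith</editor></article>
  <article><title>Article C</title><year>1996</year><author>Smith</author></article>
  <book><title>Book D</title><year>1995</year><author>Jones</author></book>
</bib>
"""


def publications_by(bib_root, person, year, roles=("author", "editor")):
    """Imitate tag variables: $p is any child tag of <bib>, $e any role tag."""
    found = []
    for pub in bib_root:                      # pub.tag plays the role of $p
        if year not in [y.text for y in pub.findall("year")]:
            continue
        for title in pub.findall("title"):    # title.text plays the role of $t
            for child in pub:                 # child.tag plays the role of $e
                if child.tag in roles and child.text == person:
                    new_pub = ET.Element(pub.tag)
                    ET.SubElement(new_pub, "title").text = title.text
                    ET.SubElement(new_pub, child.tag).text = person
                    found.append(new_pub)
    return found


publications = publications_by(ET.fromstring(BIB_XML), "Smith", "1995")
for p in publications:
    print(ET.tostring(p, encoding="unicode"))
```

The output elements reuse whatever tag names were bound, so a matching book is emitted as `<book>` and a matching article as `<article>`.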
Regular-path expressions
• Find the name of every part element that contains a brand element equal to "Ford", regardless of the nesting level at which the part occurs:
    WHERE <part*>
            <name>$r</>
            <brand>Ford</>
          </> IN "bib.xml"
    CONSTRUCT <result>$r</>
• Regular path expressions can specify element paths of arbitrary depth.
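Matching at arbitrary depth can be approximated in Python with `Element.iter`, which visits every descendant with a given tag. This is a looser stand-in for the path expression `part*`, since it does not require the enclosing elements to be parts as well. The sample data and the function name `names_of_parts_with_brand` are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# Assumed sample data with nested parts, invented for this sketch.
PARTS_XML = """
<catalog>
  <part>
    <name>engine</name><brand>Ford</brand>
    <part><name>piston</name><brand>Ford</brand></part>
    <part><name>spark plug</name><brand>Bosch</brand></part>
  </part>
  <part><name>tyre</name><brand>Michelin</brand></part>
</catalog>
"""


def names_of_parts_with_brand(root, brand):
    """Return <result> elements naming every part, at any depth, with the brand."""
    found = []
    for part in root.iter("part"):                    # any nesting level
        # Only direct <brand> children count, as in the query pattern.
        if any(b.text == brand for b in part.findall("brand")):
            for name in part.findall("name"):         # name.text plays $r
                result = ET.Element("result")
                result.text = name.text
                found.append(result)
    return found


ford_parts = names_of_parts_with_brand(ET.fromstring(PARTS_XML), "Ford")
for r in ford_parts:
    print(ET.tostring(r, encoding="unicode"))
```

Both the top-level engine and the piston nested inside it are found, because the traversal does not depend on how deep a part sits.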
Other interesting features
• Constructing an explicit root element
• Grouping of data
• Transforming XML data
• Integrating data from different XML sources
Links for more information
• www.w3.org/TR/NOTE-xml-ql : The XML-QL W3C Note
• www.research.att.com/~mff/xmlql/doc : The XML-QL home page
• www.w3.org/XML/Activity.html#query-wg : The XML Query Working Group
• www.w3.org/TR/xmlquery-req : XML Query Requirements (W3C Working Draft)
• www.oasis-open.org/cover/xmlQuery.html : Robin Cover's page on XML query languages
Example DTD
    <!ELEMENT book (author+, title, publisher)>
    <!ATTLIST book year CDATA>
    <!ELEMENT article (author+, title, year?)>
    <!ATTLIST article type CDATA>
    <!ELEMENT publisher (name, address)>
    <!ELEMENT author (firstname?, lastname)>
Creating an explicit root element
• Every XML document must have a single root. XML-QL supplies an <XML> element as the default, but another root may be specified:
    CONSTRUCT <results> {
      WHERE <bib.book>
              <publisher><name>McGraw Hill</></>
              <title>$t</>
              <author>$a</>
            </> IN "bib.xml"
      CONSTRUCT <result><author>$a</><title>$t</></>
    } </results>
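Wrapping all results under one root can be sketched in Python by creating the root first and attaching each result to it. As before, this is an illustration under assumptions rather than the prototype's behaviour: the sample data (with an assumed McGraw Hill publisher) and the function name `build_results_document` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed sample data; the publisher of the second book is an assumption.
BIB_XML = """
<bib>
  <book year="1997">
    <title>Inside COM</title>
    <author>Dale Rogerson</author>
    <publisher><name>Microsoft</name></publisher>
  </book>
  <book year="1998">
    <title>Database system concepts</title>
    <author>S. Sudarshan</author>
    <author>H. Korth</author>
    <publisher><name>McGraw Hill</name></publisher>
  </book>
</bib>
"""


def build_results_document(bib_root, publisher_name):
    """Collect every <result> under a single explicit <results> root."""
    root = ET.Element("results")
    for book in bib_root.findall("book"):
        names = [n.text for n in book.findall("publisher/name")]
        if publisher_name not in names:
            continue
        for title in book.findall("title"):
            for author in book.findall("author"):
                # Same element order as the CONSTRUCT clause: author, then title.
                result = ET.SubElement(root, "result")
                ET.SubElement(result, "author").text = author.text
                ET.SubElement(result, "title").text = title.text
    return root


results_root = build_results_document(ET.fromstring(BIB_XML), "McGraw Hill")
print(ET.tostring(results_root, encoding="unicode"))
```

Because the root exists before any match is processed, the output is a single well-formed document even when the query matches nothing.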