XPath
Laks V. S. Lakshmanan
UBC CPSC 534B
Overview
• data model recap
• XPath examples
• some advanced features
• summary
XPath in the beginning
• used to be part of XSLT
  – A formal semantics of patterns in XSLT (Phil Wadler)
• also influenced XLink, XPointer
• resources:
  – www.w3c.org/TR/xpath
  – Galax (complete reference impl. of XQuery): http://db.bell-labs.com/galax/
• (w3c.org is a major resource for many XML and other web-related topics, incl. XQuery, semantic web, etc.)
Example XML DB
<bib>
  <book price="15">
    <title>What is the name of this book?</title>
    <author nationality="american">
      <first>Raymond</first>
      <last>Smullyan</last>
    </author>
    <publisher>Penguin</publisher>
    <year>1970</year>
  </book>
  <book>
    <publisher>
      <name>Bentam Books</name>
      <address>New York</address>
    </publisher>
    <author>
      <first>Douglas</first>
      <mi>R</mi>
      <last>Hofstadter</last>
    </author>
    <author>D. C. Dennett</author>
    <title>The Mind’s I: Reflections on Self and Soul</title>
    <year>1981</year>
  </book>
</bib>
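Corresponding Tree
[Figure: tree for the example document. The root node has comments, processing instructions, and the document element bib as children; bib has book children; each book has a price attribute and element children such as title, author, publisher.]
• note the distinction between the two roots: the root node vs. the document (root) element.
• attributes: unordered; usually single valued. Exception: IDREFS.
• elements: ordered; no a priori cardinality constraint.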
Simple Examples
• /bib/book/publisher
• answer:
  <publisher>Penguin</publisher>
  <publisher><name>Bentam Books</name><address>New York</address></publisher>
• /bib/book/author/name
• what’s the answer?
• / returns the root node, while /bib returns the document element, i.e., the bib element under the root.
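The two simple paths above can be tried out with a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree module. ElementTree implements only a subset of XPath and does not accept absolute paths on an element, so the paths are written relative to the bib element. The names DOC, bib, publishers, and author_names are ours, not from the slides.

```python
import xml.etree.ElementTree as ET

# The slides' example document, with attribute values quoted so that it parses.
DOC = """<bib>
  <book price="15">
    <title>What is the name of this book?</title>
    <author nationality="american"><first>Raymond</first><last>Smullyan</last></author>
    <publisher>Penguin</publisher>
    <year>1970</year>
  </book>
  <book>
    <publisher><name>Bentam Books</name><address>New York</address></publisher>
    <author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
    <author>D. C. Dennett</author>
    <title>The Mind's I: Reflections on Self and Soul</title>
    <year>1981</year>
  </book>
</bib>"""

bib = ET.fromstring(DOC)  # the document element <bib>

# /bib/book/publisher, written relative to the bib element
publishers = bib.findall("book/publisher")
for p in publishers:
    print(ET.tostring(p, encoding="unicode").strip())

# /bib/book/author/name: no author has a name child in this document
author_names = bib.findall("book/author/name")
print(author_names)
```

Since no author element has a name child, the second query selects nothing.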
Descendants
• /bib/book//address
• answer: <address>New York</address>
• /bib/book//mi
• answer: <mi>R</mi>
• //title
• answer:
  <title>What is the name of this book?</title>
  <title>The Mind’s I: Reflections on Self and Soul</title>
• Note: results are ordered as per the input document order.
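A small sketch of the descendant queries above, again assuming Python's xml.etree.ElementTree (which supports // inside relative paths). The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The slides' example document, with attribute values quoted so that it parses.
DOC = """<bib>
  <book price="15">
    <title>What is the name of this book?</title>
    <author nationality="american"><first>Raymond</first><last>Smullyan</last></author>
    <publisher>Penguin</publisher>
    <year>1970</year>
  </book>
  <book>
    <publisher><name>Bentam Books</name><address>New York</address></publisher>
    <author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
    <author>D. C. Dennett</author>
    <title>The Mind's I: Reflections on Self and Soul</title>
    <year>1981</year>
  </book>
</bib>"""

bib = ET.fromstring(DOC)

# /bib/book//address and /bib/book//mi, relative to the bib element
addresses = [e.text for e in bib.findall("book//address")]
middle_initials = [e.text for e in bib.findall("book//mi")]

# //title: every title element at any depth, in document order
titles = [e.text for e in bib.findall(".//title")]

print(addresses, middle_initials, titles)
```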
Wildcard
• //author/*
• answer:
  <first>Raymond</first> <last>Smullyan</last>
  <first>Douglas</first> <mi>R</mi> <last>Hofstadter</last>
• why is information for only two authors returned?
• Note: * matches any element.
• what does //* return?
• is the answer identical to that for /bib?
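The wildcard queries can be explored with the following sketch, assuming Python's xml.etree.ElementTree. One difference to keep in mind: ElementTree's ".//*" excludes the context element itself, whereas XPath's //* evaluated from the root also selects bib. All names below are ours.

```python
import xml.etree.ElementTree as ET

# The slides' example document, with attribute values quoted so that it parses.
DOC = """<bib>
  <book price="15">
    <title>What is the name of this book?</title>
    <author nationality="american"><first>Raymond</first><last>Smullyan</last></author>
    <publisher>Penguin</publisher>
    <year>1970</year>
  </book>
  <book>
    <publisher><name>Bentam Books</name><address>New York</address></publisher>
    <author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
    <author>D. C. Dennett</author>
    <title>The Mind's I: Reflections on Self and Soul</title>
    <year>1981</year>
  </book>
</bib>"""

bib = ET.fromstring(DOC)

# //author/*: element children of every author.
# The author "D. C. Dennett" has only text content, so it contributes nothing.
author_children = [e.tag for e in bib.findall(".//author/*")]

# All elements strictly below bib (ElementTree's reading of ".//*") ...
below_bib = bib.findall(".//*")
# ... versus every element in the document, bib included (what //* selects).
all_elements = list(bib.iter())

print(author_children)
print(len(below_bib), len(all_elements))
```

Note that //* returns every element as a separate answer node, while /bib returns the single bib element (whose subtree happens to contain all the others).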
Attributes
• XML data model: different kinds of nodes: element, attribute, text, comment, processing instruction, ...
• /bib/book/@price
• answer: "15" (contrast with the answers for the previous queries, which were elements).
• /bib/book/@*
• what do you think it should return?
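ElementTree does not return attribute nodes from a path, so the sketch below only emulates the two attribute queries: it selects the book elements with a path and then reads their attributes through the element API. This is an approximation under that assumption, and the variable names are ours.

```python
import xml.etree.ElementTree as ET

# The slides' example document, with attribute values quoted so that it parses.
DOC = """<bib>
  <book price="15">
    <title>What is the name of this book?</title>
    <author nationality="american"><first>Raymond</first><last>Smullyan</last></author>
    <publisher>Penguin</publisher>
    <year>1970</year>
  </book>
  <book>
    <publisher><name>Bentam Books</name><address>New York</address></publisher>
    <author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
    <author>D. C. Dennett</author>
    <title>The Mind's I: Reflections on Self and Soul</title>
    <year>1981</year>
  </book>
</bib>"""

bib = ET.fromstring(DOC)

# /bib/book/@price: the price value of each book that has one
prices = [book.get("price") for book in bib.findall("book[@price]")]

# /bib/book/@*: every attribute of every book, as (name, value) pairs
all_attrs = [(name, value)
             for book in bib.findall("book")
             for name, value in book.attrib.items()]

print(prices, all_attrs)
```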
Branching & Qualifiers/Predicates
• /bib/book/author[mi]
• returns only the author that has an mi child (an author of the second book).
• /bib/book[author/@nationality='american']
• returns only the first book.
• /bib/book[publisher[address][name]][@price<20]//title
• returns the titles of books with a publisher who has a name & an address and with a price < 20.
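The predicates above can be sketched as follows, assuming Python's xml.etree.ElementTree. Its predicate support is limited (simple [tag] and [@attr='value'] tests; no nested predicates, no paths inside predicates, no numeric comparison), so the last two queries are emulated with ordinary Python filtering. The helper name cheap_titles is hypothetical.

```python
import xml.etree.ElementTree as ET

# The slides' example document, with attribute values quoted so that it parses.
DOC = """<bib>
  <book price="15">
    <title>What is the name of this book?</title>
    <author nationality="american"><first>Raymond</first><last>Smullyan</last></author>
    <publisher>Penguin</publisher>
    <year>1970</year>
  </book>
  <book>
    <publisher><name>Bentam Books</name><address>New York</address></publisher>
    <author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
    <author>D. C. Dennett</author>
    <title>The Mind's I: Reflections on Self and Soul</title>
    <year>1981</year>
  </book>
</bib>"""

bib = ET.fromstring(DOC)

# /bib/book/author[mi]: the answer nodes are author elements, not books
authors_with_mi = bib.findall("book/author[mi]")

# /bib/book[author/@nationality='american'], emulated in Python
american_books = [
    book for book in bib.findall("book")
    if any(a.get("nationality") == "american" for a in book.findall("author"))
]

def cheap_titles(root):
    """Emulates /bib/book[publisher[address][name]][@price<20]//title."""
    titles = []
    for book in root.findall("book"):
        has_full_publisher = any(
            p.find("address") is not None and p.find("name") is not None
            for p in book.findall("publisher")
        )
        price = book.get("price")
        if has_full_publisher and price is not None and float(price) < 20:
            titles.extend(book.iter("title"))
    return titles

print([a.find("last").text for a in authors_with_mi])
print([b.find("title").text for b in american_books])
print(cheap_titles(bib))
```

On the example document the third query is empty: the first book's publisher has no name or address, and the second book has no price attribute.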
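Reaching out at other nodes
• XPath has the node tests text() and node(), and the function name(). Meanings illustrated below.
• /bib/book/publisher/text()
• answer: Penguin
  – why doesn’t the second publisher appear?
• /bib/book/node()
• returns all child nodes of book, regardless of type (text, element, comment, processing instruction). Attributes are not children, so they are not returned.
• /bib/*/name()
  – returns the tag of the current element (a function call as a path step is XPath 2.0 syntax).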
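A rough emulation of text() and name() with Python's xml.etree.ElementTree is sketched below. ElementTree has no separate text nodes: the text directly inside an element, before its first child, is stored in the element's .text field, so this only approximates the XPath semantics. Variable names are ours.

```python
import xml.etree.ElementTree as ET

# The slides' example document, with attribute values quoted so that it parses.
DOC = """<bib>
  <book price="15">
    <title>What is the name of this book?</title>
    <author nationality="american"><first>Raymond</first><last>Smullyan</last></author>
    <publisher>Penguin</publisher>
    <year>1970</year>
  </book>
  <book>
    <publisher><name>Bentam Books</name><address>New York</address></publisher>
    <author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
    <author>D. C. Dennett</author>
    <title>The Mind's I: Reflections on Self and Soul</title>
    <year>1981</year>
  </book>
</bib>"""

bib = ET.fromstring(DOC)

# /bib/book/publisher/text(): text held directly by each publisher.
# The second publisher holds only element children, so it has no direct text.
publisher_texts = [p.text for p in bib.findall("book/publisher")
                   if p.text and p.text.strip()]

# /bib/*/name(): the tag name of each element child of bib
child_names = [child.tag for child in bib.findall("*")]

print(publisher_texts, child_names)
```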
Mixing it all
• /bib/book[author[hobby='tennis']][title/text()]//year
• what does it say?
• Features of XPath seen so far: tree pattern query.
[Figure: tree pattern with nodes $x, $y, $z, $w, where $x.tag=bib & $y.tag=comment & $z.tag=publisher ...; $z is the distinguished node.]
XPath – Summary
• / matches the root.
• /bib matches the bib element under the root.
• bib matches any bib element.
• * matches any element.
• bib/book matches any book element under a bib element.
• bib//book ditto, but at any depth.
• //book matches any book element at any depth in the document.
• author|editor matches any author or editor element.
• @hobby matches any hobby attribute.
• //author/@hobby matches any hobby attribute of an author at any depth of the doc.
• /bib/book[author[@hobby]][@price<20]//publisher what does it match?
XPath – The 13 axes
• child
• descendant
• descendant-or-self
• parent
• ancestor
• ancestor-or-self
• following
• following-sibling
• preceding
• preceding-sibling
• attribute
• self
• namespace
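To make a few of the axes concrete, here is a minimal sketch that computes the ancestor, following-sibling, and preceding-sibling axes by hand over an ElementTree document. ElementTree elements do not store a parent pointer, so the sketch first builds a child-to-parent map; the function names and the parent dictionary are our own, not a library API. Both sibling functions return nodes in document order.

```python
import xml.etree.ElementTree as ET

# The slides' example document, with attribute values quoted so that it parses.
DOC = """<bib>
  <book price="15">
    <title>What is the name of this book?</title>
    <author nationality="american"><first>Raymond</first><last>Smullyan</last></author>
    <publisher>Penguin</publisher>
    <year>1970</year>
  </book>
  <book>
    <publisher><name>Bentam Books</name><address>New York</address></publisher>
    <author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
    <author>D. C. Dennett</author>
    <title>The Mind's I: Reflections on Self and Soul</title>
    <year>1981</year>
  </book>
</bib>"""

bib = ET.fromstring(DOC)

# Child -> parent map for every element below bib.
parent = {child: p for p in bib.iter() for child in p}

def ancestors(node):
    """ancestor axis: parent, grandparent, ... up to the document element."""
    result = []
    while node in parent:
        node = parent[node]
        result.append(node)
    return result

def following_siblings(node):
    """following-sibling axis: later children of the same parent."""
    if node not in parent:
        return []
    siblings = list(parent[node])
    return siblings[siblings.index(node) + 1:]

def preceding_siblings(node):
    """preceding-sibling axis: earlier children of the same parent."""
    if node not in parent:
        return []
    siblings = list(parent[node])
    return siblings[:siblings.index(node)]

mi = bib.find(".//mi")
print([a.tag for a in ancestors(mi)])
print([s.tag for s in preceding_siblings(mi)], [s.tag for s in following_siblings(mi)])
```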
Some Abbreviations
• child::book/child::author  =  book/author
• child::book/descendant::mi  =  book//mi
• child::first/parent::*  =  first/..
• child::book/attribute::price  =  book/@price
• child::book/child::author/parent::*/child::year  =  book[author]/year
• /bib//mi[ancestor::book] ?
• /bib//mi/ancestor::book//publisher ?
• /bib//mi/ancestor::*//publisher ?
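The last abbreviation (going down to author and back up with the parent step is the same as a predicate [author]) can be checked on the example document with a short sketch. It assumes Python's xml.etree.ElementTree, which accepts only the abbreviated forms (.. and [tag]), not the full axis::test syntax. Names are illustrative.

```python
import xml.etree.ElementTree as ET

# The slides' example document, with attribute values quoted so that it parses.
DOC = """<bib>
  <book price="15">
    <title>What is the name of this book?</title>
    <author nationality="american"><first>Raymond</first><last>Smullyan</last></author>
    <publisher>Penguin</publisher>
    <year>1970</year>
  </book>
  <book>
    <publisher><name>Bentam Books</name><address>New York</address></publisher>
    <author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
    <author>D. C. Dennett</author>
    <title>The Mind's I: Reflections on Self and Soul</title>
    <year>1981</year>
  </book>
</bib>"""

bib = ET.fromstring(DOC)

# first/.. : the parents of first elements
parents_of_first = [e.tag for e in bib.findall(".//first/..")]

# book/author/../year versus book[author]/year
via_parent_step = bib.findall("book/author/../year")
via_predicate = bib.findall("book[author]/year")

print(parents_of_first)
print([y.text for y in via_parent_step], [y.text for y in via_predicate])
```

Both forms select the same year elements; the second book has two authors but is still reached only once.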
More examples
• /bib/descendant::*[name()='address']  =  /bib//address
• /bib//book//first/parent::*[name()='author']
• /bib//book//mi[ancestor::*[name()='author' or name()='editor']]
• navigation axes increase expressive power
• BUT, when the schema is known, XPath expressions can often be simplified
Simplifying XPE with schema
[Figure: example schema graph S. Nodes: bib; book; editor, author, title, publisher, year; first, mi, last; name, address. Edges carry cardinality labels (*, +, ?, 1).]
• /bib//book//first/parent::*[name()='author'] is equivalent, under S, to /bib//book//author[first]
• /bib//book//mi[ancestor::*[name()='author' or name()='editor']] is equivalent, under S, to /bib//book//*[name()='author' or name()='editor']//mi
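The first simplification can be sanity-checked on the example document with the sketch below, assuming Python's xml.etree.ElementTree. ElementTree has no parent::*[name()=...] syntax, so the left-hand side is emulated with a ".." step followed by a Python filter on the tag. Agreement on one document is only a spot check, not a proof of equivalence under the schema.

```python
import xml.etree.ElementTree as ET

# The slides' example document, with attribute values quoted so that it parses.
DOC = """<bib>
  <book price="15">
    <title>What is the name of this book?</title>
    <author nationality="american"><first>Raymond</first><last>Smullyan</last></author>
    <publisher>Penguin</publisher>
    <year>1970</year>
  </book>
  <book>
    <publisher><name>Bentam Books</name><address>New York</address></publisher>
    <author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
    <author>D. C. Dennett</author>
    <title>The Mind's I: Reflections on Self and Soul</title>
    <year>1981</year>
  </book>
</bib>"""

bib = ET.fromstring(DOC)

# /bib//book//first/parent::*[name()='author'], emulated:
# step up from each first element, then keep only parents tagged author.
with_parent_axis = [e for e in bib.findall(".//book//first/..") if e.tag == "author"]

# /bib//book//author[first]: the simplified form
simplified = bib.findall(".//book//author[first]")

print([a.find("last").text for a in with_parent_axis])
print([a.find("last").text for a in simplified])
```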
XPath, formally speaking
• An XPath expression (XPE) denotes a binary relation over document nodes: p(context node, answer node).
• basic cases:
  – "." is self, i.e., (x, x)
  – ".." is parent, i.e., (current node, its parent)
  – publisher/address is (current node, address node reachable from current node via a publisher child)
  – book/*/mi/../name()  =  book/*[mi]/name()
    what is the relationship captured by this XPE?
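The binary-relation view can be made concrete with a small sketch: evaluate a relative path from every element as context node and collect the (context, answer) pairs. It assumes Python's xml.etree.ElementTree for path evaluation; the function name relation is hypothetical. The ".." case is omitted because ElementTree cannot step above the element a search starts from.

```python
import xml.etree.ElementTree as ET

# The slides' example document, with attribute values quoted so that it parses.
DOC = """<bib>
  <book price="15">
    <title>What is the name of this book?</title>
    <author nationality="american"><first>Raymond</first><last>Smullyan</last></author>
    <publisher>Penguin</publisher>
    <year>1970</year>
  </book>
  <book>
    <publisher><name>Bentam Books</name><address>New York</address></publisher>
    <author><first>Douglas</first><mi>R</mi><last>Hofstadter</last></author>
    <author>D. C. Dennett</author>
    <title>The Mind's I: Reflections on Self and Soul</title>
    <year>1981</year>
  </book>
</bib>"""

bib = ET.fromstring(DOC)

def relation(path, doc):
    """Set of (context node, answer node) pairs denoted by a relative path."""
    return {(ctx, ans) for ctx in doc.iter() for ans in ctx.findall(path)}

# "." relates every node to itself.
self_pairs = relation(".", bib)

# publisher/address relates a node to an address under one of its publisher children.
pub_addr_pairs = relation("publisher/address", bib)

# book/*[mi]/name(), emulated: pairs of (context tag, name of a book child having mi).
name_pairs = {(ctx.tag, x.tag) for ctx in bib.iter() for x in ctx.findall("book/*[mi]")}

print(len(self_pairs))
print([(c.tag, a.text) for c, a in pub_addr_pairs])
print(name_pairs)
```

An absolute XPE then corresponds to fixing the context node to the root, which leaves a unary query over answer nodes.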
XPath, formally speaking
• Relative path expressions
  – every XPE E we have seen so far, except E may be used as a predicate or as an extension to an absolute XPE.
• Absolute XPE
  – how you get a (unary) query out of an XPE.
• E.g.: author/mi and publisher/address are relative XPEs. //author/mi and /bib/book[author/mi][publisher/address]//year are absolute XPEs.
• More details: see resources and stay tuned for homework.