Introduction to XPath

Sources
• XML Path Language (XPath) Version 1.0, http://www.w3.org/TR/xpath
• http://www.w3schools.com/xpath_examples.asp
• Essential XML Quick Reference (A. Skonnard and M. Gudgin)
• http://www.w3schools.com/xpath
• XML in a Nutshell, O'Reilly, Harold & Means

XPath 1.0
• Examples
• Data Model
• Syntax
• Location paths
• Expressions
• Functions
• Data Model for XPath 2.0 and XQuery 1.0

XPath Examples
A CD catalog with entries such as:
    <cd>
      <title>Hide your heart</title>
      <artist>Bonnie Tyler</artist>
      <country>UK</country>
      <company>CBS Records</company>
      <price>9.90</price>
      <year>1988</year>
    </cd>

/catalog/cd : selects all the cd nodes
    <cd>
      <title>Empire Burlesque</title>
      <artist>Bob Dylan</artist>
      <country>USA</country>
      <company>Columbia</company>
      <price>10.90</price>
      <year>1985</year>
    </cd>
    <cd>
      <title>Hide your heart</title>
      <artist>Bonnie Tyler</artist>
      <country>UK</country>
      <company>CBS Records</company>
      <price>9.90</price>
      <year>1988</year>
    </cd>
    …
File at www.cs.technion.ac.il/~oshmu/cd.xml

/catalog/cd[1] : selects the first cd node
    <cd>
      <title>Empire Burlesque</title>
      <artist>Bob Dylan</artist>
      <country>USA</country>
      <company>Columbia</company>
      <price>10.90</price>
      <year>1985</year>
    </cd>

/catalog/cd/price : selects price nodes
    <price>10.90</price> <price>9.90</price> <price>10.20</price>
    <price>9.90</price> <price>10.90</price> <price>8.10</price>
    <price>8.50</price> <price>10.80</price> <price>8.70</price>
    <price>10.90</price> <price>10.20</price> <price>8.70</price>
    <price>9.90</price> <price>8.20</price> <price>7.90</price>
    <price>8.90</price> <price>7.80</price> <price>9.90</price>
    <price>7.20</price> <price>7.80</price> <price>8.20</price>

/catalog/cd/price/text() : selects price text nodes
    10.90 9.90 10.20 9.90 10.90 8.10 8.50
    10.80 8.70 10.90 10.20 8.70 9.90 8.20
    7.90 8.90 7.80 9.90 7.20 7.80 8.20

/catalog/cd[price < 7.80] : selects cd nodes that have a price child whose text value, converted to a number, is less than 7.80
    <cd>
      <title>Picture book</title>
      <artist>Simply Red</artist>
      <country>EU</country>
      <company>Elektra</company>
      <price>7.20</price>
      <year>1985</year>
    </cd>

/catalog/cd[price < 7.80]/price : selects the price children of cd nodes that have a price less than 7.80
    <price>7.20</price>

/catalog/cd[price < 7.80]/price/text() : selects the text nodes within the price children of cd nodes that have a price less than 7.80
    7.20
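The path examples above can be tried out with Python's standard `xml.etree.ElementTree` module, which implements only a subset of XPath 1.0. The sketch below is an illustration under assumptions: the two-entry `CATALOG` string is a shortened stand-in for cd.xml, and the numeric predicate is emulated in Python because ElementTree predicates cannot compare numbers.

```python
import xml.etree.ElementTree as ET

# Shortened, assumed sample modelled on the catalog entries shown above.
CATALOG = """
<catalog>
  <cd><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd><title>Picture book</title><artist>Simply Red</artist><price>7.20</price></cd>
</catalog>
"""

catalog = ET.fromstring(CATALOG)          # the <catalog> document element

all_cds = catalog.findall("cd")           # /catalog/cd
first_cd = catalog.find("cd[1]")          # /catalog/cd[1]
prices = catalog.findall("cd/price")      # /catalog/cd/price
price_texts = [p.text for p in prices]    # /catalog/cd/price/text()

# /catalog/cd[price < 7.80], emulated with a Python filter because
# ElementTree has no numeric comparison inside predicates.
cheap_cds = [cd for cd in all_cds if float(cd.findtext("price")) < 7.80]
cheap_titles = [cd.findtext("title") for cd in cheap_cds]
```

Paths passed to `findall` are relative to the element it is called on, so `"cd"` here plays the role of `/catalog/cd`.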

Semantics of Location Paths
• A relative location path consists of a sequence of one or more location steps separated by /.
• The steps in a relative location path are composed together from left to right.
• Each step in turn selects a set of nodes relative to a context node.
• An initial sequence of steps is composed together with a following step as follows.
  – The initial sequence of steps selects a set of nodes relative to a context node.
  – Each node in that set is used as a context node for the following step.
  – The sets of nodes identified by that step are unioned together. The set of nodes identified by the composition of the steps is this union.

Semantics of Location Paths
• An absolute location path consists of / optionally followed by a relative location path.
• A / by itself selects the root node of the document containing the context node.
• If it is followed by a relative location path, then the location path selects:
  – the set of nodes that would be selected by the relative location path relative to the root node of the document containing the context node.
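The composition rule for steps can be sketched in a few lines of Python. This is a simplified illustration, not an XPath engine: `child_step` and `evaluate` are hypothetical helper names, only the child axis with a name test is modelled, and elements come from the standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

def child_step(name):
    """Return a step that selects the child elements called `name`."""
    return lambda node: [c for c in node if c.tag == name]

def evaluate(steps, context_node):
    """Compose steps left to right, unioning the per-node results."""
    current = [context_node]
    for step in steps:
        selected = []
        for node in current:                 # each node becomes a context node
            for hit in step(node):
                if not any(hit is s for s in selected):   # union: no duplicates
                    selected.append(hit)
        current = selected
    return current

doc = ET.fromstring(
    "<catalog><cd><price>10.90</price></cd><cd><price>9.90</price></cd></catalog>"
)
# The relative path cd/price, evaluated with <catalog> as the context node.
result = evaluate([child_step("cd"), child_step("price")], doc)
texts = [p.text for p in result]
```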

Concepts

Data Model
• A formalism for internal representation of an XML document.
• Later on we'll see the more general data model for XPath 2.0 and XQuery 1.0.
• Applies after resolving entities and CDATA sections.
• The data model instance, conceptually, is the object to which XPath queries are applied.
• The data model is a tree made out of nodes and edges between them.

Kinds of Nodes: Root
• Unique root of the tree
  – Has comment children nodes, one per comment
  – Has processing instruction nodes, one per PI
  – One child node for the document element
  – No information regarding: XML declaration, DTD, whitespace before or after the document element
  – Has no parent node
  – Its value is that of the document element

Kinds of Nodes: Element
• Represents an element in the document
  – Has a namespace URI
  – Has a parent node and a list of child nodes
  – Children may be: other element nodes, comment nodes, PI nodes, text nodes
  – Has a list of attributes, which are not considered children
  – Has a list of namespaces, which are not considered children
  – Value = text, after entities are resolved, appearing between the start and end tags of the element, after PIs, comments and tags are removed
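The value of an element node, as described above, is the concatenation of all the text it contains once tags are stripped and entities are resolved. A minimal sketch with Python's standard `xml.etree.ElementTree`, using a made-up `<cd>` fragment as an assumption:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; &amp; is an entity reference that the parser resolves.
cd = ET.fromstring(
    "<cd><title>Rock &amp; Roll</title><artist>Bonnie Tyler</artist></cd>"
)

# itertext() walks the element and its descendants in document order and
# yields their text, which mirrors the XPath string-value of an element.
string_value = "".join(cd.itertext())
title_value = "".join(cd.find("title").itertext())
```

Note that no separator is inserted between the text of sibling elements.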

Kinds of Nodes: Attribute
• Represents an attribute in the document
  – Has a name
  – Has a namespace URI
  – Has a value = normalized attribute value
  – Has a parent node and NO child nodes
  – Is NOT considered a child of its parent
  – xmlns and xmlns:prefix attributes are NOT represented as attribute nodes

Kinds of Nodes: Text
• Represents a maximal run of contiguous text between tags, PIs and comments
  – Has a parent node
  – Has no child nodes
  – Value = text in node

Kinds of Nodes: Namespace
• Represents a namespace in whose scope the element lies
  – Has name = prefix
  – Has value = namespace URI
  – One xmlns or xmlns:prefix declaration may give rise to MULTIPLE namespace nodes
  – Has a parent node
  – Is NOT considered a child of its parent

Kinds of Nodes: PI
• Represents a PI
  – Has a target
  – Has name = target
  – Has data
  – Has value = data minus initial whitespace
  – Has a parent node
  – Has no children
  – IS considered a child of its parent

Kinds of Nodes: Comment
• Represents a comment
  – Has a parent node
  – Has value = string content of the comment without <!-- and -->
  – Has no children
  – IS considered a child of its parent

XPath Data Types
• Boolean: true, false
• Number: floating point
• String: sequence of characters
• Node-set: collection of nodes, no duplicates
• Document order: the order in which start tags appear (DFS on the tree)

Example File
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <catalog>
      <cd country="USA">
        <title>Empire Burlesque</title>
        <artist>Bob Dylan</artist>
        <price>10.90</price>
      </cd>
      <cd country="UK">
        <title>Hide your heart</title>
        <artist>Bonnie Tyler</artist>
        <musthave> yes</musthave>
        <price>9.90</price>
      </cd>
      <cd country="USA">
        <title>Greatest Hits</title>
        <artist>Dolly Parton</artist>
        <scratch> yes</scratch>
        <price>9.90</price>
      </cd>
    </catalog>

Navigation
• Generally the syntax is either of the form /location step/…/location step or location step/…/location step
• If the path starts with /, evaluation starts at the root node of the document (absolute path)
• Otherwise, it is a relative path that is evaluated starting from the context node
• With each step there is an associated set of context nodes
• For each node in this set the next step is evaluated
• The union of the resulting sets forms the next context set (how is this union done?)

Navigation Example
• Select all the price elements of all the cd elements of the catalog element:
• /catalog/cd/price
  – <price>10.90</price>
  – <price>9.90</price>
  – <price>9.90</price>
• In fact, in its full version this is:
• /child::catalog/child::cd/child::price

Navigation – Union
• Select all the price or title elements of all the cd elements of the catalog element:
• /catalog/cd/price | /catalog/cd/title
  – <title>Empire Burlesque</title> <price>10.90</price>
  – <title>Hide your heart</title> <price>9.90</price>
  – <title>Greatest Hits</title> <price>9.90</price>
• The result order is document order (always, for union)
• cd/price identifies the child price elements of the context node's cd children
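Python's standard `xml.etree.ElementTree` has no `|` operator, so a union in document order can be emulated by merging two result lists and sorting by position in a depth-first walk. The following is a sketch under that assumption, using the example file from the slides (without its XML declaration):

```python
import xml.etree.ElementTree as ET

EXAMPLE = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><musthave> yes</musthave><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><scratch> yes</scratch><price>9.90</price></cd>
</catalog>"""

doc = ET.fromstring(EXAMPLE)

# iter() visits elements depth-first, i.e. in document order.
order = {node: i for i, node in enumerate(doc.iter())}

prices = doc.findall("cd/price")      # /catalog/cd/price
titles = doc.findall("cd/title")      # /catalog/cd/title

# Union: drop duplicates (dict keys), then restore document order.
union = sorted(dict.fromkeys(prices + titles), key=lambda n: order[n])
labels = [n.text for n in union]
```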

Axes
• A location step is of the form
  – axis::nodetest [ ] … [ ], where each [ ] denotes a predicate; there may be zero or more predicates
• Axis can be: self, child, parent, descendant, descendant-or-self, ancestor, ancestor-or-self, following, following-sibling, preceding, preceding-sibling, attribute, namespace
• An axis is either forward (e.g., descendant) or backward (e.g., ancestor)

Axes (Cont.)
• Each axis has a principal node type
• When identifying nodes via * or via name, only nodes of the principal type are candidates
• The attribute axis has principal node type of Attribute
• The namespace axis has principal node type of Namespace
• All other axes have principal node type of Element

self
• Identifies the context node
• /catalog/cd/self::cd
• Same as: /catalog/cd

child
• Identifies the child nodes of the context node
• Default axis
• /catalog/cd
• Same as: /catalog/child::cd
• Same as: /child::catalog/child::cd

parent
• Identifies the parent node of the context node
• /catalog/cd/parent::catalog
• In this document it selects the same node as:
• /catalog/cd/..

descendant and descendant-or-self
• Identifies the descendant nodes of the context node
• /catalog/descendant::title
  – <title>Empire Burlesque</title>
  – <title>Hide your heart</title>
  – <title>Greatest Hits</title>
• /catalog/descendant-or-self::title returns the same result
• /catalog/descendant-or-self::catalog returns the catalog node

ancestor and ancestor-or-self
• Identifies the ancestor nodes of the context node
• /catalog/descendant::title/ancestor::cd returns the three cd nodes
• /catalog/descendant::title/ancestor::catalog returns the catalog node
• /catalog/descendant::title/ancestor-or-self::title returns the three title nodes
  – <title>Empire Burlesque</title>
  – <title>Hide your heart</title>
  – <title>Greatest Hits</title>
• /catalog/descendant::title/ancestor-or-self::cd returns the three cd nodes
• On ordering: ancestor and ancestor-or-self are reverse axes, so position() inside a predicate on such a step counts in reverse document order. The result itself is a node-set, which XPath 1.0 leaves unordered; implementations typically return it in document order.
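ElementTree elements keep no pointer to their parent, so the ancestor axis has to be emulated with a child-to-parent map. The sketch below assumes the example file from the slides; `ancestors` is a hypothetical helper, and the XPath root node (the parent of `<catalog>`) has no counterpart here.

```python
import xml.etree.ElementTree as ET

EXAMPLE = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><musthave> yes</musthave><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><scratch> yes</scratch><price>9.90</price></cd>
</catalog>"""

doc = ET.fromstring(EXAMPLE)
parent_of = {child: parent for parent in doc.iter() for child in parent}

def ancestors(node):
    """Ancestor elements of `node`, nearest first (the reverse-axis order)."""
    found = []
    while node in parent_of:
        node = parent_of[node]
        found.append(node)
    return found

first_title = doc.find("cd/title")
chain = [a.tag for a in ancestors(first_title)]

# /catalog/descendant::title/ancestor::cd, with duplicates removed.
cds = []
for title in doc.iter("title"):
    for a in ancestors(title):
        if a.tag == "cd" and a not in cds:
            cds.append(a)
```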

following
• Identifies the nodes, except for descendant nodes, attribute nodes and namespace nodes, which follow the context node in document order
• /catalog/descendant::scratch/following::* returns <price>9.90</price>
• /catalog/descendant::scratch/parent::cd/child::title/following::* returns
  – <artist>Dolly Parton</artist>
  – <scratch> yes</scratch>
  – <price>9.90</price>

following-sibling
• /catalog/cd/following-sibling::* returns
    <cd country="UK">
      <title>Hide your heart</title>
      <artist>Bonnie Tyler</artist>
      <musthave> yes</musthave>
      <price>9.90</price>
    </cd>
    <cd country="USA">
      <title>Greatest Hits</title>
      <artist>Dolly Parton</artist>
      <scratch> yes</scratch>
      <price>9.90</price>
    </cd>

preceding
• Identifies the nodes, except for ancestor nodes, attribute nodes and namespace nodes, which precede the context node in document order
• /catalog/descendant::musthave/preceding::* returns the following nodes, listed here in reverse document order (the direction of the preceding axis)
  – <artist>Bonnie Tyler</artist>
  – <title>Hide your heart</title>
  – <price>10.90</price>
  – <artist>Bob Dylan</artist>
  – <title>Empire Burlesque</title>
  – <cd country="USA"> <title>Empire Burlesque</title> <artist>Bob Dylan</artist> <price>10.90</price> </cd>
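The definitions of following and preceding can be checked against the example file with a short sketch. It assumes Python's standard `xml.etree.ElementTree`, which models elements only, so it corresponds to `following::*` and `preceding::*`; the helper names `following` and `preceding` are hypothetical.

```python
import xml.etree.ElementTree as ET

EXAMPLE = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><musthave> yes</musthave><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><scratch> yes</scratch><price>9.90</price></cd>
</catalog>"""

doc = ET.fromstring(EXAMPLE)
in_order = list(doc.iter())                        # elements in document order
parent_of = {c: p for p in in_order for c in p}

def following(node):
    """Elements after `node` in document order, minus its descendants."""
    descendants = set(node.iter())                 # includes node itself
    start = in_order.index(node) + 1
    return [n for n in in_order[start:] if n not in descendants]

def preceding(node):
    """Elements before `node` in document order, minus its ancestors."""
    ancestors = set()
    cur = node
    while cur in parent_of:
        cur = parent_of[cur]
        ancestors.add(cur)
    return [n for n in in_order[:in_order.index(node)] if n not in ancestors]

scratch = doc.find("cd/scratch")
musthave = doc.find("cd/musthave")
after_scratch = [n.tag for n in following(scratch)]
# Reversed to match the reverse-document-order listing used for this axis.
before_musthave = [n.tag for n in reversed(preceding(musthave))]
```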

preceding-sibling
• /catalog/descendant::musthave/preceding-sibling::*
  – <artist>Bonnie Tyler</artist>
  – <title>Hide your heart</title>

attribute
• Identifies the attributes of the context node.
• /catalog/cd/attribute::* returns
  – country="USA"
  – country="UK"
  – country="USA"

namespace • Identifies the namespace nodes of the context node.

Node Tests • Node Test by name • Node Test by type

Node Test by name
• Need to establish namespace bindings for the XPath processor (various possibilities)
• If prefix:local-name is used, then a matching node must have the same namespace as that bound to the prefix
• If a name test does not include a prefix, the identified nodes should belong to no namespace (no defaults here)

Node Test by name - Examples
• Suppose prefix j is bound to namespace urn:eorg:invoice
• Then, child::j:item identifies child item element nodes of the context node in the namespace urn:eorg:invoice
• child::j:* identifies child element nodes of the context node in the namespace urn:eorg:invoice
• /child::catalog identifies child catalog element nodes of the root that belong to no namespace
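One way to establish the namespace bindings is to pass a prefix-to-URI mapping to the processor. The sketch below uses Python's standard `xml.etree.ElementTree`, whose `findall` accepts such a mapping; the `INVOICE` document is invented for illustration and deliberately uses a different prefix (`inv`) from the one used in the query (`j`).

```python
import xml.etree.ElementTree as ET

# Hypothetical invoice document; only the namespace URI comes from the slide.
INVOICE = """<inv:invoice xmlns:inv="urn:eorg:invoice">
  <inv:item>pen</inv:item>
  <inv:item>ink</inv:item>
  <item>unqualified</item>
</inv:invoice>"""

invoice = ET.fromstring(INVOICE)

# Bind prefix j for the query; the prefix used in the document is irrelevant.
bindings = {"j": "urn:eorg:invoice"}

qualified = invoice.findall("j:item", bindings)   # child::j:item
unqualified = invoice.findall("item")             # child::item, no namespace

qualified_texts = [e.text for e in qualified]
unqualified_texts = [e.text for e in unqualified]
```

The match is made on the namespace URI, which is why `j:item` finds elements written as `inv:item`.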

Node Test by type
• text() : node is a text node
• comment() : node is a comment node
• processing-instruction(target?) : node is a processing instruction node, optionally with the given target
• node() : any node

Node Test by type - Examples
• child::node() identifies all child nodes of the context node regardless of type
• //scratch/child::text() returns text node yes
• //scratch/text() also returns text node yes
• //cd/price/text() returns 3 text nodes: 10.90, 9.90, 9.90
• /catalog/comment() identifies comment child nodes of the root's catalog child element
• /processing-instruction('xsl-stylesheet') identifies processing instruction child nodes of the document node that have target equal to xsl-stylesheet

Shorthand notation
    Long form                       Short form
    child::                         (nothing; child is the default axis)
    attribute::                     @
    self::node()                    .
    parent::node()                  ..
    /descendant-or-self::node()/    //
    [position() = number]           [number]

Shorthand notation - Examples
    Long form                                               Short form
    /child::catalog/child::cd                               /catalog/cd
    /child::catalog/attribute::country                      /catalog/@country
    self::node()/descendant-or-self::node()/child::title    .//title (how about //title ?)
    /descendant-or-self::node()/scratch/parent::node()      //scratch/..
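Several of the short forms are understood by Python's standard `xml.etree.ElementTree`, which makes them easy to try out. This is a sketch on the example file; ElementTree does not accept the long axis syntax, and attributes are read with `get` rather than selected as nodes.

```python
import xml.etree.ElementTree as ET

EXAMPLE = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><musthave> yes</musthave><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><scratch> yes</scratch><price>9.90</price></cd>
</catalog>"""

doc = ET.fromstring(EXAMPLE)

titles = [t.text for t in doc.findall(".//title")]    # .//title
scratch_parents = doc.findall(".//scratch/..")        # //scratch/..
parent_titles = [p.findtext("title") for p in scratch_parents]

# Counterpart of /catalog/cd/@country: read the attribute from each cd.
countries = [cd.get("country") for cd in doc.findall("cd")]
```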

Predicates
• Zero or more predicates appear, each in square brackets, following the node test
• A predicate may contain any expression; the result is coerced to Boolean
• Each predicate is applied to each of the resulting nodes after the node test
• If any evaluates to false, the node is eliminated
• Otherwise, all tests are true, and the node stays as a member of the node set

Predicates - Examples
    <catalog>
      <cd country="USA">
        <title>Empire Burlesque</title>
        <artist>Bob Dylan</artist>
        <price>10.90</price>
      </cd>
      ….
• /catalog/cd[artist = "Bob Dylan"] returns
    <cd country="USA">
      <title>Empire Burlesque</title>
      <artist>Bob Dylan</artist>
      <price>10.90</price>
    </cd>

Predicates - Examples
• /catalog/cd[position() = 1] returns
    <cd country="USA">
      <title>Empire Burlesque</title>
      <artist>Bob Dylan</artist>
      <price>10.90</price>
    </cd>
• //cd[price > 9.90] returns the same
• /catalog/cd[1] returns the same
• /catalog[cd/scratch]/cd[2] returns the second cd element; the predicate is satisfied because of the 3rd cd element
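A few of these predicates fall inside the XPath subset supported by Python's standard `xml.etree.ElementTree` and can be run directly on the example file. This is a sketch; numeric comparisons such as `price > 9.90` and nested paths inside predicates are outside that subset.

```python
import xml.etree.ElementTree as ET

EXAMPLE = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><musthave> yes</musthave><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><scratch> yes</scratch><price>9.90</price></cd>
</catalog>"""

doc = ET.fromstring(EXAMPLE)

def titles(cds):
    """Titles of the given cd elements, for readable results."""
    return [cd.findtext("title") for cd in cds]

by_artist = titles(doc.findall("cd[artist='Bob Dylan']"))   # /catalog/cd[artist = "Bob Dylan"]
first = titles(doc.findall("cd[1]"))                        # /catalog/cd[1]
with_scratch = titles(doc.findall("cd[scratch]"))           # /catalog/cd[scratch]
from_usa = titles(doc.findall("cd[@country='USA']"))        # /catalog/cd[@country = 'USA']
```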

Expressions
• Boolean expressions
• Equality expressions
• Relational expressions
• Numerical expressions

Boolean Expressions
• The operators are and and or; negation is written with the function not()
• Each operand is evaluated and converted to boolean (similar to applying boolean())
• /catalog/cd/scratch or /catalog/@country returns true
• /catalog/cd[scratch and price] returns
    <cd country="USA">
      <title>Greatest Hits</title>
      <artist>Dolly Parton</artist>
      <scratch> yes</scratch>
      <price>9.90</price>
    </cd>

Boolean Expressions
• /catalog/cd[not(scratch and price)] returns
    <cd country="USA">
      <title>Empire Burlesque</title>
      <artist>Bob Dylan</artist>
      <price>10.90</price>
    </cd>
    <cd country="UK">
      <title>Hide your heart</title>
      <artist>Bonnie Tyler</artist>
      <musthave> yes</musthave>
      <price>9.90</price>
    </cd>
• /catalog/cd[(price < 9) or (price > 11)] returns an empty node set

Equality expressions: =, !=
• Equality between objects holds when they are equal
• Equality between node sets holds when there are elements in each that have the same string value, so there is an implicit existential quantifier
• Inequality between node sets holds when there are elements in each that have different string values
• So, two node sets may be equal and unequal at the same time
• When a node set is compared to a number (resp., string), the string value of each node is converted to a number (resp., used as a string); when it is compared to a boolean, the node set as a whole is converted with boolean()

Equality expressions: Examples
• price = 9.90 is true if at least one child price element has a string value that, converted to a number, equals 9.90
• price != 9.90 is true if at least one child price element has a string value that, converted to a number, does not equal 9.90. What if there are no price children?
• not(price = 9.90) is true if no child price element has a string value that, converted to a number, equals 9.90. What if there are no price children?

Equality expressions: Examples
• not(price != 9.90) is true if no child price element has a string value that, converted to a number, is unequal to 9.90. In other words, every price child has a string value that converts to 9.90. What if there are no price children?
• @country = 'USA' is true for elements that have the value USA for their country attribute

Equality expressions: Examples
• //cd[scratch = " yes"] returns
    <cd country="USA">
      <title>Greatest Hits</title>
      <artist>Dolly Parton</artist>
      <scratch> yes</scratch>
      <price>9.90</price>
    </cd>
• //cd[scratch != " yes"] returns an empty node set ("there exists a scratch child of cd whose string value is different from ' yes'")
• //cd[not(scratch = " yes")] returns ("it is not the case that there exists a scratch child of cd whose string value is ' yes'")
    <cd country="USA">
      <title>Empire Burlesque</title>
      <artist>Bob Dylan</artist>
      <price>10.90</price>
    </cd>
    <cd country="UK">
      <title>Hide your heart</title>
      <artist>Bonnie Tyler</artist>
      <musthave> yes</musthave>
      <price>9.90</price>
    </cd>

Equality expressions: Examples
• //catalog[cd[not(price = 9.90)]] returns
    <catalog>
      <cd country="USA">
        <title>Empire Burlesque</title>
        <artist>Bob Dylan</artist>
        <price>10.90</price>
      </cd>
      <cd country="UK">
        <title>Hide your heart</title>
        <artist>Bonnie Tyler</artist>
        <musthave> yes</musthave>
        <price>9.90</price>
      </cd>
      <cd country="USA">
        <title>Greatest Hits</title>
        <artist>Dolly Parton</artist>
        <scratch> yes</scratch>
        <price>9.90</price>
      </cd>
    </catalog>
• //catalog[not(cd[price = 9.90])] returns an empty node set

Equality expressions: Examples
• //cd[not(not(scratch))] returns
    <cd country="USA">
      <title>Greatest Hits</title>
      <artist>Dolly Parton</artist>
      <scratch> yes</scratch>
      <price>9.90</price>
    </cd>
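The existential reading of = and != on node sets, including the empty-node-set case raised in the earlier examples, can be modelled in a few lines of Python. This is a sketch under assumptions: a node set is represented as a list of string values, `node_set_eq` and `node_set_ne` are hypothetical names, and Python's `float` stands in for XPath's number() conversion (it accepts a few forms, such as exponents, that XPath does not).

```python
import math

def as_number(text):
    """Approximate XPath number(): unparsable strings become NaN."""
    try:
        return float(text)
    except ValueError:
        return math.nan

def node_set_eq(string_values, number):
    """node-set = number: true if SOME node's value equals the number."""
    return any(as_number(v) == number for v in string_values)

def node_set_ne(string_values, number):
    """node-set != number: true if SOME node's value differs from the number."""
    return any(as_number(v) != number for v in string_values)

prices = ["10.90", "9.90"]       # string values of two price children
no_prices = []                   # an element without price children

both_hold = node_set_eq(prices, 9.90) and node_set_ne(prices, 9.90)

# With no price children both comparisons are false,
# so not(price = 9.90) and not(price != 9.90) are both true.
empty_eq = node_set_eq(no_prices, 9.90)
empty_ne = node_set_ne(no_prices, 9.90)
```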

Coercions
• If neither operand is a node set and the objects have different types, coercion of the lower precedence object to the higher precedence object is performed
• Order: boolean > number > string
• true() = 'joe' is true as 'joe' is coerced into true
• true() != 1.50 is false as 1.50 is coerced into true
• "1.56" = 1.56 is true as "1.56" is coerced into 1.56
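The precedence rule can be written out as a small comparison function. The sketch below is a simplified model, not a full implementation: XPath booleans, numbers and strings are represented by Python `bool`, `float` and `str` (numbers must be written as floats), and the helper names are hypothetical.

```python
import math

def to_boolean(value):
    """XPath boolean(): false for 0, NaN and the empty string."""
    if isinstance(value, bool):
        return value
    if isinstance(value, float):
        return value != 0 and not math.isnan(value)
    return len(value) > 0

def to_number(value):
    """XPath number(), approximated with float(); bad strings give NaN."""
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, float):
        return value
    try:
        return float(value.strip())
    except ValueError:
        return math.nan

def equals(a, b):
    """a = b for non-node-set operands: boolean > number > string."""
    if isinstance(a, bool) or isinstance(b, bool):
        return to_boolean(a) == to_boolean(b)
    if isinstance(a, float) or isinstance(b, float):
        return to_number(a) == to_number(b)
    return a == b

case_joe = equals(True, "joe")           # true() = 'joe'
case_150 = not equals(True, 1.50)        # true() != 1.50
case_156 = equals("1.56", 1.56)          # "1.56" = 1.56
```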

Relational Expressions
• <, <=, >, >=
• Both operands are converted into numbers
• Existential semantics as for equality
• price >= 50 is true if there is a child price element whose string value, converted to a number, is greater than or equal to 50
• price < preceding::price is true if there is a child price element whose value is smaller than the value of some preceding price element. What if there is no preceding price element?

Relational Expressions
• //cd[price < preceding::price] returns
    <cd country="UK">
      <title>Hide your heart</title>
      <artist>Bonnie Tyler</artist>
      <musthave> yes</musthave>
      <price>9.90</price>
    </cd>
    <cd country="USA">
      <title>Greatest Hits</title>
      <artist>Dolly Parton</artist>
      <scratch> yes</scratch>
      <price>9.90</price>
    </cd>

Numerical Expressions
• Increasing precedence: + and -, then *, div and mod, then unary -
• Each operand is coerced into a number
• 5 + 7 * 2 yields 19
• 5 + 7 * 2 = 19.0 yields true
• 5 mod 2 yields 1
• [price mod 2 = 1] is true for odd (integer) prices
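The arithmetic examples can be reproduced in Python, with one caveat: XPath's mod is the remainder of a truncating division, which matches `math.fmod` rather than Python's `%` operator for negative operands. A small sketch, with `xpath_mod` and `is_odd_price` as hypothetical helper names:

```python
import math

def xpath_mod(a, b):
    """XPath mod: remainder of truncating division (sign follows a)."""
    return math.fmod(a, b)

precedence = 5 + 7 * 2                  # * binds tighter than +
equal_as_numbers = (5 + 7 * 2 == 19.0)  # numbers compare by value

five_mod_two = xpath_mod(5, 2)
negative_case = xpath_mod(-5, 2)        # differs from Python's -5 % 2

def is_odd_price(price_text):
    """The predicate [price mod 2 = 1] applied to one price string."""
    return xpath_mod(float(price_text), 2) == 1
```

For example, `is_odd_price("9")` holds while `is_odd_price("9.90")` does not, since 9.90 mod 2 is about 1.9.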

Functions
• Node-test functions (called node-set functions in the XPath 1.0 specification): id, lang, last, local-name, namespace-uri, position
• Boolean functions: boolean, false, not, true
• Numerical functions: ceiling, count, floor, number, round, sum
• String functions: concat, contains, normalize-space, starts-with, string-length, substring, substring-after, substring-before, translate

Node-test functions
• id('101') returns the unique element with id 101
• id('101 102') returns the unique elements with ids 101 or 102
• When applied to a node set, each node is converted to its string value and then id is applied to each string value

Node-test functions
count()
  – Description: number of nodes in a node-set
  – Signature: number count(node-set)
  – Example: invoice[count(item) > 5]
id()
  – Description: selects elements by their unique ID
  – Signature: node-set id(value)
  – Example: id(book/@similarbook)
last()
  – Description: position number of the last node in the node sequence, i.e. the size of the context set as a whole
  – Signature: number last()
  – Example: invoice/item[last() > 3]
local-name()
  – Description: the local part of a node name (prefix:local-name)
  – Signature: string local-name(node)
name()
  – Description: the QName of a node
  – Signature: string name(node)
namespace-uri()
  – Description: the namespace URI of a specified node
  – Signature: string namespace-uri(node)
position()
  – Description: the position of the node in the node sequence
  – Signature: number position()
  – Example: /catalog/cd/node()[last() = 3][self::title][last() = 1]

String functions
concat()
  – Description: the concatenation of all its arguments
  – Signature: string concat(val1, val2, ...)
  – Example: concat('The', ' ', 'XML') = 'The XML'
  – Example: /catalog/cd[concat(title, '. ', artist) = "Hide your heart. Bonnie Tyler"]
contains()
  – Description: true if the second string is contained within the first string
  – Signature: boolean contains(val, substr)
  – Example: contains('XML', 'X') = true
normalize-space()
  – Description: removes leading and trailing spaces from a string and collapses internal runs of whitespace into a single space
  – Signature: string normalize-space(string)
  – Example: normalize-space(' The XML ') = 'The XML'

String functions
starts-with()
  – Description: true if the first string starts with the second string
  – Signature: boolean starts-with(string, substr)
  – Example: starts-with('XML', 'X') = true
string()
  – Description: converts the argument to a string
  – Signature: string string(value)
  – Example: string(128) = '128'
string-length()
  – Description: the number of characters in a string
  – Signature: number string-length(string)
  – Example: string-length('Israel') = 6
substring()
  – Description: the part of the string argument specified by start and length
  – Signature: string substring(string, start, length)
  – Example: substring('Beatles', 1, 4) = 'Beat'

String functions
substring-after()
  – Description: the part of the string argument that occurs after the substr argument
  – Signature: string substring-after(string, substr)
  – Example: substring-after('12/10', '/') = '10'
substring-before()
  – Description: the part of the string argument that occurs before the substr argument
  – Signature: string substring-before(string, substr)
  – Example: substring-before('12/10', '/') = '12'

String functions
translate()
  – Description: character-by-character replacement; each character of the value argument that occurs in string1 is replaced by the character in the same position in string2
  – Signature: string translate(value, string1, string2)
  – Example: translate('12:30', '30', '45') = '12:45'
  – Example: translate('12:30', '03', '54') = '12:45'
  – Example: translate('12:30', '0123', 'abcd') = 'bc:da'
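The behaviour of a few string functions can be mirrored in Python to check the examples. These are sketches written from the XPath 1.0 descriptions, not library calls: the function names are hypothetical, and the rule that characters of string1 with no counterpart in string2 are removed comes from the XPath 1.0 specification rather than from the slide.

```python
import re

def translate(value, string1, string2):
    """Replace each char of `value` found in string1 by its mate in string2."""
    out = []
    for ch in value:
        i = string1.find(ch)              # the first occurrence decides
        if i == -1:
            out.append(ch)                # not listed: keep unchanged
        elif i < len(string2):
            out.append(string2[i])        # listed with a mate: replace
        # listed without a mate: the character is dropped
    return "".join(out)

def substring_before(string, substr):
    """Part of `string` before the first `substr`, or '' if absent."""
    if substr == "":
        return ""
    head, sep, _ = string.partition(substr)
    return head if sep else ""

def substring_after(string, substr):
    """Part of `string` after the first `substr`, or '' if absent."""
    if substr == "":
        return string
    _, sep, tail = string.partition(substr)
    return tail if sep else ""

def normalize_space(string):
    """Trim and collapse runs of space, tab, CR and LF to one space."""
    return re.sub(r"[ \t\r\n]+", " ", string).strip(" ")
```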

Boolean functions
boolean()
  – Description: converts the value argument to Boolean and returns true or false
  – Signature: boolean boolean(value)
false()
  – Example: number(false()) = 0
lang()
  – Description: true if the language argument matches the language of the context node, as specified by xml:lang attributes
  – Signature: boolean lang(language)

Boolean functions
not()
  – Description: true if the condition argument is false
  – Signature: boolean not(condition)
  – Example: not(false())
true()
  – Example: number(true()) = 1

XPath 2.0 Data Model
• A tree with the following node types
  – Document (root), element, attribute, text, namespace, processing instruction, and comment
• Document node at the root
• Various accessors are used to characterize nodes
• See http://www.w3.org/TR/xpath-datamodel/ which covers: XQuery 1.0 and XPath 2.0 Data Model, W3C Working Draft 02 May 2003

Accessors
• Accessors are defined on Nodes.
• Some accessors may return a constant empty sequence on certain node kinds.
• There are additional accessors that we do not cover.
• Accessors are descriptions of the interface that an implementation of the data model must expose to applications.

Accessors
• dm:base-uri($n as Node) as xs:anyURI?
• dm:node-kind($n as Node) as xs:string
• dm:node-name($n as Node) as xs:QName?
• dm:parent($n as Node) as Node?
• dm:string-value($n as Node) as xs:string
• dm:typed-value($n as Node) as xdt:anyAtomicType*
• dm:type($n as Node) as xs:QName?
• dm:children($n as Node) as Node*
• dm:attributes($n as Node) as AttributeNode*
• dm:namespaces($n as Node) as NamespaceNode*
• dm:nilled($n as Node) as xs:boolean

Data Model File Example
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="dm-example.xsl"?>
    <catalog xmlns="http://www.example.com/catalog"
             xmlns:html="http://www.w3.org/1999/xhtml"
             xmlns:xlink="http://www.w3.org/1999/xlink"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.example.com/catalog dm-example.xsd"
             version="0.1">

More Data
    <tshirt code="T1534017"
            label=" Staind : Been Awhile "
            xlink:href="http://example.com/0,,1655091,00.html"
            sizes="M L XL">
      <title> Staind: Been Awhile Tee Black (1-sided) </title>
      <description>
        <html:p>
          Lyrics from the hit song 'It's Been Awhile'
          are shown in white, beneath the large
          'Flock & Weld' Staind logo. A very unique
          logo that looks as cool as it feels!
        </html:p>
      </description>
      <price> 25.00 </price>
    </tshirt>

Cont.
    <album code="A1481344"
           label=" Staind : Its Been A While "
           formats="CD">
      <title> It's Been A While </title>
      <description xsi:nil="true" />
      <price currency="USD"> 10.99 </price>
      <artist> Staind </artist>
    </album>
    </catalog>

Data Model Nodes
• // Document node D1
• dm:base-uri(D1) = xs:anyURI("http://www.example.com/catalog.xml")
• dm:string-value(D1) = " Staind: Been Awhile Tee Black (1-sided)\n Lyrics from the hit song 'It's Been Awhile'\n are shown in white, beneath the large\n 'Flock & Weld' Staind logo. A very unique\n logo that looks as cool as it feels!\n 25.00 It's Been A While 10.99 Staind "
• dm:children(D1) = ([E1])

Data Model Nodes: Namespace Nodes
• // Namespace node N1
• dm:node-kind(N1) = "namespace"
• dm:node-name(N1) = xs:QName("", "xml")
• dm:string-value(N1) = "http://www.w3.org/XML/1998/namespace"
• Similarly for N2, N3, N4 and N5.

Data Model Nodes: Processing Instruction Nodes
• // Processing Instruction node P1
• dm:base-uri(P1) = xs:anyURI("http://www.example.com/catalog.xml")
• dm:node-kind(P1) = "processing-instruction"
• dm:node-name(P1) = xs:QName("", "xml-stylesheet")
• dm:string-value(P1) = "type="text/xsl" href="dm-example.xsl""
• dm:parent(P1) = ([D1])

Data Model Nodes: Element Nodes
• // Element node E1
• dm:base-uri(E1) = xs:anyURI("http://www.example.com/catalog.xml")
• dm:node-kind(E1) = "element"
• dm:node-name(E1) = xs:QName("http://www.example.com/catalog", "catalog")
• dm:string-value(E1) = " Staind: Been Awhile Tee Black (1-sided)\n Lyrics from the hit song 'It's Been Awhile'\n are shown in white, beneath the large\n 'Flock & Weld' Staind logo. A very unique\n logo that looks as cool as it feels!\n 25.00 It's Been A While 10.99 Staind "
• dm:typed-value(E1) = fn:error() // xs:anyType because of the anonymous type definition
• dm:type(E1) = xs:anyType
• dm:parent(E1) = ([D1])
• dm:children(E1) = ([E2], [E7])
• dm:attributes(E1) = ([A1], [A2])
• dm:namespaces(E1) = ([N1], [N2], [N3], [N4], [N5])

Data Model Nodes: Attribute Nodes
// Attribute node A1
dm:node-kind(A1) = "attribute"
dm:node-name(A1) = xs:QName("http://www.w3.org/2001/XMLSchema-instance", "xsi:schemaLocation")
dm:string-value(A1) = "http://www.example.com/catalog dm-example.xsd"
dm:typed-value(A1) = (xs:anyURI("http://www.example.com/catalog"), xs:anyURI("catalog.xsd"))
dm:type(A1) = xs:anySimpleType
dm:parent(A1) = ([E1])

Summary of (Some) Accessors
• We cover some of the accessors; the rest are summarized at http://www.w3.org/TR/xpath-datamodel/

dm:base-uri
On node type | Returns
Documents | The value of the base-uri property
Elements | The value of the base-uri property or its parent's base URI
Attributes | ()
Namespaces | ()
Processing Instructions | The value of the base-uri property or its parent's base URI
Comments | The base URI of its parent
Text | The base URI of its parent

dm:node-kind
On node type | Returns
Documents | "document"
Elements | "element"
Attributes | "attribute"
Namespaces | "namespace"
Processing Instructions | "processing-instruction"
Comments | "comment"
Text | "text"
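
The node-kind table can be approximated on top of Python's standard xml.dom.minidom module, as in the sketch below. The helper name node_kind and the sample document are assumptions made for this example, and minidom implements the DOM rather than the XPath 2.0 data model: the DOM has no namespace node type, and it distinguishes CDATA sections, which the data model treats as ordinary text.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# DOM node-type constants mapped to dm:node-kind strings.
# Namespace nodes have no DOM node type of their own, so they are absent here.
NODE_KIND = {
    Node.DOCUMENT_NODE: "document",
    Node.ELEMENT_NODE: "element",
    Node.ATTRIBUTE_NODE: "attribute",
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction",
    Node.COMMENT_NODE: "comment",
    Node.TEXT_NODE: "text",
    Node.CDATA_SECTION_NODE: "text",  # the data model has no separate CDATA kind
}


def node_kind(node):
    """Hypothetical helper: approximate dm:node-kind for a minidom node."""
    return NODE_KIND[node.nodeType]


doc = parseString(
    '<?xml-stylesheet type="text/xsl" href="dm-example.xsl"?>'
    '<catalog version="0.1"><!-- note -->text</catalog>'
)
catalog = doc.documentElement
kinds = [
    node_kind(doc),                                  # the document itself
    node_kind(doc.firstChild),                       # the xml-stylesheet instruction
    node_kind(catalog),                              # the catalog element
    node_kind(catalog.getAttributeNode("version")),  # its version attribute
    node_kind(catalog.childNodes[0]),                # the comment
    node_kind(catalog.childNodes[1]),                # the text
]
print(kinds)
```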

dm:node-name
On node type | Returns
Documents | ()
Elements | The xs:QName of the element
Attributes | The xs:QName of the attribute
Namespaces | An xs:QName with the namespace prefix in the local-name and an empty URI
Processing Instructions | An xs:QName with the processing-instruction target in the local-name and an empty URI
Comments | ()
Text | ()
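
A minimal sketch of the node-name rules, again using xml.dom.minidom as a stand-in for a data model implementation. The helper node_name is hypothetical; it represents an xs:QName as a (namespace URI, local name) pair and the empty sequence as None, both of which are assumptions of this example. Namespace nodes are left out because minidom exposes namespace declarations as ordinary attributes.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def node_name(node):
    """Hypothetical helper: approximate dm:node-name as a (namespace URI, local name) pair."""
    if node.nodeType in (Node.ELEMENT_NODE, Node.ATTRIBUTE_NODE):
        # minidom reports "no namespace" as None; use "" for the empty URI.
        return (node.namespaceURI or "", node.localName)
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
        # The target goes in the local name; the URI part is empty.
        return ("", node.target)
    return None  # documents, comments and text nodes have no name


doc = parseString(
    '<?xml-stylesheet type="text/xsl" href="dm-example.xsl"?>'
    '<catalog xmlns="http://www.example.com/catalog" version="0.1">x</catalog>'
)
catalog = doc.documentElement

print(node_name(doc))
print(node_name(doc.firstChild))
print(node_name(catalog))
print(node_name(catalog.getAttributeNode("version")))
```

Note that the unprefixed version attribute is in no namespace even though the element declares a default namespace.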

dm:parent
On node type | Returns
Documents | ()
Elements | The parent element or document node
Attributes | The parent element node
Namespaces | The parent element node
Processing Instructions | The parent element or document node
Comments | The parent element or document node
Text | The parent element or document node
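
The parent rules differ from the DOM in one respect worth seeing in code: in the data model an attribute's parent is its element, whereas a DOM attribute has no parentNode. The sketch below bridges that gap with xml.dom.minidom. The helper name parent and the sample document are assumptions of this example, and None stands in for the empty sequence ().

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def parent(node):
    """Hypothetical helper: approximate dm:parent; None stands for the empty sequence."""
    if node.nodeType == Node.ATTRIBUTE_NODE:
        # In the DOM an attribute has no parentNode; its element is ownerElement.
        return node.ownerElement
    return node.parentNode


doc = parseString('<catalog version="0.1"><album>Staind</album></catalog>')
catalog = doc.documentElement
album = catalog.firstChild
version = catalog.getAttributeNode("version")

print(parent(doc))                   # None: a document node has no parent
print(parent(catalog) is doc)        # the root element's parent is the document
print(parent(album) is catalog)      # an inner element's parent is an element
print(parent(version) is catalog)    # an attribute's parent is its element
```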

dm:string-value
On node type | Returns
Documents | The concatenation of the string-values of all the text node descendants of the document in document order
Elements | The concatenation of the string-values of all the text node descendants of the element in document order
Attributes | The value of the attribute
Namespaces | The namespace name (URI) of the node
Processing Instructions | The content of the processing-instruction
Comments | The content of the comment
Text | The text content
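
The string-value rules above can be sketched as a short recursive function over xml.dom.minidom nodes. The helper name string_value and the sample document are assumptions of this example, and namespace nodes are omitted because the DOM does not model them. The point to notice is that comments and processing instructions inside an element contribute nothing to that element's string-value.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

TEXT_KINDS = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)


def string_value(node):
    """Hypothetical helper: approximate dm:string-value for a minidom node."""
    if node.nodeType in (Node.DOCUMENT_NODE, Node.ELEMENT_NODE):
        # Concatenate all text node descendants in document order.
        parts = []
        for child in node.childNodes:
            if child.nodeType in TEXT_KINDS:
                parts.append(child.data)
            elif child.nodeType == Node.ELEMENT_NODE:
                parts.append(string_value(child))
            # Comments and processing instructions are skipped.
        return "".join(parts)
    if node.nodeType == Node.ATTRIBUTE_NODE:
        return node.value
    # Text, comment and processing-instruction nodes keep their content in .data.
    return node.data


doc = parseString(
    '<album code="A1481344"><title>It\'s Been A While</title>'
    '<!-- no description --><price>10.99</price><artist>Staind</artist></album>'
)
album = doc.documentElement

print(string_value(doc))
print(string_value(album.getAttributeNode("code")))
print(repr(string_value(album.childNodes[1])))
```

Because this sample has no whitespace between elements, the element values run together; in the indented catalog file the inter-element whitespace would appear in the result as well.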