Processing of structured documents Part 3 XML Schema

XML Schema (continued)
- Building content models
- A simplified view of the allowed structure of a complex type:
  - complexType -> annotation?, (simpleContent | complexContent | ((all | choice | sequence | group)?, attrDecls))

Nested choice and sequence groups

    <xsd:complexType name="PurchaseOrderType">
      <xsd:sequence>
        <xsd:choice>
          <xsd:group ref="shipAndBill" />
          <xsd:element name="singleUSAddress" type="USAddress" />
        </xsd:choice>
        <xsd:element name="items" type="Items" />
      </xsd:sequence>
    </xsd:complexType>

Nested choice and sequence groups (continued)

    <xsd:group name="shipAndBill">
      <xsd:sequence>
        <xsd:element name="shipTo" type="USAddress" />
        <xsd:element name="billTo" type="USAddress" />
      </xsd:sequence>
    </xsd:group>
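
One way to read the nested groups above is as a regular expression over child element names. The Python sketch below checks a list of child names against the PurchaseOrderType content model using only the standard `re` module. The helper name `matches_purchase_order` is hypothetical, and a real schema validator also checks types, attributes and namespaces, none of which is modelled here.

```python
import re

# Content model of PurchaseOrderType from the slides:
#   sequence( choice( group shipAndBill | singleUSAddress ), items )
# where the group shipAndBill is sequence( shipTo, billTo ).
# Every child name is followed by one space so names cannot run together.
PURCHASE_ORDER_MODEL = re.compile(
    r"(?:(?:shipTo billTo )|(?:singleUSAddress ))items "
)

def matches_purchase_order(child_names):
    """Return True if the ordered child names satisfy the content model."""
    flattened = "".join(name + " " for name in child_names)
    return PURCHASE_ORDER_MODEL.fullmatch(flattened) is not None
```

For example, `matches_purchase_order(["singleUSAddress", "items"])` is accepted, while swapping shipTo and billTo is rejected because a sequence group fixes the order.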

An 'all' group
- An all group: all the elements in the group may appear once or not at all, and they may appear in any order
  - minOccurs and maxOccurs can only be 0 or 1
- It is limited to the top level of any content model
- It has to be the only child at the top
- The group's children must all be individual elements (no groups), and no element in the content model may appear more than once

An 'all' group (example)

    <xsd:complexType name="PurchaseOrderType">
      <xsd:all>
        <xsd:element name="shipTo" type="USAddress" />
        <xsd:element name="billTo" type="USAddress" />
        <xsd:element ref="comment" minOccurs="0" />
        <xsd:element name="items" type="Items" />
      </xsd:all>
      <xsd:attribute name="orderDate" type="xsd:date" />
    </xsd:complexType>
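
The rules of the 'all' group can be sketched as a small check on child names: any order, each element at most once, required elements present. This is an illustrative Python sketch, not a validator. The names `ALL_GROUP` and `matches_all_group` are hypothetical, and only the element names from the example above are modelled.

```python
from collections import Counter

# Children of the 'all' group in the example: name -> required?
# 'comment' carries minOccurs="0", so it is optional.
ALL_GROUP = {"shipTo": True, "billTo": True, "comment": False, "items": True}

def matches_all_group(child_names, group=ALL_GROUP):
    """Check an 'all' group: any order, each element at most once,
    required elements present, no undeclared elements."""
    counts = Counter(child_names)
    if any(name not in group for name in counts):
        return False                      # undeclared element
    if any(n > 1 for n in counts.values()):
        return False                      # maxOccurs is at most 1
    return all(counts[name] == 1
               for name, required in group.items() if required)
```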

Occurrence constraints
- Groups represented by 'group', 'choice', 'sequence' and 'all' may carry minOccurs and maxOccurs attributes
- By combining and nesting the various groups, and by setting the values of minOccurs and maxOccurs, it is possible to represent any content model expressible with an XML 1.0 DTD
  - The 'all' group provides additional expressive power

Attribute groups
- Attribute definitions can also be grouped and named

    <xsd:element name="item">
      <xsd:complexType>
        <xsd:sequence> … </xsd:sequence>
        <xsd:attributeGroup ref="ItemDelivery" />
      </xsd:complexType>
    </xsd:element>

    <xsd:attributeGroup name="ItemDelivery">
      <xsd:attribute name="partNum" type="SKU" />
      …
    </xsd:attributeGroup>

XML Path Language (XPath)
- The ability to navigate through XML documents is needed in many applications of XML:
  - querying of XML documents
  - creation of hypertext links to objects that do not have unique identifiers
  - formatting of document components for presentation

XML Path Language (XPath)
- XPath provides
  - a common syntax and semantics to address parts of an XML document
  - basic facilities for manipulation of strings, numbers and booleans
- XPath uses a compact, non-XML syntax to facilitate use of XPath within URIs and XML attribute values

XML Path Language (XPath)
- Used, for example, as a pattern in XSLT:

    <xsl:template match="chapter/title"> … </xsl:template>
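
To see which nodes a path like chapter/title picks out, the following sketch evaluates it with Python's standard `xml.etree.ElementTree`, which implements only a subset of XPath. The sample document is invented for illustration. In XSLT the pattern matches any title whose parent is a chapter; evaluating the same path from the book element happens to select the same nodes in this sample.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration.
doc = ET.fromstring(
    "<book>"
    "<title>Book title</title>"
    "<chapter><title>Introduction</title><para>Text</para></chapter>"
    "<chapter><title>Methods</title></chapter>"
    "</book>"
)

# 'chapter/title' relative to <book>: title children of chapter children.
# The book's own <title> is not selected, because its parent is not a chapter.
chapter_titles = [t.text for t in doc.findall("chapter/title")]
```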

XML Path Language (XPath)
- XPath operates on an XML document as a tree
- Every element in an XML document has a specific and unique contextual location
  - any element in the document can be identified by the steps it would take to reach it, either from the root element, or from some other fixed starting location

Data model of XPath
- A conceptual model: no particular implementation is assumed
- A tree contains nodes (7 types):
  - root nodes
  - element nodes
  - text nodes
  - attribute nodes
  - namespace nodes
  - processing instruction nodes
  - comment nodes

Data model
- Every node has a string-value
- Document order is defined on all the nodes in the document:
  - the root node is the first node
  - element nodes occur in the order of their start tags
  - attribute nodes and namespace nodes come before the children of the element
  - namespace nodes come before attribute nodes
- Relationships: parent - child, ancestor - descendant

Root node
- The root of the tree
- The element node for the document element is a child of the root node
- Other children:
  - processing instruction nodes
  - comment nodes
- String-value: concatenation of the string-values of all text node descendants of the root node in document order

Element nodes
- There is an element node for every element in the document
- Children:
  - element nodes (subelements)
  - comment nodes
  - processing instruction nodes
  - text nodes (content)
- Entity references are expanded
- String-value:
  - concatenation of the string-values of all text node descendants of the element node in document order
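
The string-value of an element can be illustrated with Python's standard `xml.etree.ElementTree`: `itertext()` walks the text of an element and its descendants in document order. The helper name `string_value` and the sample element are hypothetical.

```python
import xml.etree.ElementTree as ET

def string_value(element):
    """Concatenate the text of all descendants in document order."""
    return "".join(element.itertext())

# Hypothetical sample element with one nested subelement.
para = ET.fromstring("<para>XPath is <em>compact</em> and non-XML.</para>")
value = string_value(para)
```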

Attribute nodes
- Each element node has an associated set of attribute nodes
  - the element node is the parent of each of these attribute nodes
  - but: an attribute node is not a child of its parent element
- A defaulted attribute is treated the same as a specified attribute

Attribute nodes (continued)
- If an attribute was declared for the element with the default #IMPLIED, but the attribute was not specified on the element, there is no attribute node for this attribute
- String-value: the normalized value as specified by the XML specification

Namespace nodes
- Each element has an associated set of namespace nodes
  - one for each distinct namespace prefix that is in scope for the element
  - one for the default namespace if one is in scope for the element
- The element is the parent of each of these namespace nodes, but a namespace node is not a child of its parent element
- String-value: the namespace URI

PI nodes, comment nodes
- There is a processing instruction node for every processing instruction
- There is a comment node for every comment
  - string-value: the content of the comment, not including <!-- and -->
- … except for PIs and comments in document type declarations

Text nodes
- Character data is grouped into text nodes
- As much character data as possible is grouped into each text node
- String-value: the character data
- Characters inside comments, processing instructions and attribute values do not produce text nodes

Expressions
- The primary syntactic construct in XPath is the expression
- An expression is evaluated to yield an object, which has one of the following types:
  - node-set (unordered)
  - boolean (true or false)
  - number
  - string

Location paths
- Relative location paths
  - a path that starts from an existing location
  - a sequence of one or more location steps separated by /
  - steps are composed from left to right
  - the initial step selects a set of nodes relative to the context node
  - each node in this set is used as a context node for the following step
  - the sets of nodes identified by that step are unioned together
  - e.g. child::div/child::para
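
The left-to-right composition of steps can be sketched in a few lines of Python on top of `xml.etree.ElementTree`. Only the child axis with a name test is handled. The function names `child_step` and `evaluate` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

def child_step(context_nodes, name):
    """Apply one child::name step to every context node and union the results.
    Distinct context nodes never share children, so no duplicates can arise."""
    selected = []
    for node in context_nodes:
        for child in node:            # iterating an Element yields its children
            if child.tag == name:
                selected.append(child)
    return selected

def evaluate(context_node, path):
    """Evaluate a relative path made of child steps, such as 'div/para'."""
    nodes = [context_node]
    for step in path.split("/"):
        nodes = child_step(nodes, step)
    return nodes

# Hypothetical sample document.
doc = ET.fromstring(
    "<doc><div><para>a</para><para>b</para></div>"
    "<div><para>c</para></div><para>d</para></doc>"
)
result = [p.text for p in evaluate(doc, "div/para")]
```

The para element with text d is not selected, because it is a child of doc rather than of a div.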

Location paths (continued)
- An absolute location path consists of / optionally followed by a relative location path
- A / by itself selects the root node of the document
- If / is followed by a relative path, then the location path selects the set of nodes that would be selected by the relative location path relative to the root node

Location steps
- A location step has three parts:
  - an axis: the tree relationship between the nodes selected by the location step and the context node
  - a node test: the node type and name of the nodes selected by the location step
  - zero or more predicates, which use arbitrary expressions to further refine the set of nodes selected by the location step
- Syntax:
  - axis::node-test[expr]…
  - e.g. child::para[position()=1]

Location steps (continued)
- The node-set selected by the location step is the node-set that results from
  - generating an initial node-set from the axis and node test, and then
  - filtering that node-set by each of the predicates in turn
- The initial node-set consists of the nodes
  - having the relationship to the context node specified by the axis, and
  - having the node type and name specified by the node test

Axes
- child
- descendant
- parent
- ancestor
- following-sibling
  - empty, if the context node is an attribute node or namespace node
- preceding-sibling
  - empty, if the context node is an attribute node or namespace node

Axes (continued)
- following
  - all nodes in the same document as the context node that are after the context node in document order, excluding any descendants and excluding attribute nodes and namespace nodes
- preceding
  - all nodes in the same document as the context node that are before the context node in document order, excluding any ancestors and excluding attribute nodes and namespace nodes

Axes (continued)
- attribute
  - attribute nodes of the context node
  - empty unless the context node is an element
- namespace
  - namespace nodes of the context node
  - empty unless the context node is an element
- self
  - the context node itself
- descendant-or-self, ancestor-or-self

Axes (continued)
- The ancestor, descendant, following, preceding and self axes partition a document (ignoring attribute and namespace nodes): they do not overlap and together they contain all the nodes in the document
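
The partition property can be checked on a small tree. The sketch below uses Python's `xml.etree.ElementTree` and models element nodes only (no text, comment or processing instruction nodes). The function name `axes` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample tree; element names are arbitrary.
doc = ET.fromstring("<a><b><c/><d/></b><e><f/></e><g/></a>")
order = list(doc.iter())                        # elements in document order
parent = {child: p for p in order for child in p}

def axes(node):
    """Return the five axes of `node` as lists of elements."""
    ancestors = []
    current = node
    while current in parent:                    # walk up until the top element
        current = parent[current]
        ancestors.append(current)
    descendants = list(node.iter())[1:]         # iter() yields node itself first
    i = order.index(node)
    preceding = [n for n in order[:i] if n not in ancestors]
    following = [n for n in order[i + 1:] if n not in descendants]
    return {"self": [node], "ancestor": ancestors, "descendant": descendants,
            "preceding": preceding, "following": following}
```

For every element of the sample tree, the five lists together contain each element exactly once, which is the partition stated above.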

Node tests
- Every axis has a principal node type
  - for the attribute axis: attribute
  - for the namespace axis: namespace
  - for other axes: element
- A node test
  - both name and type have to match
  - child::para
    - selects the para element children of the context node
    - if the context node has no para children, it will select an empty set of nodes

Node tests (continued)
- The node test node() matches any node
- The node tests text(), comment() and processing-instruction() match any node of these specific types

Node tests (continued)
- A node test * is true for any node of the principal node type
  - child::*
    - selects all element children of the context node
  - attribute::*
    - selects all attributes of the context node
- text()
  - true for any text node
- comment()
- processing-instruction()
  - may have an argument = name of the PI

Abbreviated syntax
- child:: can be omitted from a location step; child is the default axis
  - child::div/child::para -> div/para
- attribute:: -> @
  - child::para[attribute::type="warning"] -> para[@type="warning"]
- /descendant-or-self::node()/ -> //
  - //para selects any para element in the document
  - div//para selects all para descendants of div children (of the context node)

Abbreviated syntax (continued)
- self::node() -> . (full stop)
  - .//para selects all para descendant elements of the context node
- parent::node() -> ..
  - ../title selects the title children of the parent of the context node
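
Several of the abbreviations can be tried with Python's standard `xml.etree.ElementTree`, which supports a subset of the abbreviated syntax (child steps, `//`, `.` and `..`) but not the unabbreviated `axis::` forms. The sample document is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring(
    "<doc><title>Top</title>"
    "<div><title>Part</title><para>a</para><sub><para>b</para></sub></div>"
    "<para>c</para></doc>"
)

# div/para: para children of div children (child::div/child::para)
direct = [p.text for p in doc.findall("div/para")]
# div//para: all para descendants of div children
nested = [p.text for p in doc.findall("div//para")]
# .//para: all para descendants of the context node
anywhere = [p.text for p in doc.findall(".//para")]
# div/para/../title: title children of the parent of each selected para
parent_titles = [t.text for t in doc.findall("div/para/../title")]
```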

Predicates
- An axis is either a forward axis or a reverse axis
  - forward axis: an axis that only ever contains the context node or nodes that are after the context node in document order
  - reverse axis: an axis that only ever contains the context node or nodes that are before the context node in document order

Predicates (continued)
- The proximity position of a member of a node-set with respect to an axis:
  - the position of the node in the node-set ordered in
    - document order if the axis is a forward axis
    - reverse document order if the axis is a reverse axis
  - the first position is 1
- A predicate filters a node-set to produce a new node-set
  - for each node in the node-set, the predicate expression is evaluated with that node as the context node and with the proximity position of the node in the node-set as the context position

Predicates (continued)
- If the predicate expression evaluates to true for that node, the node is included in the new node-set
- The result of the evaluation is converted to a boolean
  - if the result is a number, the result is true if the number is equal to the context position
  - otherwise, the result will be converted as if by a call to the function boolean (see below)
  - e.g. para[3] equals para[position()=3]
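
The effect of proximity positions on a numeric predicate can be sketched as follows. The sketch uses plain strings as stand-ins for nodes, and the function name `apply_position_predicate` is hypothetical.

```python
def apply_position_predicate(nodes_in_document_order, n, reverse_axis=False):
    """Keep the node whose 1-based proximity position equals n."""
    ordered = list(nodes_in_document_order)
    if reverse_axis:
        ordered.reverse()          # reverse axes count positions backwards
    return [node for pos, node in enumerate(ordered, start=1) if pos == n]

# Stand-in data: plain strings instead of real nodes, in document order.
para_children = ["para-1", "para-2", "para-3"]     # child axis (forward)
ancestors = ["doc", "chapter", "section"]          # ancestor axis (reverse)

third_para = apply_position_predicate(para_children, 3)              # para[3]
nearest = apply_position_predicate(ancestors, 1, reverse_axis=True)  # ancestor::*[1]
```

On the reverse ancestor axis, position 1 is the nearest ancestor (here section), not the first ancestor in document order.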

Predicates (continued)
- Contained element tests
  - the name of an element can appear in a predicate filter -> represents an element that must be present as a child
  - note[title]
    - a note element is only selected if it directly contains a title element
  - note[title="first note"]
    - true, if the content of the title element is 'first note'
  - note[id("123")]

Predicates (continued)
- Attribute tests
  - para[@type='secret']
    - every 'para' element with a 'type' attribute value of 'secret'

Expressions
- Boolean operators: or, and
- Comparisons: =, !=, <, <=, >, >=
  - in XML documents, < has to be escaped as the entity reference &lt; (the predefined 'lt' entity)
- Numeric operators: +, -, *, div, mod

Core functions
- Node set functions
  - number last()
  - number position()
  - number count(node-set)
  - node-set id(object)
    - e.g. id("foo") selects the element with unique ID foo

Core functions
- String functions
  - string(object?)
    - converts an object to a string
    - e.g. negative infinity -> -Infinity
  - string concat(string, string, string*)
    - returns the concatenation of its arguments
  - boolean starts-with(string, string)
    - returns true if the first argument string starts with the second argument string

Core functions
- String functions (continued)
  - boolean contains(string, string)
    - returns true if the first string contains the second string
  - string substring-before(string, string)
    - e.g. substring-before("1999/04/01", "/") returns 1999
  - string substring-after(string, string)
  - string substring(string, number, number?)
    - e.g. substring("12345", 2, 3) returns "234"
    - e.g. substring("12345", 2) returns "2345"

Core functions
- String functions (continued)
  - number string-length(string?)
    - default: the string-value of the context node
  - string normalize-space(string?)
    - returns the string with whitespace normalized by stripping leading and trailing whitespace and replacing sequences of whitespace characters by a single space
  - string translate(string, string, string)
    - e.g. translate("bar", "abc", "ABC") returns BAr
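
The string functions above can be approximated in Python to check the examples. These are simplified sketches, not a conforming implementation: `substring` assumes integer arguments (XPath also rounds non-integers), the `substring_*` helpers assume a non-empty second argument, and `normalize_space` uses Python's notion of whitespace. All function names are hypothetical.

```python
def substring_before(s, sep):
    """Part of s before the first occurrence of sep, or '' if absent."""
    head, found, _ = s.partition(sep)
    return head if found else ""

def substring_after(s, sep):
    """Part of s after the first occurrence of sep, or '' if absent."""
    _, found, tail = s.partition(sep)
    return tail if found else ""

def substring(s, start, length=None):
    """Characters at 1-based positions p with start <= p < start + length."""
    first = max(start, 1)
    end = len(s) + 1 if length is None else start + length
    return s[first - 1:max(end - 1, first - 1)]

def normalize_space(s):
    """Strip leading/trailing whitespace and collapse inner runs to one space."""
    return " ".join(s.split())

def translate(s, src, dst):
    """Replace each character of src by the character at the same index in dst;
    characters of src with no counterpart in dst are removed."""
    table = {}
    for i, ch in enumerate(src):
        if ch not in table:                        # first occurrence wins
            table[ch] = dst[i] if i < len(dst) else None
    mapped = (table.get(ch, ch) for ch in s)
    return "".join(ch for ch in mapped if ch is not None)
```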

Core functions
- Boolean functions
  - boolean(object)
    - converts the argument to a boolean
    - e.g. a number is true if and only if it is neither positive or negative zero nor NaN (not-a-number)
    - e.g. a node-set is true if and only if it is non-empty
  - boolean not(boolean)
  - boolean true(), boolean false()
  - boolean lang(string)
    - based on the xml:lang attribute

Core functions
- Number functions
  - number(object?)
    - converts its argument to a number
    - e.g. boolean true -> 1; boolean false -> 0
    - e.g. a string -> mathematical value or NaN
  - number sum(node-set)
  - number floor(number), number ceiling(number), number round(number)
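
The conversion rules of boolean() and number() can be sketched in Python. In this sketch a node-set is modelled as a Python list, which is an assumption made for illustration. The function names `xpath_boolean` and `xpath_number` are hypothetical, and node-set arguments to number() are not handled.

```python
import math
import re

# XPath 1.0 number syntax: optional minus, digits with optional fraction,
# or a fraction alone, surrounded by optional whitespace. No exponents.
_NUMBER = re.compile(r"\s*-?(?:\d+(?:\.\d*)?|\.\d+)\s*")

def xpath_boolean(value):
    """Sketch of boolean(): bool, number, string or list (as node-set)."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0 and not math.isnan(value)
    return len(value) > 0           # string or node-set: true iff non-empty

def xpath_number(value):
    """Sketch of number() for booleans, numbers and strings."""
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    if _NUMBER.fullmatch(value):
        return float(value.strip())
    return math.nan                 # any other string converts to NaN
```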

Examples
- para selects the para element children of the context node
- * selects all element children
- text() selects all text node children
- @name selects the name attribute
- @* selects all the attributes
- para[1] selects the first para child
- para[last()] selects the last para child
- */para selects all para grandchildren
- /doc/chapter[5]/section[2] selects the second section of the fifth chapter of the doc (root)

Examples (continued)
- chapter//para selects the para element descendants of the chapter element children
- //para selects all the para descendants of the document root and thus selects all the para elements in the same document as the context node
- //olist/item selects all the item elements in the same document as the context node that have an olist parent
- . selects the context node
- .//para selects the para element descendants
- .. selects the parent
- ../@lang selects the lang attribute of the parent

Examples (continued)
- para[@type="warning"] selects all para children of the context node that have a type attribute with value warning
- para[@type="warning"][5] selects the fifth para child of the context node that has a type attribute with value warning
- para[5][@type="warning"] selects the fifth para child of the context node if that child has a type attribute with value warning
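
The order of predicates matters, which the following sketch spells out in plain Python. It uses position 2 instead of 5 to keep the invented sample small. The predicates are applied by hand rather than through ElementTree's path syntax, because ElementTree's position predicate counts among all same-named siblings rather than within the already filtered set. The helper names `is_warning` and `nth` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: two warning paragraphs with a plain one in between.
doc = ET.fromstring(
    "<doc>"
    "<para type='warning'>w1</para>"
    "<para>n1</para>"
    "<para type='warning'>w2</para>"
    "</doc>"
)
paras = doc.findall("para")                   # child::para in document order

def is_warning(p):
    """Predicate [@type="warning"]."""
    return p.get("type") == "warning"

def nth(nodes, n):
    """Numeric predicate [n]: keep the node at 1-based position n, if any."""
    return nodes[n - 1:n]

# para[@type="warning"][2]: filter by attribute first, then take position 2.
filter_then_pick = [p.text for p in nth([p for p in paras if is_warning(p)], 2)]
# para[2][@type="warning"]: take position 2 first, then test the attribute.
pick_then_filter = [p.text for p in nth(paras, 2) if is_warning(p)]
```

The first expression finds the second warning paragraph, while the second finds nothing, because the second para child has no type attribute.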

Examples (continued)
- chapter[title="Introduction"] selects the chapter children of the context node that have one or more title children with string-value equal to Introduction
- chapter[title] selects the chapter children of the context node that have one or more title children
- employee[@secretary and @assistant] selects all the employee children of the context node that have both a secretary attribute and an assistant attribute
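
The three examples above can be tried on an invented sample with Python's standard `xml.etree.ElementTree`. Its XPath subset covers the two chapter predicates; to my knowledge it does not include the `and` operator, so the employee example is written out in plain Python instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the id and name attributes only label results.
doc = ET.fromstring(
    "<doc>"
    "<chapter id='c1'><title>Introduction</title></chapter>"
    "<chapter id='c2'><title>Methods</title></chapter>"
    "<chapter id='c3'/>"
    "<employee name='A' secretary='S' assistant='T'/>"
    "<employee name='B' secretary='S'/>"
    "</doc>"
)

# chapter[title="Introduction"]
intro = [c.get("id") for c in doc.findall("chapter[title='Introduction']")]
# chapter[title]
titled = [c.get("id") for c in doc.findall("chapter[title]")]
# employee[@secretary and @assistant], spelled out by hand
both = [e.get("name") for e in doc.findall("employee")
        if "secretary" in e.attrib and "assistant" in e.attrib]
```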