Appendix A: XML and XML Schema

Service-Oriented Computing: Semantics, Processes, Agents. Munindar P. Singh and Michael N. Huhns, Wiley, 2005

Highlights of this Chapter
  • XML and Vocabularies
  • Well-Formedness
  • Namespaces and Qualified Names
  • XML Extensions
  • XML Schema
  • XML Query Languages
  • XPath
  • XSLT
  • Limitations

Brief Introduction to XML
  • Basics
  • Parsing
  • Storage
  • Transformations

Markup History
  • None
  • Ad hoc tags
  • SGML (Standard Generalized Markup Language): complex, few reliable tools
  • HTML (HyperText Markup Language): simple, unprincipled, mixes structure and display
  • XML (eXtensible Markup Language): simple, yet extensible subset of SGML to capture new vocabularies
      • Machine processible
      • Comprehensible to people: easier debugging

XML Basics and Namespaces

    <?xml version="1.0"?> <!-- not part of the document per se -->
    <arbitrary:toptag xmlns="http://one.default.namespace/if-needed"
                      xmlns:arbitrary="http://wherever.it.might.be/arbit-ns"
                      xmlns:random="http://another.one/random-ns">
      <arbitrary:atag attr1="v1" attr2="v2">
        Optional text also known as PCDATA
        <arbitrary:btag attr1="v1" attr2="v2"/>
      </arbitrary:atag>
      <random:simple_tag/>
      <random:atag attr3="v3"/> <!-- compare with arbitrary:atag above -->
    </arbitrary:toptag>
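
To see that the two atag elements above really are different names, the following sketch parses the slide's document with a namespace-aware DOM parser and reports the namespace URI behind each prefix. The class and method names (NamespaceDemo, namespaceOf) are illustrative assumptions, not part of the original slides; only standard JAXP calls are used.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class NamespaceDemo {
    // The slide's document, with its comments dropped so the string stays short.
    static final String XML =
        "<?xml version=\"1.0\"?>"
        + "<arbitrary:toptag xmlns=\"http://one.default.namespace/if-needed\""
        + " xmlns:arbitrary=\"http://wherever.it.might.be/arbit-ns\""
        + " xmlns:random=\"http://another.one/random-ns\">"
        + "<arbitrary:atag attr1=\"v1\" attr2=\"v2\">Optional text"
        + "<arbitrary:btag attr1=\"v1\" attr2=\"v2\"/></arbitrary:atag>"
        + "<random:simple_tag/><random:atag attr3=\"v3\"/>"
        + "</arbitrary:toptag>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true); // JAXP parsers ignore namespaces unless asked
            return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Namespace URI of the i-th element (document order) with this local name. */
    static String namespaceOf(String localName, int i) {
        // "*" matches elements in any namespace.
        Element e = (Element) parse(XML).getElementsByTagNameNS("*", localName).item(i);
        return e.getNamespaceURI();
    }

    public static void main(String[] args) {
        System.out.println(namespaceOf("atag", 0)); // the arbitrary: one
        System.out.println(namespaceOf("atag", 1)); // the random: one
    }
}
```

The qualified name is the pair (namespace URI, local name); the prefix is only a local abbreviation, which is why the lookup goes by local name and then inspects the URI.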

Parsing and Validating
  • An XML document maps to a parse tree.
      • Each tag ends once: nesting structure (one root)
      • Each attribute occurs at most once; quoted string
  • Well-formed XML documents can be parsed
  • Applications have an explicit or implicit syntax for their particular XML-based tags
      • If explicit, may be expressed in DTDs and XML Schemas
          • Best referred to definitions elsewhere
          • XML Schemas, expressed in XML, are superior to DTDs
  • When docs are produced by external components, they should be validated
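
The well-formedness rules above (tags nest and close once, a single root, no repeated attribute) can be checked by simply trying to parse. This is a minimal sketch with a hypothetical helper name, isWellFormed; it checks well-formedness only, not validity against a DTD or schema.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    /** True if the text parses as XML; a parse error means it is not well-formed. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder b = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            b.setErrorHandler(new DefaultHandler());
            b.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed
        } catch (Exception e) {
            throw new IllegalStateException(e); // parser configuration or I/O problem
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b/></a>"));      // properly nested
        System.out.println(isWellFormed("<a><b></a>"));       // b never closed
        System.out.println(isWellFormed("<a x='1' x='2'/>")); // attribute repeated
        System.out.println(isWellFormed("<a/><b/>"));         // two roots
    }
}
```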

XML Schema
  • A data definition language for XML: defines a notion of schema validity
      • Same syntax as regular XML documents
      • Local scoping of subelement names
      • Incorporates namespaces
  • Types
      • Primitive (built-in): string, integer, float, date, …
      • Primitive (built-in): ID (key), IDREF (foreign key)
      • simpleType constructors: list, union
      • Restrictions: intervals, lengths, enumerations, regex patterns
  • Flexible ordering of elements
  • Key and referential integrity constraints
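
As an illustration of restrictions on built-in types, the sketch below validates instances against a small schema using the standard javax.xml.validation API. The schema itself (a Song element with an enumerated genre and a year interval) is a hypothetical example made up for this purpose, as are the class and method names.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SimpleTypeDemo {
    // Hypothetical schema: an enumeration and an interval restriction.
    static final String XSD =
        "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xsd:simpleType name=\"Genre\"><xsd:restriction base=\"xsd:string\">"
        + "<xsd:enumeration value=\"jazz\"/><xsd:enumeration value=\"rock\"/>"
        + "</xsd:restriction></xsd:simpleType>"
        + "<xsd:simpleType name=\"Year\"><xsd:restriction base=\"xsd:integer\">"
        + "<xsd:minInclusive value=\"1900\"/><xsd:maxInclusive value=\"2100\"/>"
        + "</xsd:restriction></xsd:simpleType>"
        + "<xsd:element name=\"Song\"><xsd:complexType>"
        + "<xsd:attribute name=\"genre\" type=\"Genre\" use=\"required\"/>"
        + "<xsd:attribute name=\"year\" type=\"Year\"/>"
        + "</xsd:complexType></xsd:element>"
        + "</xsd:schema>";

    static Validator newValidator() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)))
                    .newValidator();
        } catch (SAXException e) {
            throw new IllegalStateException("schema itself is broken", e);
        }
    }

    /** True if the instance document is schema-valid. */
    static boolean isValid(String xml) {
        try {
            // With no error handler set, validation errors are thrown.
            newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<Song genre='jazz' year='1975'/>"));
        System.out.println(isValid("<Song genre='polka'/>"));            // not in the enumeration
        System.out.println(isValid("<Song genre='jazz' year='1800'/>")); // outside the interval
    }
}
```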

XML Schema: complexType
  • Specifies types of elements with structure:
      • Must use a compositor if ≥ 1 subelements
      • Subelements with types
      • Min and max occurrences (default 1) of subelements
  • Elements with text content not easy: ignore
  • EMPTY elements: easy. Example?

XML Schema: Compositors
  • Sequence: ordered
      • Can occur within other compositors
      • Allows varying min and max occurrence
  • All: unordered
      • Must occur directly below root element
      • Max occurrence of each element is 1
  • Choice: exclusive or
      • Can occur within other compositors
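
The sketch below shows sequence and choice working together, with a choice nested inside a sequence and an optional repeating subelement. The Song schema and its element names (title, singer, group, writer) are assumptions invented for illustration; validation uses the standard javax.xml.validation API.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class CompositorDemo {
    // Hypothetical schema: title, then exactly one of singer or group, then any number of writers.
    static final String XSD =
        "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xsd:element name=\"Song\"><xsd:complexType><xsd:sequence>"
        + "<xsd:element name=\"title\" type=\"xsd:string\"/>"
        + "<xsd:choice>"
        + "<xsd:element name=\"singer\" type=\"xsd:string\"/>"
        + "<xsd:element name=\"group\" type=\"xsd:string\"/>"
        + "</xsd:choice>"
        + "<xsd:element name=\"writer\" type=\"xsd:string\""
        + " minOccurs=\"0\" maxOccurs=\"unbounded\"/>"
        + "</xsd:sequence></xsd:complexType></xsd:element>"
        + "</xsd:schema>";

    static Schema schema() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema itself is broken", e);
        }
    }

    /** True if the instance document is schema-valid. */
    static boolean isValid(String xml) {
        try {
            schema().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // In order, one branch of the choice: accepted.
        System.out.println(isValid("<Song><title>t</title><group>g</group></Song>"));
        // Sequence is ordered: swapping the subelements is rejected.
        System.out.println(isValid("<Song><group>g</group><title>t</title></Song>"));
        // Choice is exclusive: both branches together are rejected.
        System.out.println(isValid(
            "<Song><title>t</title><singer>s</singer><group>g</group></Song>"));
    }
}
```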

XML Schema: Key Namespaces
  • http://www.w3.org/2001/XMLSchema
      • Conventional prefix: xsd
      • Terms for defining schemas: schema, element, attribute, …
      • The tag schema has an attribute targetNamespace
  • http://www.w3.org/2001/XMLSchema-instance
      • Conventional prefix: xsi
      • Terms for use in instances: schemaLocation, nil
  • targetNamespace: user-defined

XML Schema Instance Doc

    <Music xmlns="http://a.b.c/Muse"
           xmlns:xsi="the standard-xsi"
           xsi:schemaLocation="a-schema-as-a-URI a-schema-location-as-a-URL">
      …
    </Music>

  • Define null values as <aTag xsi:nil="true"/>

Creating Schema Docs: 1

    <schema xmlns="the-standard-xsd" targetNamespace="the-target">
      <include schemaLocation="part-one.xsd"/>
      <include schemaLocation="part-two.xsd"/> <!-- schemaLocation as in xsd, not xsi -->
    </schema>

  • Included schemas go into the same namespace as the including schema.

Creating Schema Docs: 2
  • Use imports instead of include
      • Specify namespaces from which schemas are to be imported
      • Location of schemas not required and may be ignored if provided

Document Object Model (DOM)
  • Basis for parsing XML, which provides a node-labeled tree in its API
      • Conceptually simple: traverse by requesting tag, its attribute values, and its children
      • Processing program reflects document structure
      • Can edit documents
      • Inefficient for large documents: parses them first entirely to build the tree even if a tiny part is needed

DOM Example [Simeoni 2003]

    // d is an org.w3c.dom.Document produced by a DOM parser
    Element s = d.getDocumentElement();
    NodeList l = s.getElementsByTagName("member");
    Element m = (Element) l.item(0);
    String code = m.getAttribute("code");   // getAttribute returns a String
    NodeList kids = m.getChildNodes();
    Node kid = kids.item(0);
    // the cast assumes the first child is an element, not whitespace text
    String tagName = ((Element) kid).getTagName();
    …
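
A self-contained variant of the fragment above might look as follows. The input document (a staff root holding member elements with a name child) is an assumption made up for this sketch, since the slide does not show its data, and the loop skips non-element children rather than casting the first child directly.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomDemo {
    // Hypothetical input; the whitespace shows why kids.item(0) may be a text node.
    static final String XML =
        "<staff>\n"
        + "  <member code=\"m1\">\n    <name>Ann</name>\n  </member>\n"
        + "  <member code=\"m2\">\n    <name>Bob</name>\n  </member>\n"
        + "</staff>";

    /** Returns {code, tag name of first element child, its text} for the first member. */
    static String[] firstMember(String xml) {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element s = d.getDocumentElement();
            NodeList l = s.getElementsByTagName("member");
            Element m = (Element) l.item(0);
            String code = m.getAttribute("code");
            NodeList kids = m.getChildNodes();
            for (int i = 0; i < kids.getLength(); i++) {
                Node kid = kids.item(i);
                if (kid.getNodeType() == Node.ELEMENT_NODE) { // skip whitespace text
                    return new String[] {
                        code, ((Element) kid).getTagName(), kid.getTextContent() };
                }
            }
            return new String[] { code, null, null };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", firstMember(XML)));
    }
}
```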

Simple API for XML (SAX)
  • Parser generates a sequence of events:
      • startElement, endElement, …
  • Programmer implements these as callbacks
      • More control for the programmer
      • Processing program does not reflect document structure

SAX Example [Simeoni 2003]

    class MemberProcess extends DefaultHandler {
      // code, name, inProject, and buffer are assumed to be fields declared elsewhere
      // n is the local name, which is filled in only by a namespace-aware parser
      public void startElement(String uri, String n, String qName, Attributes attrs) {
        if (n.equals("member")) code = attrs.getValue("code");
        if (n.equals("project")) inProject = true;
        buffer.reset();
      }
      public void endElement(String uri, String n, String qName) {
        if (n.equals("project")) inProject = false;
        if (n.equals("member") && !inProject) name = buffer.toString().trim();
      }
    }
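
The handler above omits its fields and the characters() callback that fills the buffer. A runnable sketch along the same lines is given below; the input document, the StringBuilder buffer, and the map of results are assumptions added for illustration. It matches on qName because the default JAXP SAX parser is not namespace-aware.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxDemo {
    // Hypothetical input: members carry their name as text; projects refer to members.
    static final String XML =
        "<staff><member code=\"m1\">Ann</member><member code=\"m2\">Bob</member>"
        + "<project><member code=\"m1\"/></project></staff>";

    static class MemberHandler extends DefaultHandler {
        final Map<String, String> names = new LinkedHashMap<>();
        private final StringBuilder buffer = new StringBuilder();
        private boolean inProject = false;
        private String code;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if (qName.equals("member")) code = attrs.getValue("code");
            if (qName.equals("project")) inProject = true;
            buffer.setLength(0); // start collecting text afresh
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            buffer.append(ch, start, length); // may be called several times per text node
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("project")) inProject = false;
            if (qName.equals("member") && !inProject) names.put(code, buffer.toString().trim());
        }
    }

    /** Maps member code to member name, ignoring member references inside projects. */
    static Map<String, String> memberNames(String xml) {
        try {
            MemberHandler h = new MemberHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), h);
            return h.names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(memberNames(XML));
    }
}
```

Note how the state that DOM keeps in the tree (whether we are inside a project) has to be tracked by hand in the inProject flag, which is the sense in which the processing program does not reflect document structure.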

Programming with XML
  • Current approaches concentrate on structure but ignore meaning
      • Difficult to construct and maintain
      • Treat everything as a string
      • Inadequate type checking can hide errors
  • Emerging approaches (e.g., JAXB) provide superior binding from XML to programming languages
      • Primitives such as unmarshal to materialize an object from XML

Uses of XML
  • Exchanging information across software components
  • Storing information in nonproprietary format
  • XML documents represent structured descriptions:
      • Products, services, catalogs
      • Contracts
      • Queries, requests, invocations (as in SOAP)
  • Data-centric versus document-centric (irregular, heterogeneous data, depend on entire doc for app-specific meaning) views

Data-Centric View

    <relation>
      <tuple><attr1>V11</attr1>… <attrn>V1n</attrn></tuple>
      …
      <tuple><attr1>Vm1</attr1>… <attrn>Vmn</attrn></tuple>
    </relation>

  • Extract and store into DB via mapping to DB model
  • Regular, homogeneous tags
  • May be expensive if repeatedly parsed and instantiated
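
One way to picture the mapping to a DB model is to read each tuple into a row keyed by attribute name, which could then be handed to whatever database layer is in use. The sketch below does only the extraction step; the concrete tags (title, genre) and the class name are hypothetical stand-ins for attr1 … attrn.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RelationDemo {
    // Hypothetical relation with two attributes per tuple.
    static final String XML =
        "<relation>"
        + "<tuple><title>Kashmir</title><genre>rock</genre></tuple>"
        + "<tuple><title>So What</title><genre>jazz</genre></tuple>"
        + "</relation>";

    /** One map per tuple, from attribute (tag) name to value, in document order. */
    static List<Map<String, String>> rows(String xml) {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<Map<String, String>> rows = new ArrayList<>();
            NodeList tuples = d.getDocumentElement().getElementsByTagName("tuple");
            for (int i = 0; i < tuples.getLength(); i++) {
                Map<String, String> row = new LinkedHashMap<>();
                NodeList attrs = tuples.item(i).getChildNodes();
                for (int j = 0; j < attrs.getLength(); j++) {
                    Node a = attrs.item(j);
                    if (a.getNodeType() == Node.ELEMENT_NODE) {
                        row.put(a.getNodeName(), a.getTextContent());
                    }
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        rows(XML).forEach(System.out::println);
    }
}
```

Because the tags are regular and homogeneous, every row ends up with the same keys, which is what makes the relational mapping straightforward.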

Document-Centric View
  • Storing docs in DBs
      • Use character large objects (clobs) within DB
      • Store paths to external files containing docs
  • Combine with some structured elements with search conditions for both structured elements and unstructured clobs or files
  • Heterogeneity also complicates mappings to traditional typed OO programming languages

Directions
  • Limitations of XML
      • Doesn't represent meaning
      • Enables multiple representations for the same information; transform if models known
  • Trends: sophisticated approaches for
      • Querying and manipulating XML, e.g., XSLT
      • Binding to PLs and DBs
      • Semantics, e.g., RDF, DAML, OWL, …

XML Query Languages
  • XPath
  • XPointer
  • XSLT
  • XQuery

XPath
  • Model XML documents as trees with nodes
      • Elements
      • Attributes
      • Text (PCDATA)
      • Comments
      • Root node: above root of document

Achtung!
  • Parent in XPath is like parent as traditionally in computer science
  • Child in XPath is confusing:
      • An attribute is not the child of its parent
      • Makes a difference for certain kinds of recursion (e.g., apply-templates discussed in XSLT)
  • Our terminology is based on the traditional terminology:
      • e-children, a-children, t-children
      • Sets via et- or ta-, etc.

XPaths
  • Leading /: root
  • /: indicates walking down a tree
  • .: current node
  • ..: parent node
  • @attr: to access values for the given attribute
  • text()
  • comment()

XPath Navigation
  • Select children according to position, e.g., [j], where j could be 1 … last()
  • Descendant-or-self operator, //
      • .//elem finds all elems under the current node
      • //elem finds all elems in the document
  • Ancestors: not needed in this course
  • Wildcard, *:
      • collects e-children of the node where it is applied, but omits the t-children
      • @*: finds all attribute values

XPath Queries
  • Incorporate selection conditions in XPath
      • Attributes: //Song[@genre="jazz"]
      • Elements: //Song[starts-with(.//group, "Led")]
      • Existence of attribute: //Song[@genre]
      • Existence of subelement: //Song[group]
      • Boolean operators: and, not, or
      • Set operator: union (|); no others
      • Comparison operators: >, <, …
      • String functions: contains(), concat(), string-length(), …
      • Aggregates: sum(), count()
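
The queries above can be tried with the standard javax.xml.xpath API. In this sketch the Music document and the helper names (select, number) are assumptions for illustration; the XPath expressions are the ones from the slide, extended with /title so that the result is easy to print.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathDemo {
    // Hypothetical data: three songs, one without a genre attribute.
    static final String XML =
        "<Music>"
        + "<Song genre=\"jazz\"><title>So What</title><artist>Miles Davis</artist></Song>"
        + "<Song genre=\"rock\"><title>Kashmir</title><group>Led Zeppelin</group></Song>"
        + "<Song><title>Untitled</title></Song>"
        + "</Music>";

    static Document doc() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Text content of every node selected by the expression, in document order. */
    static List<String> select(String expr) {
        try {
            XPath xp = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xp.evaluate(expr, doc(), XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Evaluates an expression that yields a number, such as count(...). */
    static double number(String expr) {
        try {
            XPath xp = XPathFactory.newInstance().newXPath();
            return (Double) xp.evaluate(expr, doc(), XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("//Song[@genre='jazz']/title"));
        System.out.println(select("//Song[starts-with(.//group, 'Led')]/title"));
        System.out.println(select("//Song[@genre]/title"));
        System.out.println(select("//Song[not(@genre)]/title"));
        System.out.println(number("count(//Song)"));
    }
}
```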

XPointer
  • Combines XPath with URLs
  • URL to get to a document; XPath to walk down the document
  • Can be used to formulate queries, e.g.,
      • SongURL#xpointer(//Song[@genre="jazz"])

XSLT
  • A functional programming language
  • A stylesheet specifies transformations on a document

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="URL-to-dot-xsl"?> <!-- the sheet to use -->
    <main-tag>
      …
    </main-tag>

XSLT Stylesheets
  • Use the XSLT namespace, conventionally abbreviated as xsl
  • Includes primitives:
      • copy-of
      • <for-each select="…">
      • <if test="…">
      • <choose>

XSLT Templates: 1
  • A pattern to specify where a given transform should apply
  • This match only works on the root:

        <xsl:template match="/"> … </xsl:template>

  • Only anonymous templates in this course

XSLT Templates: 2
  • Can be applied recursively on the et-children via <xsl:apply-templates/>
  • By default, if no other template matches, recursively apply to et-children of current node (ignores attributes) and to root:

        <xsl:template match="*|/"> <xsl:apply-templates/> </xsl:template>

  • Can over-apply; to override the default, may need an empty template:

        <xsl:template match="…"/> <!-- e.g., match all text() -->
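
The effect of the default rules, and of overriding them with an empty template, can be observed with the standard javax.xml.transform API. The two stylesheets, the Music document, and the helper names below are assumptions made up for this sketch. Both sheets contain a template for title; only the second adds the empty text() template.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltDemo {
    // Hypothetical input document.
    static final String XML =
        "<Music>"
        + "<Song genre=\"jazz\"><title>So What</title><artist>Miles Davis</artist></Song>"
        + "<Song genre=\"rock\"><title>Kashmir</title><group>Led Zeppelin</group></Song>"
        + "</Music>";

    /** Wraps the given templates in a stylesheet that produces plain text. */
    static String sheet(String templates) {
        return "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>" + templates + "</xsl:stylesheet>";
    }

    static final String TITLE_TEMPLATE =
        "<xsl:template match=\"title\">"
        + "<xsl:value-of select=\".\"/><xsl:text>;</xsl:text></xsl:template>";

    // Relies on the built-in rules everywhere except at title elements.
    static final String DEFAULTS = sheet(TITLE_TEMPLATE);

    // Adds an empty template so that text reached by the built-in rules is dropped.
    static final String TITLES_ONLY =
        sheet(TITLE_TEMPLATE + "<xsl:template match=\"text()\"/>");

    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(DEFAULTS, XML));    // artist and group text leak through
        System.out.println(transform(TITLES_ONLY, XML)); // only the titles remain
    }
}
```

In the first run the built-in rules walk into artist and group and copy their text, which is the over-application the slide warns about; the empty text() template in the second run suppresses it.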

XSLT Templates: 3
  • Subtleties of XSLT matching are beyond our scope
  • Discuss some examples

Appendix A Summary
  • XML enables information sharing
  • XML is well established
      • Several aspects are worked out
      • Lots of tools
      • Works with databases and programming languages
  • XML provides a useful substrate for service-oriented computing