ebusiness
XML, XSD, and XSL
Bryan Hogan, IBM

Terminology
- World Wide Web Consortium (W3C)
  - Develops interoperable technologies (specifications, guidelines, software, and tools) to lead the Web to its full potential as a forum for information, commerce, communication, and collective understanding (www.w3.org).
- Document Object Model (DOM)
  - A W3C standard API that describes mechanisms for software developers and Web script authors to access and manipulate parsed XML (and HTML) content. The DOM is both platform-neutral and language-neutral.
- Document Type Definition (DTD)
  - A specification of the elements and attributes that are permitted in an XML document.
- XML Schema
  - A specification of the elements and attributes that are permitted in an XML document, along with the datatypes for these artifacts.

Terminology (continued)
- Well-formed XML document
  - An XML document that conforms to the basic XML syntax rules.
- Valid XML document
  - A well-formed XML document that conforms to the rules specified in a DTD.
- Simple API for XML (SAX)
  - A standard interface for event-based XML parsing.
- Extensible Stylesheet Language Transformations (XSLT)
  - A language for transforming XML documents via the application of stylesheets.

What is XML?
- eXtensible Markup Language
  - Markup language for describing data
  - Simpler than its predecessor, SGML (Standard Generalized Markup Language)
  - More versatile than HTML (HyperText Markup Language)
- Self-describing: XML documents can describe other XML documents (e.g., an XML schema)
- An open standard for defining and sharing data across diverse network topologies (Mark Weitzel, IBM)

Why use XML?
- XML data representation is human-readable, application-neutral, and language-neutral, enabling universal interchange of data.
- XML documents provide an intuitive mechanism for initializing structured data within an application.
- The XML standard is open, so costs are nominal.

HTML and XML side by side

HTML:
    <table border cellspacing=0 cellpadding=5>
      <tr> <th>Team name</th> <th>Score</th> </tr>
      <tr> <td>Clemson</td> <td>15</td> </tr>
      <tr> <td>NCSU</td> <td>17</td> </tr>
    </table>

XML:
    <football_game>
      <home>
        <school>NCSU</school>
        <score>17</score>
      </home>
      <visitor>
        <school>Clemson</school>
        <score>15</score>
      </visitor>
    </football_game>

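Because the XML version names its data rather than its layout, a program can pull out the scores by element name. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `read_scores` helper and the `GAME_XML` constant are illustrative names, not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The football_game document from the slide, inlined for the example.
GAME_XML = """
<football_game>
  <home><school>NCSU</school><score>17</score></home>
  <visitor><school>Clemson</school><score>15</score></visitor>
</football_game>
"""

def read_scores(xml_text):
    """Return {school: score} for the home and visitor teams."""
    root = ET.fromstring(xml_text)
    scores = {}
    for side in ("home", "visitor"):
        team = root.find(side)
        scores[team.findtext("school")] = int(team.findtext("score"))
    return scores

scores = read_scores(GAME_XML)
print(scores)
```

Doing the same against the HTML table would require knowing which column holds the score, which is exactly the coupling to presentation that XML avoids.
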
XML document syntax
- Element start/end tags
    <tag1></tag1> or <tag1/>
    <tag1></TAG1>  <!-- syntax error: XML is case sensitive -->
- Attributes
    <tag1 attribute1="testValue" />
    <tag1 enabled />  <!-- syntax error: allowed in HTML, not XML -->
- Comments
    <!-- This is an XML comment -->
- Entity references
    <tag1 attr1="&Entity1;">
- Processing instructions
    <?xml version="1.0"?>

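The two syntax errors noted above can be confirmed with any XML parser. This is a small sketch, assuming Python's standard `xml.etree.ElementTree`; the `is_well_formed` helper is a hypothetical name introduced for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Samples taken from the slide: two legal forms and two syntax errors.
for sample in ("<tag1></tag1>", "<tag1/>", "<tag1></TAG1>", "<tag1 enabled />"):
    print(sample, "->", is_well_formed(sample))
```

A non-validating parser performs only this kind of check; conformance to a DTD is a separate, stricter test.
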
XML document syntax (continued)
- Character data sections (CDATA)
    <![CDATA[<tag1>test</tag1>]]>
- Document type declarations
    <!DOCTYPE XmlMappingSpec SYSTEM "abtxmap.dtd" >
    <!DOCTYPE XmlMappingSpec SYSTEM "abtxmap.dtd" [
      <!ENTITY entity1 "testValue1" >
      <!ENTITY entity2 "testValue2" >
    ]>

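A CDATA section tells the parser to treat its contents as plain text, so markup inside it is not parsed as elements. A short sketch with Python's standard `xml.etree.ElementTree` (the wrapping `<root>` element is an assumption added so the fragment is a complete document):

```python
import xml.etree.ElementTree as ET

# The CDATA example from the slide, wrapped in a root element.
doc = "<root><![CDATA[<tag1>test</tag1>]]></root>"
root = ET.fromstring(doc)

cdata_text = root.text   # the literal characters inside the CDATA section
child_count = len(root)  # number of child elements under <root>

print(repr(cdata_text), child_count)
```
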
DTD syntax and terminology
- Element type declarations
    <!ELEMENT Street (#PCDATA) >
  Usage:
    <Street>29 Oak Street</Street>
- Attribute list declarations
    <!ATTLIST name firstName CDATA #REQUIRED >
    <!ATTLIST car maker (Ford | GM | BMW) #REQUIRED >
  Usage:
    <name firstName="George" />
    <car maker="Ford" />
    <!-- Not valid: a validating parser will flag an error -->
    <car maker="Mercury" />

DTD syntax and terminology (continued)
- Entity declarations
    <!ENTITY IBM "International Business Machines" >
    <!ENTITY testDoc SYSTEM "http://mywebsite/testDoc.xml" >
  Usage:
    <Company>&IBM;</Company>
    <!-- Inline the contents of the testDoc ENTITY -->
    <root>&testDoc;</root>
- Parameter entity declaration
    <!ENTITY % code_format "CDATA">
- Notation declarations
    <!NOTATION Find_Help SYSTEM "Help System" >

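An internal entity such as `IBM` is expanded by the parser before the application sees the text. The sketch below assumes Python's standard `xml.etree.ElementTree` and declares the entity in an internal DTD subset so the example is self-contained; it does not cover the external `testDoc` entity, which this parser does not fetch.

```python
import xml.etree.ElementTree as ET

# The IBM entity from the slide, declared in an internal DTD subset.
doc = """<!DOCTYPE Company [
  <!ENTITY IBM "International Business Machines">
]>
<Company>&IBM;</Company>"""

company = ET.fromstring(doc)
print(company.text)
```
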
DTD ELEMENT examples
- An element with multiple required subelements.
    <!ELEMENT main (sub1, sub2, sub3) >
- A subelement (sub2) that occurs once or not at all.
    <!ELEMENT main (sub1, sub2?) >
- A subelement (sub2) that occurs one or more times.
    <!ELEMENT main (sub1, sub2+) >
- A subelement (sub2) that occurs zero or more times.
    <!ELEMENT main (sub1, sub2*) >
- An element that contains one of multiple elements.
    <!ELEMENT main (choice1 | choice2 | choice3) >

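The occurrence indicators (`?`, `+`, `*`) and the choice bar behave like regular-expression operators applied to the list of child element names. The following is an illustrative sketch only: the `CONTENT_MODELS` table is a hand translation of the five declarations above, and `children_match` is a hypothetical helper, not how a real validating parser is implemented.

```python
import re

# Each content model from the slide, hand-translated to a regular
# expression over a string of child names, each followed by one space.
CONTENT_MODELS = {
    "(sub1, sub2, sub3)": r"sub1 sub2 sub3 ",
    "(sub1, sub2?)": r"sub1 (sub2 )?",
    "(sub1, sub2+)": r"sub1 (sub2 )+",
    "(sub1, sub2*)": r"sub1 (sub2 )*",
    "(choice1 | choice2 | choice3)": r"(choice1 |choice2 |choice3 )",
}

def children_match(model, child_names):
    """Check a list of child element names against one content model."""
    encoded = "".join(name + " " for name in child_names)
    return re.fullmatch(CONTENT_MODELS[model], encoded) is not None

print(children_match("(sub1, sub2?)", ["sub1"]))
print(children_match("(sub1, sub2+)", ["sub1"]))
```
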
DTD ATTLIST examples
- #REQUIRED indicates that the attribute must be specified in the XML document instance.
    <!ATTLIST main attr1 CDATA #REQUIRED >
- #IMPLIED indicates that the attribute is not required in the XML document instance.
    <!ATTLIST main attr1 CDATA #IMPLIED >
- #FIXED indicates that the attribute has a fixed value, and no other values are acceptable. Since the attribute value is fixed, it does NOT need to be specified in an instance document.
    <!ATTLIST main attr1 CDATA #FIXED "FixedValue" >
- Default value supplied. The default value is used only if no value is supplied by the XML document instance.
    <!ATTLIST main attr1 CDATA "DefaultValue" >

Parsing XML (DOM)
- DOM parsers read XML into a tree structure of nodes. Node types are shown below:
  - DocumentFragment
  - DocumentType
  - EntityReference
  - Element
  - Attr
  - ProcessingInstruction
  - Comment
  - Text
  - CDATASection
  - Entity
  - Notation

DOM Element API
- getAttribute, setAttribute, removeAttribute, getAttributeNode, setAttributeNode, removeAttributeNode, hasAttribute
- getAttributeNS, setAttributeNS, removeAttributeNS, getAttributeNodeNS, removeAttributeNodeNS, hasAttributeNS
- getElementsByTagName, getElementsByTagNameNS

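Several of these Element methods can be tried directly with `xml.dom.minidom`, the DOM implementation in Python's standard library. This is a minimal sketch; the inline document is a trimmed-down stand-in for a wedding planner record, chosen only for illustration.

```python
from xml.dom.minidom import parseString

# Parse a small document into a DOM tree.
doc = parseString(
    '<WeddingPlanner Name="J-Lo" id="Planner_1">'
    "<Address><City>Raleigh</City><State>NC</State></Address>"
    "</WeddingPlanner>"
)
planner = doc.documentElement

# Attribute access on an Element node.
name = planner.getAttribute("Name")
had_id = planner.hasAttribute("id")
planner.setAttribute("Name", "Jane")
planner.removeAttribute("id")

# Descendant lookup by tag name; the text lives in a child Text node.
cities = doc.getElementsByTagName("City")
city_text = cities[0].firstChild.data

print(name, had_id, planner.getAttribute("Name"), city_text)
```
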
Parsing XML (SAX)
- SAX parsers generate parsing events that are processed by handlers in an application program. Parsers allow users to plug in custom implementations of the SAX interfaces. The SAX 2.0 interfaces are:
  - Attributes
  - ContentHandler
  - DTDHandler
  - EntityResolver
  - ErrorHandler
  - Locator
  - XMLFilter
  - XMLReader

SAX ContentHandler interface
- characters
- endDocument
- endElement
- endPrefixMapping
- ignorableWhitespace
- processingInstruction
- setDocumentLocator
- skippedEntity
- startDocument
- startElement
- startPrefixMapping

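The event-driven style is easiest to see by logging the callbacks as they arrive. The sketch below uses Python's standard `xml.sax` module, whose `ContentHandler` mirrors the method names listed above; `EventLogger` is a hypothetical handler written for this example. Text is buffered because a SAX parser may deliver one run of characters in several `characters` calls.

```python
import xml.sax

class EventLogger(xml.sax.ContentHandler):
    """Record the ContentHandler events raised while parsing."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._text = []  # character chunks seen since the last tag

    def startDocument(self):
        self.events.append("startDocument")

    def startElement(self, name, attrs):
        self._text = []
        self.events.append("startElement " + name)

    def characters(self, content):
        self._text.append(content)

    def endElement(self, name):
        text = "".join(self._text).strip()
        self._text = []
        suffix = " text=" + text if text else ""
        self.events.append("endElement " + name + suffix)

    def endDocument(self):
        self.events.append("endDocument")

handler = EventLogger()
xml.sax.parseString(b"<Address><City>Raleigh</City></Address>", handler)
for event in handler.events:
    print(event)
```

Unlike DOM, nothing is retained after each callback returns, which is why SAX suits large documents that do not need to be held in memory.
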
VAST XML parser
- XML 1.0 specification
  - http://www.w3.org/TR/1998/REC-xml-19980210
- DOM level-2 core interfaces
  - http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113/
- SAX 2.0
  - http://www.saxproject.org/

Wedding planner DTD

    <!-- 3/10/2001
         WildAndWackyWeddings.com retains information and performs billing
         for wedding planners. All wedding planners must provide records in
         the format specified by this DTD. -->
    <!ELEMENT WeddingPlanner (Address, PhoneNumber, Weddings)>
    <!ATTLIST WeddingPlanner
        Name NMTOKEN #REQUIRED
        id ID #REQUIRED>
    <!ELEMENT WeddingPlanners (WeddingPlanner*) >
    <!ENTITY % AddressMembers 'Street, City, State, Zip' >
    <!ELEMENT Address (%AddressMembers;)>
    <!ELEMENT PhoneNumber (#PCDATA) >
    <!ELEMENT Street (#PCDATA) >
    <!ELEMENT City (#PCDATA) >
    <!ELEMENT State (#PCDATA) >
    <!ELEMENT Zip (#PCDATA) >
    <!ELEMENT Weddings (Wedding)* >
    <!ELEMENT Wedding (Bride, Groom, Date, Time, CeremonyLocation,
        ReceptionLocation, Caterer, NumberOfGuests, TotalFee, BillingAddress)>
    <!ATTLIST Wedding id ID #REQUIRED >
    <!ELEMENT Bride (#PCDATA) >
    <!ELEMENT Groom (#PCDATA) >
    <!ELEMENT BillingAddress (%AddressMembers;) >
    <!ELEMENT CeremonyLocation (FacilityName, Address) >
    <!ELEMENT ReceptionLocation (FacilityName, Address) >
    <!ELEMENT Date (#PCDATA) >
    <!ELEMENT Time (#PCDATA) >
    <!ELEMENT Caterer (#PCDATA) >
    <!ELEMENT NumberOfGuests (#PCDATA) >
    <!ELEMENT TotalFee (#PCDATA) >
    <!ELEMENT FacilityName (#PCDATA) >

Valid XML document

    <?xml version="1.0"?>
    <!DOCTYPE WeddingPlanner SYSTEM "wedding.dtd" >
    <WeddingPlanner Name="J-Lo" id="Planner_1" >
      <Address>
        <Street>29 Oak St.</Street>
        <City>Raleigh</City>
        <State>NC</State>
        <Zip>99999</Zip>
      </Address>
      <PhoneNumber>555-4343</PhoneNumber>
      <Weddings>
        <Wedding id="GhezzoG">
          <!-- Detailed wedding information removed... -->
        </Wedding>
      </Weddings>
    </WeddingPlanner>

VAST XML parser example

    "Validating parser used to read well-formed XML and verify that
     the contents conform to the DTD referenced in the XML."
    | domDocument domElement |
    domDocument := AbtXmlDOMParser newValidatingParser
        parseURI: 'd:\ncsu_2003\wedding1.xml'.
    domElement := domDocument getElementById: 'Planner_1'.

    "Non-validating parser used to read well-formed XML data."
    | domDocument domElements |
    domDocument := AbtXmlDOMParser newNonValidatingParser
        parseURI: 'd:\ncsu_2003\wedding1.xml'.
    domElements := domDocument getElementsByTagName: 'Address'.

XML namespaces

An XML namespace is a collection of names, identified by a URI reference, that are used in XML documents as element types and attribute names. In an XML instance document, items are namespace-qualified using a namespace prefix. The reserved attribute prefix xmlns associates an arbitrary namespace prefix with the actual namespace URI. Items in the document instance are then prefixed to identify the namespace containing their definition.

XML namespace example

    <?xml version="1.0"?>
    <vastwsd:deployment targetNamespace="urn:SstWSInsurancePolicyInterface"
        xmlns:vastwsd="urn:VASTWebServiceDeployment600"
        xmlns:vast="Smalltalk"
        xmlns:swsipi="urn:Test">
      <services>
        <service name="SstWSInsurancePolicyInterface" namespace="urn:Test">
          <serviceInterfaceClass>SstWSService</serviceInterfaceClass>
          <provider type="swsipi:TestProvider">
            <vast:provider className="TestClass" creationMethod="new"/>
          </provider>
        </service>
      </services>
    </vastwsd:deployment>

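A namespace-aware parser resolves each prefix to its URI, so an application matches on the URI rather than on whatever prefix the author chose. The sketch below assumes Python's standard `xml.etree.ElementTree`, which writes qualified names as `{uri}localname`; the document is a shortened version of the deployment example, and the prefix `v` in the lookup is deliberately different from the one used in the document.

```python
import xml.etree.ElementTree as ET

DOC = """<vastwsd:deployment
    xmlns:vastwsd="urn:VASTWebServiceDeployment600"
    xmlns:vast="Smalltalk">
  <services>
    <vast:provider className="TestClass" creationMethod="new"/>
  </services>
</vastwsd:deployment>"""

root = ET.fromstring(DOC)

# The prefix is replaced by the namespace URI in the parsed tag name.
root_tag = root.tag

# Lookups supply their own prefix-to-URI map; only the URI has to match.
provider = root.find("services/v:provider", {"v": "Smalltalk"})

print(root_tag)
print(provider.get("className"))
```
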
XML schema
- XML schema improves on the XML DTD
  1. XML schemas are coded in XML
  2. XML schemas include type information, allowing object models to be represented
- The W3C XML Schema Primer is a good resource for basic information about XML schema (http://www.w3.org/TR/xmlschema-0)

<schema>
- The top-level element of an XML schema document. The schema element typically includes the namespace associations required by the schema.
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        xmlns:tns="http://schemas.xmlsoap.org/soap/envelope/"
        targetNamespace="http://schemas.xmlsoap.org/soap/envelope/" >

<element>
- An "instance" of a schema type. An element in XML schema is much like a variable declaration in typed languages like Java.
    <xsd:element name="field1" type="xsd:QName" />
    <xsd:element name="field2" type="xsd:string" minOccurs="0" />
    <xsd:element name="field3" type="xsd:string" maxOccurs="unbounded" />
    <xsd:element name="field4" type="xsd:QName" nillable="true" />
    <xsd:element name="field5" type="xsd:string" default="TestValue" />
    <!-- A reference takes its name from the referenced declaration -->
    <xsd:element ref="tns:field1" />
    <xsd:element name="field7" type="xsd:string" form="qualified" />

<attribute>
- Used to represent simple values associated with an XML element. Items represented as attributes cannot contain other attributes or elements.
    <xsd:attribute name="field1" type="xsd:QName" />
    <xsd:attribute name="field2" type="xsd:string" use="optional" />

    <myElement field1="tns:FooBar" field2="testString" />

<simpleType>
- Used to describe the content of XML elements that contain simple data, but no subelements or attributes. Below is a list of simpleTypes that are defined in the base XML schema (http://www.w3.org/2001/XMLSchema). Some of the base types are derived from other types.

    string, normalizedString, token, byte, unsignedByte, base64Binary, hexBinary, integer, positiveInteger, negativeInteger, nonNegativeInteger, nonPositiveInteger, int, unsignedInt, long, unsignedLong, short, unsignedShort, decimal, float, double, boolean, dateTime, duration, date, gMonth, gYear, gDay, gMonthDay, Name, QName, NCName, anyURI, language, IDREF, IDREFS, ENTITY, ENTITIES, NOTATION, NMTOKENS

<restriction>
- Used to define a new schema type by supplying constraints (restrictions) for an existing schema type.
    <xsd:simpleType name="customInteger" >
      <xsd:restriction base="xsd:integer" >
        <xsd:minInclusive value="100" />
        <xsd:maxInclusive value="1000" />
      </xsd:restriction>
    </xsd:simpleType>

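What the two facets mean for an instance value can be spelled out in a few lines. The following is a hand-written illustration in Python, not a schema validator: `is_custom_integer` is a hypothetical helper that mimics the customInteger type by checking the integer lexical form (after trimming surrounding whitespace) and then the inclusive bounds.

```python
import re

def is_custom_integer(lexical):
    """Mimic customInteger: an xsd:integer restricted to 100..1000 inclusive."""
    value = lexical.strip(" \t\r\n")  # integers ignore surrounding whitespace
    if re.fullmatch(r"[+-]?[0-9]+", value) is None:
        return False  # not an integer at all, so the base type rejects it
    return 100 <= int(value) <= 1000  # minInclusive / maxInclusive facets

for candidate in ("100", "1000", "99", "1001", "12.5"):
    print(candidate, "->", is_custom_integer(candidate))
```
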
<extension>
- Used to derive a new schema type by extending the properties of an existing type (analogous to subclassing).

<extension> example

    <complexType name="Address">
      <sequence>
        <element name="name" type="string"/>
        <element name="street" type="string"/>
        <element name="city" type="string"/>
      </sequence>
    </complexType>

    <complexType name="USAddress">
      <complexContent>
        <extension base="ipo:Address">
          <sequence>
            <element name="state" type="ipo:USState"/>
            <element name="zip" type="positiveInteger"/>
          </sequence>
        </extension>
      </complexContent>
    </complexType>

<enumeration>
- Used to provide a list of valid values for a restricted type.
    <xsd:simpleType name="USState">
      <xsd:restriction base="xsd:string">
        <xsd:enumeration value="AK"/>
        <xsd:enumeration value="AL"/>
        <xsd:enumeration value="AR"/>
        <!-- and so on... -->
      </xsd:restriction>
    </xsd:simpleType>

<complexType>
- Used to describe XML elements that can contain attributes and subelements.
    <xsd:complexType name="deploymentType">
      <xsd:sequence>
        <xsd:element minOccurs="0" name="container" type="tns:containerType" />
        <xsd:element minOccurs="0" name="services" type="tns:servicesType" />
      </xsd:sequence>
      <xsd:attribute name="targetNamespace" type="xsd:anyURI" />
    </xsd:complexType>

<sequence>
- Used to specify a group of elements that must appear in an instance document in the same order that they are defined in the schema type definition.
    <xsd:sequence>
      <xsd:element name="field1" type="xsd:string" />
      <xsd:element name="field2" type="xsd:string" />
    </xsd:sequence>

<choice>
- Used to specify that one element or element group out of potentially many will be included in a document instance.
    <xsd:choice>
      <xsd:element name="field1" type="xsd:string" />
      <xsd:element name="field2" type="xsd:string" />
    </xsd:choice>

<all>
- Used to specify a group of elements that may each appear once or not at all in an instance document, in any order.
    <xsd:all>
      <xsd:element name="field1" type="xsd:string" />
      <xsd:element name="field2" type="xsd:string" />
    </xsd:all>

<import>
- Used to specify a namespace that is referenced by one or more declarations in the schema being defined. An import may specify the schemaLocation from which the namespace definitions can be retrieved.
    <xsd:import namespace="urn:MyOtherNamespace"
        schemaLocation="http://www.myserver.com/otherns.xsd" />

<include>
- Used to pull in definitions from an external resource. The definitions must be in the same namespace as the schema where the <include> is specified.
    <xsd:include schemaLocation="moredefinitions.xsd" />

Other schema tags
- annotation, appinfo, attributeGroup, complexContent, documentation, field, group, keyref, length, list, maxInclusive, maxLength, minInclusive, minLength, pattern, redefine, selector, simpleContent, union, unique

Special attributes (xsi:nil)
- Used in an XML instance document to indicate that the value of an element is nil.

Given the following schema definition:
    <element name="myValue" type="xsd:string" nillable="true" />

    <!-- The presumed value of the 'myValue' element below is the empty string -->
    <myValue></myValue>

    <!-- The presumed value of the 'myValue' element below is nil -->
    <myValue xsi:nil="true"></myValue>

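An application reading such a document has to look at the attribute to tell nil apart from the empty string, since both elements have no content. A minimal sketch with Python's standard `xml.etree.ElementTree`, assuming the conventional `xsi` binding to the XML Schema instance namespace; `read_value` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace conventionally bound to the xsi prefix.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

def read_value(xml_text):
    """Return None for an xsi:nil element, otherwise its text ('' if empty)."""
    elem = ET.fromstring(xml_text)
    if elem.get("{%s}nil" % XSI) in ("true", "1"):
        return None
    return elem.text or ""

empty = read_value('<myValue xmlns:xsi="%s"></myValue>' % XSI)
nil = read_value('<myValue xmlns:xsi="%s" xsi:nil="true"></myValue>' % XSI)

print(repr(empty), repr(nil))
```
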
Special attributes (xsi:type)
- Used to enable usage of a derived type where the base type is expected.
    <!-- Schema definition for billTo specifies type 'Address' -->
    <billTo xsi:type="ipo:USAddress">
      <name>Robert Smith</name>
      <street>8 Oak Avenue</street>
      <city>Old Town</city>
      <state>PA</state>
      <zip>95819</zip>
    </billTo>

Wedding planner schema

    <!-- Schema for wedding planner app. This schema is directly derived
         from the wedding.dtd file -->
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        xmlns:tns="urn:WeddingPlanner"
        targetNamespace="urn:WeddingPlanner" >
      <xsd:complexType name="Wedding">
        <xsd:sequence>
          <xsd:element name="Bride" type="xsd:string" />
          <xsd:element name="Groom" type="xsd:string" />
          <xsd:element name="Date" type="xsd:string" />
          <xsd:element name="Time" type="xsd:string" />
          <xsd:element name="CeremonyLocation" type="tns:Location" />
          <xsd:element name="ReceptionLocation" type="tns:Location" />
          <xsd:element name="Caterer" type="xsd:string" />
          <xsd:element name="NumberOfGuests" type="xsd:int" />
          <xsd:element name="TotalFee" type="xsd:decimal" />
          <xsd:element name="BillingAddress" type="tns:Address"/>
        </xsd:sequence>
        <xsd:attribute name="Name" type="xsd:string" />
      </xsd:complexType>
      <!-- Remainder of schema not included in order to save space -->
    </xsd:schema>

XSL (Extensible Stylesheet Language)
- Enables separation of data content and format
- Enables a standardized style of presentation
- Customizable based upon individual preferences
- XSL stylesheets are declarative. Each instruction tells the processor "what" to perform, in contrast to imperative languages that tell the processor "how" to perform it.

XML Transformations
- Great for interoperability problems
- Transforms data from a source data format to a target format
- Source is XML, target is some kind of text format
- Target can be XML
- XSLT is used for transformations
- Can exploit coalescing around a standard

XML Transformations
- XML + XSLT = HTML
- XML + XSLT = XHTML
- XML + XSLT = Text
- XML + XSLT = SVG (picture)
- XML + XSLT = Whatever (non-binary)

XSL element names
- xsl:stylesheet
- xsl:template
- xsl:apply-templates
- xsl:comment
- xsl:processing-instruction
- xsl:element
- xsl:attribute
- xsl:value-of
- xsl:for-each
- xsl:if
- xsl:choose
- xsl:when
- xsl:otherwise
- xsl:copy

XSL example

    <!-- Only part of this XSL stylesheet is shown here due to space constraints -->
    <xsl:template match="football_game">
      <html>
        <head><title>Game results</title></head>
        <body bgcolor="#ffffff" text="#000000">
          <table width="100%" border="1" cellspacing="0" cellpadding="4">
            <tr>
              <th align="left">Team name</th>
              <th align="left">Team score</th>
            </tr>
            <xsl:apply-templates select="home"/>
            <xsl:apply-templates select="visitor"/>
          </table>
        </body>
      </html>
    </xsl:template>

    <xsl:template match="home">
      <tr>
        <td><b><xsl:value-of select="/football_game/home/school"/></b></td>
        <td><b><xsl:value-of select="/football_game/home/score"/></b></td>
      </tr>
    </xsl:template>

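For contrast with the declarative stylesheet, the same table rows can be produced imperatively. The sketch below is not XSLT and does not run the stylesheet (Python's standard library has no XSLT processor); it is a hand-written equivalent using `xml.etree.ElementTree`, and it assumes the visitor template, which the slide omits, mirrors the home template. `render_row` and `render_game` are hypothetical names.

```python
import xml.etree.ElementTree as ET

GAME_XML = """
<football_game>
  <home><school>NCSU</school><score>17</score></home>
  <visitor><school>Clemson</school><score>15</score></visitor>
</football_game>
"""

def render_row(team):
    """Imperative counterpart of one team template: emit a table row."""
    school = team.findtext("school")
    score = team.findtext("score")
    return "<tr><td><b>%s</b></td><td><b>%s</b></td></tr>" % (school, score)

def render_game(xml_text):
    """Imperative counterpart of the football_game template (table only)."""
    game = ET.fromstring(xml_text)
    rows = [render_row(game.find(side)) for side in ("home", "visitor")]
    return "<table>" + "".join(rows) + "</table>"

html = render_game(GAME_XML)
print(html)
```

Here the code spells out the traversal order step by step, whereas the stylesheet only states which output belongs to which matched element.
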
Open source downloads
- Xalan XSLT processor (for Java or C++)
  - http://xml.apache.org/
- Xerces XML validating parser (for Java or C++)
  - http://xml.apache.org/