Introduction to XML Schema DoD Users Group

- Slides: 20

Introduction to XML Schema
DoD Users Group Tutorial on XML and Science
June 10 2002, Austin TX
Geoffrey Fox and Bryan Carpenter
PTLIU Laboratory for Community Grids
Informatics, (Computer Science, Physics) Indiana University, Bloomington IN 47404
gcf@indiana.edu
6/8/2021 xmlschemaugcjune02 1

Outline
• Our XML Schema discussion is based on the excellent W3C document http://www.w3.org/TR/xmlschema-0/
  – Note that this is reasonably complete; parts 1 and 2 (http://www.w3.org/TR/xmlschema-1/ etc.) are more formal discussions, not extensions in coverage
  – See http://www.w3schools.com/schema/ (elementary) and note the very complete tutorial and examples at http://www.xfront.com/
• We only summarize the following topics in this primer
  – Basic Schema
  – XML Types
  – Namespaces
• The important topics below are NOT discussed
  – Groups ** REMOVED
  – Include ** REMOVED
  – Derived types ** REMOVED
  – Import ** REMOVED
• Remember: Schema and DTD are like classes in Java
  – XML files are instances of objects defined in Schemas or DTDs

XML Schema PO Example po.xml I
    <?xml version="1.0"?>
    <purchaseOrder orderDate="1999-10-20">
      <shipTo country="US">
        <name>Alice Smith</name>
        <street>123 Maple Street</street>
        <city>Mill Valley</city>
        <state>CA</state>
        <zip>90952</zip>
      </shipTo>
      <billTo country="US">
        <name>Robert Smith</name>
        <street>8 Oak Avenue</street>
        <city>Old Town</city>
        <state>PA</state>
        <zip>95819</zip>
      </billTo>
Note: Need to add mechanism (namelist) to associate schema with this XML instance

XML Schema PO Example po.xml II
      <comment>Hurry, my lawn is going wild!</comment>
      <items>
        <item partNum="872-AA">
          <productName>Lawnmower</productName>
          <quantity>1</quantity>
          <USPrice>148.95</USPrice>
          <comment>Confirm this is electric</comment>
        </item>
        <item partNum="926-AA">
          <productName>Baby Monitor</productName>
          <quantity>1</quantity>
          <USPrice>39.98</USPrice>
          <shipDate>1999-05-21</shipDate>
        </item>
      </items>
    </purchaseOrder>
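
The po.xml listing above carries no pointer to its schema. As a sketch of one possible mechanism, assuming the final 2001 XML Schema Recommendation (newer than the draft syntax used on these slides) and a schema file named po.xsd in the same directory (an assumed location), the instance can name its schema with xsi:noNamespaceSchemaLocation:

```xml
<?xml version="1.0"?>
<!-- Assumption: po.xsd sits next to this file and declares no targetNamespace -->
<purchaseOrder orderDate="1999-10-20"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:noNamespaceSchemaLocation="po.xsd">
  <!-- shipTo, billTo, comment and items go here, as in the listing above -->
</purchaseOrder>
```

The attribute is only a hint to the processor; a validator may also be handed the schema by other means.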

The Purchase Order Schema, po.xsd I
    <xsd:schema xmlns:xsd="http://www.w3.org/2000/08/XMLSchema">
      <xsd:annotation>
        <xsd:documentation>
          Purchase order schema for Example.com.
          Copyright 2000 Example.com. All rights reserved.
        </xsd:documentation>
      </xsd:annotation>
      <xsd:element name="purchaseOrder" type="PurchaseOrderType"/>
      <xsd:element name="comment" type="xsd:string"/>
      <xsd:complexType name="PurchaseOrderType">
        <xsd:sequence>
          <xsd:element name="shipTo" type="USAddress"/>
          <xsd:element name="billTo" type="USAddress"/>
          <xsd:element ref="comment" minOccurs="0"/>
          <xsd:element name="items" type="Items"/>
        </xsd:sequence>
        <xsd:attribute name="orderDate" type="xsd:date"/>
      </xsd:complexType>

The Purchase Order Schema, po.xsd II
      <xsd:complexType name="USAddress">
        <xsd:sequence>
          <xsd:element name="name" type="xsd:string"/>
          <xsd:element name="street" type="xsd:string"/>
          <xsd:element name="city" type="xsd:string"/>
          <xsd:element name="state" type="xsd:string"/>
          <xsd:element name="zip" type="xsd:decimal"/>
        </xsd:sequence>
        <xsd:attribute name="country" type="xsd:NMTOKEN"
                       use="fixed" value="US"/>
      </xsd:complexType>

The Purchase Order Schema, po.xsd III
      <xsd:complexType name="Items">
        <xsd:sequence>
          <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
            <xsd:complexType>
              <xsd:sequence>
                <xsd:element name="productName" type="xsd:string"/>
                <xsd:element name="quantity">
                  <xsd:simpleType>
                    <xsd:restriction base="xsd:positiveInteger">
                      <xsd:maxExclusive value="100"/>
                    </xsd:restriction>
                  </xsd:simpleType>
                </xsd:element>
                <xsd:element name="USPrice" type="xsd:decimal"/>
                <xsd:element ref="comment" minOccurs="0"/>
                <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
              </xsd:sequence>
              <xsd:attribute name="partNum" type="SKU"/>
            </xsd:complexType>
Slide callouts: "Anonymous type quantity" marks the xsd:simpleType block inside quantity; "Specify item" marks the item element declaration.

The Purchase Order Schema, po.xsd IV
          </xsd:element> <!-- End item specification -->
        </xsd:sequence> <!-- End sequence for items specification -->
      </xsd:complexType> <!-- End items specification -->
      <!-- Stock Keeping Unit, a code for identifying products -->
      <xsd:simpleType name="SKU">
        <xsd:restriction base="xsd:string">
          <xsd:pattern value="\d{3}-[A-Z]{2}"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:schema>

Comments on po.xsd I
• The prefix xsd: denotes the Namespace specified by xmlns:xsd="http://www.w3.org/2000/08/XMLSchema"
  – The label xsd is conventional; you could use a different one
• Schema does two types of things:
  – Defines new types of elements or attributes, e.g. PurchaseOrderType, with xsd:complexType or xsd:simpleType
  – Defines internal elements and attributes for tags, e.g. purchaseOrder, using xsd:element and xsd:attribute
• So an element like shipTo in po.xml has at most one attribute, country, which if it appears must have value US. The elements name, street, city, state, zip must appear in this order; the first four are strings and zip is an xsd:decimal

xsd:complexType Example
• Anything of PurchaseOrderType is allowed one attribute (orderDate) of type xsd:date.
  – Further, it must consist of 4 elements shipTo, billTo, comment and items, in that order
  – comment can be absent as minOccurs = 0
• Note that comment is a global element defined in the schema. It is accessed by ref= rather than type=
• One can define global elements or attributes, which cannot themselves use ref (global elements are particularly important if one imports this schema into another schema, as only global components can be re-used)
• minOccurs and maxOccurs have default values of 1, so shipTo, billTo and items must each appear exactly once
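
As a minimal sketch of the global-element point above: the fragment below declares comment once at the top level and then pulls it into a content model with ref=. It assumes the final 2001 Recommendation namespace rather than the draft namespace used on the slides; the type name NoteHolder and the element title are hypothetical, chosen only for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Global element: a direct child of xsd:schema, so it can be referenced -->
  <xsd:element name="comment" type="xsd:string"/>

  <!-- NoteHolder and title are hypothetical names -->
  <xsd:complexType name="NoteHolder">
    <xsd:sequence>
      <!-- Local element: minOccurs and maxOccurs default to 1, so exactly once -->
      <xsd:element name="title" type="xsd:string"/>
      <!-- Reference to the global element; minOccurs="0" makes it optional -->
      <xsd:element ref="comment" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```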

Specifying xsd:element and xsd:attribute
• Any xsd:element tag can have attributes minOccurs, maxOccurs, fixed and default
• Any xsd:attribute can have attributes use and value
• Note attributes can NEVER appear more than once
[Table of allowed combinations not reproduced; in it, "-" means not present]
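
A sketch of these occurrence constraints in use. It assumes the final 2001 Recommendation, where attribute declarations take use (required, optional or prohibited) plus separate default and fixed attributes, instead of the draft use/value pair shown on the slides. The type name LineItem and the elements note and currency are hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- LineItem, note and currency are hypothetical names -->
  <xsd:complexType name="LineItem">
    <xsd:sequence>
      <!-- Optional element; an empty quantity element takes the default value 1 -->
      <xsd:element name="quantity" type="xsd:positiveInteger"
                   default="1" minOccurs="0"/>
      <!-- May appear any number of times, including zero -->
      <xsd:element name="note" type="xsd:string"
                   minOccurs="0" maxOccurs="unbounded"/>
      <!-- Must appear exactly once and must contain USD -->
      <xsd:element name="currency" type="xsd:string" fixed="USD"/>
    </xsd:sequence>
    <!-- An attribute is present at most once: here it is mandatory -->
    <xsd:attribute name="partNum" type="xsd:string" use="required"/>
    <!-- Optional attribute that, if present, must have the value US -->
    <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
  </xsd:complexType>
</xsd:schema>
```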

Simple Types
• Attributes must be Simple Types; Elements can be complex types or simple types
• Simple types cannot themselves have attributes or contain other elements
• The xsd:restriction tag allows you to build new simple types by adding constraints to existing simple types. These constraints are specified by a set of "constraining facets"
  – xsd:minInclusive and xsd:maxInclusive restrict 10000 <= myInteger <= 99999
  – xsd:pattern restricts using a regular expression, e.g. to three digits, a hyphen and two capital letters (the SKU type)
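
The two restrictions described above can be sketched as follows. The type names myInteger and SKU come from the slides; the 2001 Recommendation namespace is an assumption, used here in place of the draft namespace.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Integers from 10000 to 99999 inclusive -->
  <xsd:simpleType name="myInteger">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="10000"/>
      <xsd:maxInclusive value="99999"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Three digits, a hyphen, two capital letters, e.g. 872-AA -->
  <xsd:simpleType name="SKU">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{3}-[A-Z]{2}"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```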

Constraining Facets
• These are defined in http://www.w3.org/TR/2000/CR-xmlschema-2-20001024/ and are
  – length, minLength, maxLength, pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive, precision, scale, encoding, duration, period
• Tables tell you which constraining facets can be used with which simple type
• enumeration can be used with all simple types except boolean. For example, you could define a USState simple type which could take any of the conventional abbreviations of the US States
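
A minimal sketch of such a USState type, assuming the 2001 Recommendation namespace. Only three abbreviations are listed; the rest would follow the same way.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="USState">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="AK"/>
      <xsd:enumeration value="AL"/>
      <xsd:enumeration value="AR"/>
      <!-- remaining state abbreviations would be listed the same way -->
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

An element or attribute declared with type="USState" then accepts only the enumerated values.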

Simple Types Built In to XML Schema I
Simple Type           Examples (delimited by commas; notes in parentheses)
string                Confirm this is electric
CDATA                 Confirm this is electric (white space (tabs) to blanks etc.)
token                 Confirm this is electric (trailing/leading white space removed)
byte                  -1, 126
unsignedByte          0, 126
binary                62696E617279
integer               -126789, -1, 0, 1, 126789
positiveInteger       1, 126789
negativeInteger       -126789, -1
nonNegativeInteger    0, 1, 126789
nonPositiveInteger    -126789, -1, 0
int                   -1, 126789675
unsignedInt           0, 1267896754
long                  -1, 12678967543233
unsignedLong          0, 12678967543233
short                 -1, 12678
unsignedShort         0, 12678
Note: Lots of Types but no complex or Array. Extensions of XML such as SOAP (W3C) or XSIL (HPCC) define additional types

Simple Types Built In to XML Schema II
Simple Type           Examples (delimited by commas; notes in [])
decimal               -1.23, 0, 123.4, 1000.00
float                 -INF, -1E4, -0, 0, 12.78E-2, 12, INF, NaN
double                -INF, -1E4, -0, 0, 12.78E-2, 12, INF, NaN
boolean               true, false
time                  13:20:00.000, 13:20:00.000-05:00
timeInstant           1999-05-31T13:20:00.000-05:00 [May 31st 1999 at 1.20pm Eastern Standard Time which is 5 hours behind Coordinated Universal Time]
timePeriod            1999-05-31T13:20
timeDuration          P1Y2M3DT10H30M12.3S [1 year, 2 months, 3 days, 10 hours, 30 minutes, 12.3 seconds]
date                  1999-05-31
month                 1999-05 [May 1999]
year                  1999 [1999]
century               19 [the 1900's]
recurringDay          ----31 [every 31st day]
recurringDate         --05-31 [every May 31st]
recurringDuration     --05-31T13:20:00 [May 31st every year at 1.20pm Coordinated Universal Time, format similar to timeInstant]

Simple Types Built In to XML Schema III
Simple Type           Examples (delimited by commas; notes in [])
Name                  shipTo [XML 1.0 Name type]
QName                 po:USAddress [XML Namespace QName]
NCName                USAddress [XML Namespace NCName, i.e. a QName without the prefix and colon]
uriReference          http://www.example.com/, http://www.example.com/doc.html#ID5
language              en-GB, en-US, fr [valid values for xml:lang as defined in XML 1.0]
ID                    [XML 1.0 ID attribute type]
IDREF                 [XML 1.0 IDREF attribute type]
IDREFS                [XML 1.0 IDREFS attribute type]
ENTITY                [XML 1.0 ENTITY attribute type]
ENTITIES              [XML 1.0 ENTITIES attribute type]
NOTATION              [XML 1.0 NOTATION attribute type]
NMTOKEN               US, Brésil [XML 1.0 NMTOKEN attribute type]
NMTOKENS              US UK, Brésil Canada Mexique [XML 1.0 NMTOKENS attribute type, i.e. a whitespace separated list of NMTOKEN's]

Schema and Namespaces
• Schemas do a better job than DTDs in making it clear how Namespaces can be used effectively
• Here we define http://www.example.com/PO1 as the targetNamespace for this schema
• We also define po: as the same URL and the default Namespace to be W3C Schema central
• Then we do NOT need xsd: as in the original po.xsd, as we have set Schema central to be the default Namespace
• In name="..", we do NOT need to specify a Namespace, but in type=".." or ref=".." we do, as these could reference another Namespace
Note: No Namespace prefix is needed on type="string", as the string type is defined in Schema central
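
The namespace arrangement described above can be sketched as the trimmed schema below. It assumes the 2001 Recommendation namespace as the default Namespace (the slides use a draft namespace) and keeps only enough of the purchase order types to be complete on its own.

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:po="http://www.example.com/PO1"
        targetNamespace="http://www.example.com/PO1">

  <!-- name="..." needs no prefix: it declares a name in the targetNamespace -->
  <element name="purchaseOrder" type="po:PurchaseOrderType"/>
  <!-- string needs no prefix: it lives in the default (schema) Namespace -->
  <element name="comment" type="string"/>

  <complexType name="PurchaseOrderType">
    <sequence>
      <!-- type= and ref= carry po: because they point into the targetNamespace -->
      <element name="shipTo" type="po:USAddress"/>
      <element ref="po:comment" minOccurs="0"/>
    </sequence>
    <attribute name="orderDate" type="date"/>
  </complexType>

  <!-- Trimmed version of USAddress, for illustration only -->
  <complexType name="USAddress">
    <sequence>
      <element name="name" type="string"/>
      <element name="zip" type="decimal"/>
    </sequence>
  </complexType>
</schema>
```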