XML Schema Definition Language (XSD)

XML Schemas

  • “Schemas” is a general term; DTDs are a form of XML schema
      • According to the dictionary, a schema is “a structured framework or plan”
  • When we say “XML Schemas,” we usually mean the W3C XML Schema Language
      • This is also known as the “XML Schema Definition” language, or XSD
      • I’ll use “XSD” frequently, because it’s short
  • DTDs, XML Schemas, and RELAX NG are all XML schema languages

Why XML Schemas?

  • DTDs provide a very weak specification language
      • You can’t put any restrictions on text content
      • You have very little control over mixed content (text plus elements)
      • You have little control over ordering of elements
  • DTDs are written in a strange (non-XML) format
      • You need separate parsers for DTDs and XML
  • The XML Schema Definition language solves these problems
      • XSD gives you much more control over structure and content
      • XSD is written in XML

Why not XML schemas?

  • DTDs have been around longer than XSD
      • Therefore they are more widely used
      • Also, more tools support them
  • XSD is very verbose, even by XML standards
  • More advanced XML Schema instructions can be nonintuitive and confusing
  • Nevertheless, XSD is not likely to go away quickly

Referring to a schema

  • To refer to a DTD in an XML document, the reference goes before the root element:
        <?xml version="1.0"?>
        <!DOCTYPE rootElement SYSTEM "url">
        <rootElement> ... </rootElement>
  • To refer to an XML Schema in an XML document, the reference goes in the root element:
        <?xml version="1.0"?>
        <rootElement
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:noNamespaceSchemaLocation="url.xsd">
          ...
        </rootElement>
      • The XML Schema Instance reference (xmlns:xsi) is required
      • xsi:noNamespaceSchemaLocation is where your XML Schema definition can be found

The XSD document

  • Since the XSD is written in XML, it can get confusing which we are talking about
  • Except for the additions to the root element of our XML data document, the rest of this lecture is about the XSD schema document
  • The file extension is .xsd
  • The root element is <schema>
  • The XSD starts like this:
        <?xml version="1.0"?>
        <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<schema>

  • The <schema> element may have attributes:
      • xmlns:xs="http://www.w3.org/2001/XMLSchema"
          • This is necessary to specify where all our XSD tags are defined
      • elementFormDefault="qualified"
          • This means that all XML elements must be qualified (use a namespace)
          • It is highly desirable to qualify all elements, or problems will arise when another schema is added

“Simple” and “complex” elements

  • A “simple” element is one that contains text and nothing else
      • A simple element cannot have attributes
      • A simple element cannot contain other elements
      • A simple element cannot be empty
      • However, the text can be of many different types, and may have various restrictions applied to it
  • If an element isn’t simple, it’s “complex”
      • A complex element may have attributes
      • A complex element may be empty, or it may contain text, other elements, or both text and other elements

Defining a simple element

  • A simple element is defined as
        <xs:element name="name" type="type" />
    where:
      • name is the name of the element
      • the most common values for type are
            xs:boolean    xs:integer
            xs:date       xs:string
            xs:decimal    xs:time
  • Other attributes a simple element may have:
      • default="default value"    used if no other value is specified
      • fixed="value"              no other value may be specified
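
As a minimal sketch of these declarations, the schema below declares three simple elements. The element names (quantity, inStock, country) are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Plain simple element: its text must be a valid xs:integer -->
  <xs:element name="quantity" type="xs:integer"/>

  <!-- default: this value is used when the element is present but empty -->
  <xs:element name="inStock" type="xs:boolean" default="true"/>

  <!-- fixed: any content that is supplied must be exactly "US" -->
  <xs:element name="country" type="xs:string" fixed="US"/>

</xs:schema>
```

With this schema, <quantity>12</quantity> should validate, while <quantity>twelve</quantity> and <country>FR</country> should not.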

Defining an attribute

  • Attributes themselves are always declared as simple types
  • An attribute is defined as
        <xs:attribute name="name" type="type" />
    where:
      • name and type are the same as for xs:element
  • Other attributes an xs:attribute declaration may have:
      • default="default value"    used if no other value is specified
      • fixed="value"              no other value may be specified
      • use="optional"             the attribute is not required (default)
      • use="required"             the attribute must be present
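
A short sketch of attribute declarations in context. Attributes are attached to an element through that element's complex type; the names used here (note, id, lang, version) are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="note">
    <xs:complexType>
      <!-- Must be present on every note element -->
      <xs:attribute name="id" type="xs:integer" use="required"/>
      <!-- Optional; treated as "en" when omitted -->
      <xs:attribute name="lang" type="xs:string" default="en"/>
      <!-- Optional; if written, it must be exactly 1.0 -->
      <xs:attribute name="version" type="xs:decimal" fixed="1.0"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Under these assumptions, <note id="7"/> should validate, while <note/> (missing id) and <note id="7" version="2.0"/> should not.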

Restrictions, or “facets”

  • The general form for putting a restriction on a text value is:
        <xs:element name="name">        (or xs:attribute)
          <xs:simpleType>
            <xs:restriction base="type">
              ... the restrictions ...
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
  • For example:
        <xs:element name="age">
          <xs:simpleType>
            <xs:restriction base="xs:integer">
              <xs:minInclusive value="0"/>
              <xs:maxInclusive value="140"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>

Restrictions on numbers

  • minInclusive: number must be ≥ the given value
  • minExclusive: number must be > the given value
  • maxInclusive: number must be ≤ the given value
  • maxExclusive: number must be < the given value
  • totalDigits: number must have no more than value digits in total
  • fractionDigits: number must have no more than value digits after the decimal point
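
The numeric facets can be combined in a single restriction. The following is a sketch only; the element name price and the chosen limits are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="price">
    <xs:simpleType>
      <xs:restriction base="xs:decimal">
        <!-- 0 is allowed, negative values are not -->
        <xs:minInclusive value="0"/>
        <!-- 1000 itself is not allowed -->
        <xs:maxExclusive value="1000"/>
        <!-- at most 5 digits in total -->
        <xs:totalDigits value="5"/>
        <!-- at most 2 digits after the decimal point -->
        <xs:fractionDigits value="2"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

</xs:schema>
```

Here 999.99 and 0 should be accepted, while 1000, -1, and 12.345 should be rejected.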

Restrictions on strings

  • length: the string must contain exactly value characters
  • minLength: the string must contain at least value characters
  • maxLength: the string must contain no more than value characters
  • pattern: the value is a regular expression that the string must match
  • whiteSpace: not really a “restriction”; tells what to do with whitespace
      • value="preserve"    Keep all whitespace
      • value="replace"     Change all whitespace characters to spaces
      • value="collapse"    Remove leading and trailing whitespace, and replace all sequences of whitespace with a single space
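
A sketch of the string facets in use. The element names (zipCode, username) and the particular limits are hypothetical. Note that an XSD pattern must match the whole value, so no ^ or $ anchors are needed.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Exactly five decimal digits, after surrounding whitespace is collapsed -->
  <xs:element name="zipCode">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:whiteSpace value="collapse"/>
        <xs:pattern value="[0-9]{5}"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

  <!-- Between 3 and 16 characters, inclusive -->
  <xs:element name="username">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:minLength value="3"/>
        <xs:maxLength value="16"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

</xs:schema>
```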

Enumeration

  • An enumeration restricts the value to be one of a fixed set of values
  • Example:
        <xs:element name="season">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="Spring"/>
              <xs:enumeration value="Summer"/>
              <xs:enumeration value="Autumn"/>
              <xs:enumeration value="Fall"/>
              <xs:enumeration value="Winter"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>

Complex elements

  • A complex element is defined as
        <xs:element name="name">
          <xs:complexType>
            ... information about the complex type ...
          </xs:complexType>
        </xs:element>
  • Example:
        <xs:element name="person">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="firstName" type="xs:string" />
              <xs:element name="lastName" type="xs:string" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
  • <xs:sequence> says that elements must occur in this order
  • Remember that attributes are always simple types

Global and local definitions

  • Elements declared at the “top level” of a <schema> are available for use throughout the schema
  • Elements declared within an xs:complexType are local to that type
  • Thus, in
        <xs:element name="person">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="firstName" type="xs:string" />
              <xs:element name="lastName" type="xs:string" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
    the elements firstName and lastName are only locally declared
  • The order of declarations at the “top level” of a <schema> does not specify the order in the XML data document

Declaration and use

  • So far we’ve been talking about how to declare types, not how to use them
  • To use a type we have declared, use it as the value of type="..."
      • Examples:
            <xs:element name="student" type="person"/>
            <xs:element name="professor" type="person"/>
  • Scope is important: you cannot use a type if it is local to some other type
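
To be usable as the value of type="...", a type needs a name of its own and must be declared at the top level of the schema. The sketch below assumes a hypothetical named type, personType, that is shared by two elements.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- A named, global complex type: it can be used anywhere in the schema -->
  <xs:complexType name="personType">
    <xs:sequence>
      <xs:element name="firstName" type="xs:string"/>
      <xs:element name="lastName" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Two different elements that share the same structure -->
  <xs:element name="student" type="personType"/>
  <xs:element name="professor" type="personType"/>

</xs:schema>
```

A type written inline inside an xs:element (an anonymous type) has no name, so it cannot be reused this way.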

xs:sequence

  • We’ve already seen an example of a complex type whose elements must occur in a specific order:
        <xs:element name="person">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="firstName" type="xs:string" />
              <xs:element name="lastName" type="xs:string" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>

xs:all

  • xs:all allows elements to appear in any order
        <xs:element name="person">
          <xs:complexType>
            <xs:all>
              <xs:element name="firstName" type="xs:string" />
              <xs:element name="lastName" type="xs:string" />
            </xs:all>
          </xs:complexType>
        </xs:element>
  • Despite the name, the members of an xs:all group can occur once or not at all
  • You can use minOccurs="0" to specify that an element is optional (default value is 1)
      • In this context, maxOccurs is always 1
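
A sketch of an xs:all group with one optional member. The middleName element is a hypothetical addition to the person example.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="person">
    <xs:complexType>
      <xs:all>
        <xs:element name="firstName" type="xs:string"/>
        <!-- Optional: may be left out, but may not appear more than once -->
        <xs:element name="middleName" type="xs:string" minOccurs="0"/>
        <xs:element name="lastName" type="xs:string"/>
      </xs:all>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

With this schema, a person whose children are lastName followed by firstName should validate, with or without a middleName in any position.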

Referencing

  • Once you have defined an element or attribute (with name="..."), you can refer to it with ref="..."
  • Example:
        <xs:element name="person">
          <xs:complexType>
            <xs:all>
              <xs:element name="firstName" type="xs:string" />
              <xs:element name="lastName" type="xs:string" />
            </xs:all>
          </xs:complexType>
        </xs:element>

        <xs:element ref="person"/>
  • The referenced declaration must be global, and ref cannot be combined with name on the same element

Text element with attributes

  • If a text element has attributes, it is no longer a simple type
        <xs:element name="population">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:integer">
                <xs:attribute name="year" type="xs:integer"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>

Empty elements

  • Empty elements are (ridiculously) complex
        <xs:complexType name="counter">
          <xs:complexContent>
            <xs:restriction base="xs:anyType">
              <xs:attribute name="count" type="xs:integer"/>
            </xs:restriction>
          </xs:complexContent>
        </xs:complexType>

Mixed elements

  • Mixed elements may contain both text and elements
  • We add mixed="true" to the xs:complexType element
  • The text itself is not mentioned in the type definition, and may go anywhere (it is basically ignored)
        <xs:complexType name="paragraph" mixed="true">
          <xs:sequence>
            <xs:element name="someName" type="xs:anyType"/>
          </xs:sequence>
        </xs:complexType>
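
A sketch of a mixed type together with an instance it should accept. The emphasis child element and the sample sentence are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="paragraph">
    <!-- mixed="true" lets text appear before, between, and after child elements -->
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="emphasis" type="xs:string"
                    minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- An instance that should validate:
       <paragraph>Schemas can be <emphasis>verbose</emphasis>, but precise.</paragraph>
  -->

</xs:schema>
```

Only the child elements are checked against the sequence; the surrounding text is unconstrained.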

Extensions

  • You can base a complex type on another complex type
        <xs:complexType name="newType">
          <xs:complexContent>
            <xs:extension base="otherType">
              ... new stuff ...
            </xs:extension>
          </xs:complexContent>
        </xs:complexType>
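
A concrete sketch of an extension, under the assumption of a hypothetical base type personType and a derived type studentType. New elements from the extension are appended after the elements inherited from the base type.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Base type -->
  <xs:complexType name="personType">
    <xs:sequence>
      <xs:element name="firstName" type="xs:string"/>
      <xs:element name="lastName" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Derived type: everything in personType, plus studentId and a year attribute -->
  <xs:complexType name="studentType">
    <xs:complexContent>
      <xs:extension base="personType">
        <xs:sequence>
          <xs:element name="studentId" type="xs:integer"/>
        </xs:sequence>
        <xs:attribute name="year" type="xs:positiveInteger"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="student" type="studentType"/>

</xs:schema>
```

A valid student would therefore contain firstName, lastName, and studentId, in that order.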

Predefined string types

  • Recall that a simple element is defined as:
        <xs:element name="name" type="type" />
  • Here are a few of the possible string types:
      • xs:string: a string
      • xs:normalizedString: a string that doesn’t contain tabs, newlines, or carriage returns
      • xs:token: a string that doesn’t contain any whitespace other than single spaces
  • Allowable restrictions on strings:
      • enumeration, length, maxLength, minLength, pattern, whiteSpace

Predefined date and time types

  • xs:date: a date in the format CCYY-MM-DD, for example, 2002-11-05
  • xs:time: a time in the format hh:mm:ss (hours, minutes, seconds)
  • xs:dateTime: format is CCYY-MM-DDThh:mm:ss
      • The T is part of the syntax
  • Allowable restrictions on dates and times:
      • enumeration, minInclusive, minExclusive, maxInclusive, maxExclusive, pattern, whiteSpace
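
A brief sketch of the three types in declarations, with a sample value for each in a comment. The element names and the sample times are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Sample value: 2002-11-05 -->
  <xs:element name="birthday" type="xs:date"/>

  <!-- Sample value: 09:30:00 -->
  <xs:element name="startTime" type="xs:time"/>

  <!-- Sample value: 2002-11-05T09:30:00 -->
  <xs:element name="timestamp" type="xs:dateTime"/>

</xs:schema>
```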

Predefined numeric types

  • Here are some of the predefined numeric types:
        xs:decimal    xs:positiveInteger
        xs:byte       xs:negativeInteger
        xs:short      xs:nonPositiveInteger
        xs:int        xs:nonNegativeInteger
        xs:long
  • Allowable restrictions on numeric types:
      • enumeration, minInclusive, minExclusive, maxInclusive, maxExclusive, fractionDigits, totalDigits, pattern, whiteSpace

Opinion

  • If you like C++, you’ll love XSD
      • Enjoy the feeling of knowing something arcane that isn’t understood by lesser mortals
      • If you hope to rule the world, you’ll need a schema language with this much power
  • If you dislike debugging, you’ll hate XSD
      • XSD is complex and error-prone, with lots of gotchas
  • Be prepared
      • XSD is a W3C standard, which makes it important
      • If you work with XML, you may have to use XSD
      • I hope this brief introduction will make it easier to get started
  • IMHO, XSD is one of the reasons that DTDs are still so popular

The End