XML for .NET, Session 1: Introduction to XML

  • Slides: 74

XML for .NET, Session 1
  • Introduction to XML
  • Introduction to XSLT
  • Programmatically Reading XML Documents
  • Introduction to XPath

XML Documents Can be Read Programmatically
  • The .NET Framework includes many classes to aid in programmatically iterating through and navigating XML documents.
  • These classes are found in the System.Xml namespace.
  • The various classes in the System.Xml namespace are highlighted in Chapter 6 of the text, XML and ASP.NET (starting on page 261).

Accessing XML Content
  • XML documents can be accessed in one of two ways: in a push model or a pull model.
  • The pull model loads the entire XML document into memory, and then works with the document once it has been completely loaded.
  • The push model accesses only small pieces of the XML document, as they are needed.

Comparing and Contrasting Push and Pull Approaches

Pull Model
  • Pluses: Quickly iterate and navigate through XML content once it’s fully loaded.
  • Minuses: Requires that the entire XML document be loaded into memory; does not scale to large XML content or a large number of users.

Push Model
  • Pluses: Allows for navigation and iteration of very large XML files.
  • Minuses: Difficult to add and update elements in the XML document.

How to use the Two Methods
  • The .NET Framework provides developers both methods:
      • Pull Method – use the DOM classes in the .NET Framework.
      • Push Method – use the XmlReader and XmlWriter classes.

Using the Pull Method
  • The System.Xml namespace contains a number of classes to work with XML documents in the DOM paradigm:
      • XmlDocument – represents an XML document.
      • XmlElement – represents an individual element in the DOM.
      • XmlAttribute – represents an attribute.
      • XmlText – represents text content.

Using the Push Method
  • The XmlReader reads one node at a time from a specified XML source.
  • The XmlReader can only read in a FORWARD direction.
  • The XmlReader class is abstract and cannot be used directly; one of its derived classes must be used instead:
      • XmlNodeReader – reads one node at a time from an XML DOM.
      • XmlTextReader – reads one node at a time from an XML source, such as a file with XML content.
      • XmlValidatingReader – a reader that performs DTD or schema validation (more on this next week!)

Iterating through an XML Document using XmlTextReader
  • To iterate through the contents of an XML document with the XmlTextReader we need to:
      1. Specify the XML document to iterate through when creating the XmlTextReader.
      2. Call the Read() method, which reads in the next node.
      3. Access the properties of the XmlTextReader to determine the name, value, and other information about the node just read.

Iterating through an XML Document using XmlTextReader
  • We can programmatically read through the contents of an XML file like so:

    // create an XmlTextReader to read the specified XML file
    XmlTextReader reader = new XmlTextReader(filepath);

    // now, display the information of each node in the TextBox
    while (reader.Read())
    {
        // access the properties of the XmlTextReader class...
        // like reader.Name, reader.NodeType, reader.Value, etc.
    }

    // close the XmlTextReader
    reader.Close();

What is a Node?
  • Recall that the XmlReader classes read XML nodes. What constitutes a node? Can you identify the nodes in the following XML fragment?

    <?xml version="1.0" encoding="utf-8" ?>
    <books>
      <book price="34.95">
        <title>Animal Farm</title>
        <authors>
          <author>Orwell</author>
        </authors>
      </book>
    </books>

What is a Node?
  • The whitespace between each element (if present) is also considered a node! (You can, however, set the XmlTextReader’s WhitespaceHandling property to specify whether the reader should read whitespace nodes.)

What is a Node?
  • Notice that the attributes of an element are not read as separate nodes.

Creating a Program to View the Content Read by an XmlTextReader
  • We can create a program that allows the user to select an XML file; the contents of the XML file are then read by an XmlTextReader, with each node’s name, type, and value displayed. (Run demo!)
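A console-only sketch of the same idea, for readers without the demo at hand. It is not the demo program itself: the class name NodeDump, the inline SampleXml string, and reading from a StringReader instead of a user-selected file are all assumptions made to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NodeDump
{
    // Hypothetical sample document modeled on the books fragment in the slides.
    // It contains no whitespace between elements, so no whitespace nodes are read.
    public const string SampleXml =
        "<?xml version='1.0'?>" +
        "<books><book price='34.95'><title>Animal Farm</title>" +
        "<authors><author>Orwell</author></authors></book></books>";

    // Reads every node and returns one "type | name | value" line per node.
    public static List<string> Describe(string xml)
    {
        var lines = new List<string>();
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                lines.Add(reader.NodeType + " | " + reader.Name + " | " + reader.Value);
            }
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in Describe(SampleXml))
            Console.WriteLine(line);
    }
}
```

Running it lists the XML declaration, each start tag, each text node, and each end tag as its own node; the price attribute does not appear as a node of its own.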

Reading the Attributes
  • As we saw in the demo, the attributes are not read as separate nodes.
  • We can determine whether or not a given node has attributes by checking the HasAttributes property.
  • In order to programmatically access the attributes of a node, we must use the MoveToNextAttribute() method of the XmlTextReader.

Reading the Attributes

    // C#
    while (reader.Read())
    {
        if (reader.HasAttributes)
        {
            while (reader.MoveToNextAttribute())
            {
                // Access the attribute name/value via
                // reader.Name/reader.Value
            }
        }
    }

    ' VB.NET
    While reader.Read()
        If reader.HasAttributes Then
            While reader.MoveToNextAttribute()
                ' Access the attribute name/value via
                ' reader.Name/reader.Value
            End While
        End If
    End While
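The attribute loop above can be fleshed out into something runnable. This is a sketch under assumptions: the AttributeDump class, the inline sample document, and the "element/@name = value" output format are illustrative choices, not part of the course material.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class AttributeDump
{
    // Hypothetical sample document; the slides read from a file path instead.
    public const string SampleXml =
        "<books><book price='34.95'><title>Animal Farm</title></book></books>";

    // Collects every attribute found on element nodes as "element/@name = value".
    public static List<string> ReadAttributes(string xml)
    {
        var found = new List<string>();
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element || !reader.HasAttributes)
                    continue;

                string element = reader.Name;
                while (reader.MoveToNextAttribute())
                    found.Add(element + "/@" + reader.Name + " = " + reader.Value);

                // Step back from the last attribute to the element that owns it.
                reader.MoveToElement();
            }
        }
        return found;
    }

    public static void Main()
    {
        foreach (string line in ReadAttributes(SampleXml))
            Console.WriteLine(line);
    }
}
```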

The XmlTextReader Properties and Methods
  • The properties and methods of the XmlTextReader are listed starting on pg. 272 of the text.
  • Some of the more germane methods include:
      • ReadInnerXml() – returns a string with the complete content of the current node (child nodes, text content, etc.), including XML markup.
      • ReadOuterXml() – returns a string containing the node’s own XML markup along with the XML markup of its content.
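To make the difference between the two methods concrete, here is a small sketch. The InnerOuterDemo class, its ReadMarkup helper, and the sample string are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class InnerOuterDemo
{
    // Hypothetical sample document.
    public const string SampleXml =
        "<books><book price='34.95'><title>Animal Farm</title></book></books>";

    // Advances to the first element named elementName, then returns either its
    // inner markup (children only) or its outer markup (the element plus children).
    public static string ReadMarkup(string xml, string elementName, bool outer)
    {
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                    return outer ? reader.ReadOuterXml() : reader.ReadInnerXml();
            }
        }
        return null; // no such element
    }

    public static void Main()
    {
        Console.WriteLine(ReadMarkup(SampleXml, "book", false)); // markup of the children of <book>
        Console.WriteLine(ReadMarkup(SampleXml, "book", true));  // <book ...> itself plus its children
    }
}
```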

The XmlTextReader Properties and Methods
  • Run the ReadInnerOutterXml-ForXmlTextReader demo…
  • When reading an XML document, the XmlTextReader class will throw an XmlException if there is an error in parsing the XML.
      • An error can occur if the XML, for example, is malformed. (That is, it is not well-formed.)
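A minimal sketch of catching that exception follows. The ParseCheck class and TryParse helper are hypothetical names; the point is only that the XmlException surfaces from Read() once the reader reaches the malformed part of the input.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ParseCheck
{
    // Reads the whole document; returns null if it is well-formed,
    // otherwise the parser's error message.
    public static string TryParse(string xml)
    {
        try
        {
            using (var reader = new XmlTextReader(new StringReader(xml)))
            {
                while (reader.Read()) { } // just consume every node
            }
            return null;
        }
        catch (XmlException ex)
        {
            return ex.Message;
        }
    }

    public static void Main()
    {
        // <book> is never closed, so this input is not well-formed.
        Console.WriteLine(TryParse("<books><book></books>") ?? "well-formed");
        Console.WriteLine(TryParse("<books><book /></books>") ?? "well-formed");
    }
}
```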

The XmlTextReader Properties and Methods
  • Run the XmlException demo.
  • We will examine the XmlNodeReader and XmlValidatingReader – the other two XmlReader classes – later in this course.

Using the DOM to Iterate through an XML Document
  • In contrast to the Push method (XmlReader/XmlWriter), the .NET Framework offers a Pull method.
  • Recall that the Pull method reads the entire XML document into memory and then works with it from there.
  • For this model, XML documents are represented in the Document Object Model (DOM).

What is the DOM?
  • DOM stands for Document Object Model, and it’s a model that can be used to describe an XML document.
  • The DOM expresses the XML document as a hierarchy of nodes, where each element can have zero or more child elements.
  • The text content and attributes of an element are expressed as its children as well.

Example XML File

    <?xml version="1.0" encoding="UTF-8" ?>
    <books>
      <book price="34.95">
        <title>TYASP 3.0</title>
        <authors>
          <author>Mitchell</author>
        </authors>
      </book>
      <book price="29.95">
        <title>ASP.NET Tips</title>
        <authors>
          <author>Mitchell</author>
          <author>Walther</author>
          <author>Seven</author>
        </authors>
      </book>
    </books>

The DOM View of the XML Document
[Figure: DOM tree diagram of the example XML file; image not preserved.]

The DOM Classes - XmlNode
  • There are a number of classes in the System.Xml namespace that represent the DOM.
  • Each “box” in the DOM model is represented in the .NET Framework by the XmlNode class. This means that elements, attributes, and text values are all represented by the XmlNode class.
  • The XmlNode class is discussed on pg. 287.

Extending the XmlNode Class
  • There are a number of classes that are derived from the XmlNode class:
      • XmlAttribute
      • XmlElement
      • XmlDocument
      • And so on…

The XmlNode Properties
  • The XmlNode class has many properties, the most germane ones being:
      • Name – the name of the node. For elements and attributes, the name is the name of the element or attribute. For text content, the name is #text.
      • Value – the value of the node. For elements, there is no value. For attributes, it’s the value of the attribute; for text nodes, it’s the text in the node.
      • NodeType – indicates the type of the node (element, text, attribute, etc.)

More XmlNode Properties
  • InnerXml – the XML markup of the node’s children, as a string.
  • OuterXml – the XML markup of the node itself and its children, as a string.
  • InnerText – the concatenated text values of the node and all of its child nodes, as a string.
  • HasChildNodes – a Boolean, indicating if the node has any children.
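The properties listed above can be inspected on a single node with a short sketch like the following. The NodeProperties class and the inline sample are assumptions; LoadXml(string) is used here only so the example needs no file on disk.

```csharp
using System;
using System.Xml;

public static class NodeProperties
{
    // Hypothetical sample document modeled on the books example.
    public const string SampleXml =
        "<books><book price='34.95'><title>Animal Farm</title>" +
        "<authors><author>Orwell</author></authors></book></books>";

    // Parses the string into a DOM and returns the first child of the root element.
    public static XmlNode FirstBook(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml); // parses XML from a string rather than a file path
        return doc.DocumentElement.FirstChild;
    }

    public static void Main()
    {
        XmlNode book = FirstBook(SampleXml);
        Console.WriteLine("Name:          " + book.Name);
        Console.WriteLine("NodeType:      " + book.NodeType);
        Console.WriteLine("Value is null: " + (book.Value == null)); // elements have no value
        Console.WriteLine("InnerText:     " + book.InnerText);
        Console.WriteLine("InnerXml:      " + book.InnerXml);
        Console.WriteLine("OuterXml:      " + book.OuterXml);
        Console.WriteLine("HasChildNodes: " + book.HasChildNodes);
    }
}
```

Note that InnerText concatenates the text of all descendants with no separator, which is why it is rarely useful on anything other than a leaf element.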

The XmlNodeList Class
  • The XmlNodeList class represents an arbitrary collection of XmlNodes.
  • For example, the XmlNode class has a ChildNodes property, which returns an XmlNodeList instance. This instance is a collection of nodes representing the DOM element’s children.

Loading an XML Document into a DOM Representation
  • The XmlDocument’s Load() method has four variations:
      1. Load(Stream)
      2. Load(string)
      3. Load(TextReader)
      4. Load(XmlReader)
  • In the Load(string) variation, the input string is a file path (or URL) to the XML file to load into the DOM representation.

The XmlDocument Properties
  • The XmlDocument class is derived from the XmlNode class, meaning it has all of the properties and methods available to the XmlNode class.
  • Once an XML file has been loaded into an XmlDocument instance, we can access the root element through the DocumentElement property.

The XmlElement and XmlAttribute Classes
  • The XmlElement and XmlAttribute classes are also derived from the XmlNode class.
  • They represent, respectively, an element and an attribute.

Example
  • The following loads an XML document and displays the name of the root element.

    Dim xmlDoc As New XmlDocument()
    xmlDoc.Load(filepath)

    Dim rootElementName As String
    rootElementName = xmlDoc.DocumentElement.Name

Example
  • Iterating through the root element’s children:

    Dim xmlDoc As New XmlDocument()
    xmlDoc.Load(filepath)

    Dim n As XmlNode
    For Each n In xmlDoc.DocumentElement.ChildNodes
        ' Display the name of the node using n.Name
    Next

An Example of Iterating through an XML Document
  • Let’s create an application that displays an XML document in a TreeView control.
  • Each node in the TreeView represents a node in the DOM.

An Example of Iterating through an XML Document
  • We can recursively iterate through the DOM, ensuring that we’ll visit each node. (Explain recursion?)
  • Examine application code...
  • Questions on the program?
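The application code is not reproduced in these slides, so the following is only a sketch of the recursive idea, with indentation in a list of strings standing in for TreeView nesting. The DomWalker class, the Walk method, and the sample document are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class DomWalker
{
    // Hypothetical sample document.
    public const string SampleXml =
        "<books><book price='34.95'><title>Animal Farm</title></book></books>";

    // Records the current node, then recurses into each child one level deeper.
    // Attributes are not part of ChildNodes, so they are not visited here.
    public static void Walk(XmlNode node, int depth, List<string> output)
    {
        output.Add(new string(' ', depth * 2) + node.NodeType + ": " + node.Name);
        foreach (XmlNode child in node.ChildNodes)
            Walk(child, depth + 1, output);
    }

    public static List<string> Outline(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var output = new List<string>();
        Walk(doc.DocumentElement, 0, output);
        return output;
    }

    public static void Main()
    {
        foreach (string line in Outline(SampleXml))
            Console.WriteLine(line);
    }
}
```

The recursion ends naturally: a node with no children has an empty ChildNodes collection, so the loop body never runs for it.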

Navigating through an XML Document
  • So far, all we have seen is how to iterate through an XML document, one node at a time.
  • With the pull method (DOM), however, we can navigate through the document as well.
  • For example, we might want to access just the elements in the document that have a certain name. (Such as elements with the name <author>.)

Accessing Elements with a Certain Name
  • The XmlDocument class contains a GetElementsByTagName() method, which returns an XmlNodeList containing the elements that have the specified tag name.

    Dim xmlDoc As New XmlDocument()
    xmlDoc.Load(filepath)

    Dim n As XmlNode
    For Each n In xmlDoc.GetElementsByTagName("author")
        ' Display n.Value
    Next

  • What would be the output of the above code?
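One way to check an answer to that question is to run a C# equivalent against the example books document. This sketch assumes the hypothetical class AuthorLookup and an inline copy of the example file; it records both Value and InnerText for each <author> element, since the two behave differently on element nodes.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class AuthorLookup
{
    // Inline copy of the example books file from the slides (an assumption;
    // the slides load it from a file path).
    public const string BooksXml =
        "<books>" +
        "<book price='34.95'><title>TYASP 3.0</title>" +
        "<authors><author>Mitchell</author></authors></book>" +
        "<book price='29.95'><title>ASP.NET Tips</title>" +
        "<authors><author>Mitchell</author><author>Walther</author>" +
        "<author>Seven</author></authors></book>" +
        "</books>";

    // Returns "Value=<...>, InnerText=<...>" for every <author> element.
    public static List<string> Authors(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var result = new List<string>();
        foreach (XmlNode n in doc.GetElementsByTagName("author"))
        {
            string value = n.Value ?? "(null)"; // an element node has no value
            result.Add("Value=" + value + ", InnerText=" + n.InnerText);
        }
        return result;
    }

    public static void Main()
    {
        foreach (string line in Authors(BooksXml))
            Console.WriteLine(line);
    }
}
```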

Navigating through an XML Document
  • However, what if we want to access nodes based on more complex criteria, such as: “Access all <book> elements with a price attribute value less than 30,” or, “Access the names of the authors who have written more than one book.”
  • To accomplish this we need something more powerful – enter XPath!

A Quick Examination of XPath
  • XPath is used to define particular sections of an XML document.
  • XPath is named XPath because its syntax is similar to the syntax for a file path. For example, in our books XML document, we could use the following XPath statement to access all of the author elements:
        /books/book/authors/author

Why We Might Want to Access Certain XML Document Portions
  • When using XSLT to display an XML file, typically we want to display only a subset of the XML document. For example, we might want to display a listing of flights, displaying the date, the departure city and the destination city.
  • When working with XML data, we might want to retrieve only a certain subset of the data.
  • We might want to access data that meets a certain set of criteria.
  • All of these tasks can be accomplished with XPath.

XPath Components – Steps
  • To access the root element of the XML document, we use the following syntax:
        /RootElementName
  • Then, to access immediate descendants (children) of a given element, we use /, followed by the name of the child element.
  • The / operator is referred to as the step operator.

XPath Components – Steps
  • The step operator has parallels to the \ operator in file paths.
  • With file systems (which can be modeled as XML documents), you navigate the directory structure by using \. For example, a path like:
        C:\Games\Quake\SavedGames
  • This file path takes you to the specified directory.
  • A file system can be represented as an XML document.

The file system can be represented as an XML document…

    <?xml version="1.0" encoding="UTF-8" ?>
    <filesystem>
      <drive letter="C">
        <folder name="Program Files" />
        <folder name="Games">
          <folder name="Quake">
            <folder name="SavedGames" />
            <file>Quake.exe</file>
            <file>README.txt</file>
          </folder>
        </folder>
        <folder name="Windows">
          <file>README.txt</file>
        </folder>
      </drive>
      <drive letter="D">
        <folder name="Backup">
          <file>2003-06-01.bak</file>
          <file>2003-06-07.bak</file>
        </folder>
      </drive>
    </filesystem>

The DOM Model of the FileSystem XML Document
[Figure: DOM tree diagram of the file system XML document; image not preserved.]

XPath Components - Steps
  • Using XPath we can access the root element using:
        /filesystem

XPath Components - Steps
  • To access all of the <drive> elements, we’d use:
        /filesystem/drive

XPath Components - Steps
  • To access all of the <folder> elements that are children of <drive> elements, we’d use:
        /filesystem/drive/folder

XPath Components - Steps
  • What about:
        /filesystem/drive/folder/folder

Descendant Steps
  • Using elementName/elementName2, we get all of the elements named elementName2 that are children of elementName.
  • But what if we want all elements that are descendants of elementName, regardless of whether the element is a child, grandchild, great-grandchild, etc.?
  • Here, we use the // operator.

Descendant Steps
  • As we saw earlier, /filesystem/drive/folder will return the folders that are immediate children of a <drive> element (Program Files, Games, Windows, and Backup).
  • If we want to get all folders, regardless of their depth in the hierarchy, we can use:
        /filesystem/drive//folder

Descendant Steps - Example
  • What will /filesystem//file return?
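The step and descendant expressions can be tried out against the file system document with a sketch like the one below. It relies on the SelectNodes() method of XmlNode, and assumes a hypothetical DescendantSteps class with an inline copy of the file system XML in which the Games folder is closed after Quake.

```csharp
using System;
using System.Xml;

public static class DescendantSteps
{
    // Inline copy of the file system example (assumed structure: Quake is
    // nested inside Games; Windows and Program Files sit directly under C).
    public const string FileSystemXml =
        "<filesystem>" +
        "<drive letter='C'>" +
        "<folder name='Program Files' />" +
        "<folder name='Games'><folder name='Quake'><folder name='SavedGames' />" +
        "<file>Quake.exe</file><file>README.txt</file></folder></folder>" +
        "<folder name='Windows'><file>README.txt</file></folder>" +
        "</drive>" +
        "<drive letter='D'><folder name='Backup'>" +
        "<file>2003-06-01.bak</file><file>2003-06-07.bak</file></folder></drive>" +
        "</filesystem>";

    // Returns how many nodes the XPath expression selects.
    public static int Count(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(FileSystemXml);
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        string[] expressions =
        {
            "/filesystem/drive/folder",   // folders directly under a drive
            "/filesystem/drive//folder",  // folders at any depth under a drive
            "/filesystem//file"           // files at any depth
        };
        foreach (string xpath in expressions)
            Console.WriteLine(xpath + " selects " + Count(xpath) + " node(s)");
    }
}
```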

Accessing an Element’s Text Value
  • If an element has a text value (such as the <file> element), you can access it using the text() XPath function.
  • For example, to return the contents of the <file> elements, we could use:
        /filesystem/drive//file/text()

Accessing an Element’s Text Value - Example
  • /filesystem/drive//file/text()

Accessing an Element’s Attribute Value
  • To access an attribute value for all elements matching a particular XPath expression, use the following syntax:
        xpathExpression/@attributeName
  • So, to access the values of the name attribute in the <folder> elements that are children of a <drive> element, you would use:
        /filesystem/drive/folder/@name

Accessing Element Attribute Values - Example
  • /filesystem/drive/folder/@name
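Both text() and @attributeName select nodes whose Value property holds a string, so one small helper can exercise either. This is a sketch: the XPathValues class and the inline file system XML are assumptions, and the element names used are the singular drive, folder, and file from the example document.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XPathValues
{
    // Inline copy of the file system example (assumed structure, see the slides).
    public const string FileSystemXml =
        "<filesystem>" +
        "<drive letter='C'>" +
        "<folder name='Program Files' />" +
        "<folder name='Games'><folder name='Quake'><folder name='SavedGames' />" +
        "<file>Quake.exe</file><file>README.txt</file></folder></folder>" +
        "<folder name='Windows'><file>README.txt</file></folder>" +
        "</drive>" +
        "<drive letter='D'><folder name='Backup'>" +
        "<file>2003-06-01.bak</file><file>2003-06-07.bak</file></folder></drive>" +
        "</filesystem>";

    // Returns the Value of every node selected by the expression. This is
    // meaningful for text and attribute nodes; element nodes have a null Value.
    public static List<string> Values(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(FileSystemXml);
        var values = new List<string>();
        foreach (XmlNode n in doc.SelectNodes(xpath))
            values.Add(n.Value);
        return values;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Values("/filesystem/drive//file/text()")));
        Console.WriteLine(string.Join(", ", Values("/filesystem/drive/folder/@name")));
    }
}
```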

Example
  • What if you wanted to retrieve the names of subdirectories? That is, you wanted to get the name attribute for all <folder> elements that are not children of the <drive> elements?
  • What XPath expression would you use?
        /filesystem/drive/folder//folder/@name

Filtering
  • Imagine that you wanted to return only those folders that contain files. Would the following XPath work?
        /filesystem/drive//folder/file
  • No! The above would return <file> elements.
  • If you want to return <folder> elements, filtered to only those that contain files, you can use the following syntax:
        /filesystem/drive//folder[file]

Filtering Example
  • /filesystem/drive//folder[file]

Filtering
  • Similarly, you can return only elements that contain a certain attribute by using:
        elementName[@attributeName]
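A sketch of both filter forms, run against the file system document. The FilterDemo class and its FolderNames helper are hypothetical, and the inline XML assumes the structure shown in the file system example.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class FilterDemo
{
    // Inline copy of the file system example (assumed structure, see the slides).
    public const string FileSystemXml =
        "<filesystem>" +
        "<drive letter='C'>" +
        "<folder name='Program Files' />" +
        "<folder name='Games'><folder name='Quake'><folder name='SavedGames' />" +
        "<file>Quake.exe</file><file>README.txt</file></folder></folder>" +
        "<folder name='Windows'><file>README.txt</file></folder>" +
        "</drive>" +
        "<drive letter='D'><folder name='Backup'>" +
        "<file>2003-06-01.bak</file><file>2003-06-07.bak</file></folder></drive>" +
        "</filesystem>";

    // Returns the name attribute of every element selected by the expression.
    public static List<string> FolderNames(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(FileSystemXml);
        var names = new List<string>();
        foreach (XmlNode n in doc.SelectNodes(xpath))
            names.Add(n.Attributes["name"].Value);
        return names;
    }

    public static void Main()
    {
        // Folders that have at least one <file> child.
        Console.WriteLine(string.Join(", ", FolderNames("/filesystem/drive//folder[file]")));
        // Folders that carry a name attribute (every folder in this document).
        Console.WriteLine(string.Join(", ", FolderNames("/filesystem/drive//folder[@name]")));
    }
}
```

Games is absent from the first result because its files sit one level down, inside Quake; the [file] filter only looks at direct children.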

XPath Components - Predicates
  • Realize that when using steps, all matching elements are returned. From the file system example, /filesystem/drive/folder will return all four <folder> elements (Program Files, Games, Windows, and Backup).
  • Predicates allow us to return only those elements that meet a certain set of criteria.
  • Predicate syntax: [boolean expression]

XPath Components - Predicates
  • For example, to return all <folder> elements with the name attribute equal to Games, we could use:
        /filesystem/drive//folder[@name="Games"]

Predicate Example
  • /filesystem/drive//folder[@name="Games"]

Predicate Example
  • Predicates can also appear in earlier step expressions, like:
        /filesystem/drive[@letter="C"]/folder

Predicates
  • A number of operators can be used within predicates: =, !=, <, >, <=, >=, and, or, not(), +, -, div, *, mod
  • Example: to get all of the files in folders named either Windows or Quake, you could use:
        /filesystem//folder[@name="Quake" or @name="Windows"]/file

XPath Components – Predicates - Examples
  • Here are some predicates – what elements would be returned for each?
        /filesystem/drive/folder[@name="Quake"]
  • Nothing is returned! This is because there is no folder that is a child of a <drive> element and has its name attribute equal to Quake.

XPath Components – Predicates - Examples
  • What about the following XPath expression?
        /filesystem/drive//folder[@name="Quake"]
  • It returns:

    <folder name="Quake">
      <folder name="SavedGames" />
      <file>Quake.exe</file>
      <file>README.txt</file>
    </folder>

XPath Components – Predicates - Examples
  • What about the following XPath expression?
        /filesystem/drive/folder/@name
  • It returns the name attribute of each folder directly under a drive:

    name="Program Files"
    name="Games"
    name="Windows"
    name="Backup"

XPath Components – Predicates - Examples
  • What about the following XPath expression?
        /filesystem/drive//folder[@name="Quake"]/file
  • It returns:

    <file>Quake.exe</file>
    <file>README.txt</file>
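The predicate examples above can be checked programmatically. The following is a sketch, assuming a hypothetical PredicateDemo class and an inline copy of the file system document; single quotes are used inside the XPath strings so they nest cleanly in C# string literals.

```csharp
using System;
using System.Xml;

public static class PredicateDemo
{
    // Inline copy of the file system example (assumed structure, see the slides).
    public const string FileSystemXml =
        "<filesystem>" +
        "<drive letter='C'>" +
        "<folder name='Program Files' />" +
        "<folder name='Games'><folder name='Quake'><folder name='SavedGames' />" +
        "<file>Quake.exe</file><file>README.txt</file></folder></folder>" +
        "<folder name='Windows'><file>README.txt</file></folder>" +
        "</drive>" +
        "<drive letter='D'><folder name='Backup'>" +
        "<file>2003-06-01.bak</file><file>2003-06-07.bak</file></folder></drive>" +
        "</filesystem>";

    // Returns how many nodes the XPath expression selects.
    public static int Count(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(FileSystemXml);
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        string[] expressions =
        {
            "/filesystem/drive/folder[@name='Quake']",        // Quake is not a direct child of a drive
            "/filesystem/drive//folder[@name='Quake']",       // found at any depth
            "/filesystem/drive//folder[@name='Quake']/file",  // the files inside Quake
            "/filesystem/drive[@letter='C']/folder",          // predicate on an earlier step
            "/filesystem//folder[@name='Quake' or @name='Windows']/file"
        };
        foreach (string xpath in expressions)
            Console.WriteLine(Count(xpath) + "  " + xpath);
    }
}
```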

More on XPath
  • There are many more features and much more functionality available with XPath, which we’ll examine in Session 3.
  • For a good tutorial on XPath, see: http://www.w3schools.com/xpath/default.asp

Navigating through the DOM using XPath
  • The XmlNode class contains two methods for navigating the DOM:
      1. SelectSingleNode(string)
      2. SelectNodes(string)
  • The string input parameter for both of these methods is an XPath expression.
  • SelectSingleNode() returns at most one node, the first node to match the XPath expression.
  • SelectNodes() returns all of the nodes that match the XPath expression.

An Example
  • The following code displays the titles of books whose price is less than $30.00.

    Dim xmlDoc As New XmlDocument()
    xmlDoc.Load(filepath)

    Dim n As XmlNode
    For Each n In _
        xmlDoc.SelectNodes("/books/book[@price<30]/title/text()")
        ' Display n.Value
    Next

An Example
  • What does the following code output?

    Dim xmlDoc As New XmlDocument()
    xmlDoc.Load(filepath)

    Dim n As XmlNode
    n = xmlDoc.SelectSingleNode("//author/text()")
    ' Display n.Value

  • Answer: the name of the first author found in the XML document.
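For readers following along in C#, the two VB.NET examples translate roughly as follows. This is a sketch: the BookQueries class and the inline copy of the example books file are assumptions, and LoadXml(string) replaces Load(filepath) so no file is needed.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class BookQueries
{
    // Inline copy of the example books file from the slides.
    public const string BooksXml =
        "<books>" +
        "<book price='34.95'><title>TYASP 3.0</title>" +
        "<authors><author>Mitchell</author></authors></book>" +
        "<book price='29.95'><title>ASP.NET Tips</title>" +
        "<authors><author>Mitchell</author><author>Walther</author>" +
        "<author>Seven</author></authors></book>" +
        "</books>";

    private static XmlDocument LoadBooks()
    {
        var doc = new XmlDocument();
        doc.LoadXml(BooksXml);
        return doc;
    }

    // Titles of books whose price attribute is numerically below 30.
    public static List<string> CheapTitles()
    {
        var titles = new List<string>();
        foreach (XmlNode n in LoadBooks().SelectNodes("/books/book[@price<30]/title/text()"))
            titles.Add(n.Value);
        return titles;
    }

    // Text of the first <author> in document order, or null if there is none.
    public static string FirstAuthor()
    {
        XmlNode n = LoadBooks().SelectSingleNode("//author/text()");
        return n == null ? null : n.Value;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", CheapTitles()));
        Console.WriteLine(FirstAuthor());
    }
}
```

SelectSingleNode() returns null when nothing matches, so the null check in FirstAuthor() is worth keeping in real code.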

Summary
  • In this presentation, we saw how to programmatically iterate through XML documents. We examined the differences between the push and pull methods.
  • The pull method uses the DOM, while the push method uses XmlTextReaders and XmlTextWriters.

Summary
  • We studied the syntax of XPath, a technology designed to allow for XML document navigation.
  • We saw how to use the SelectSingleNode() and SelectNodes() methods of the XmlNode class to navigate an XML document.
  • Navigation is only possible in the DOM world.

Questions?