Introduction to XML
Kostas Kontogiannis, Evan Mamas

Outline
  • Introduce XML, HTML and SGML
  • Compare and contrast
    • XML vs. HTML
    • XML vs. SGML
  • XML
    • Components, applications, industry
  • Thoughts on XML

What is XML?
  • eXtensible Markup Language
  • Proper subset of SGML for web use
  • Meta-language
    • Allows you to create your own markup languages
  • Compromise between HTML and SGML

What is HTML?
  • HyperText Markup Language
  • Language to describe information for transmission over the web
  • Uses tags to mark up the information
  • Tags are just a formatting tool
  • Example
    • <H1> Hello, World </H1>
    • Hello, World

Why isn’t HTML enough?
  • Good enough for presenting text on the web
  • Not accepted as an authoring or archival form
  • Extensibility
    • HTML standard changes continually
    • Uses tags for formatting
  • Structures
    • Has no defined or definable structural rules

What is SGML?
  • Standard Generalized Markup Language
    • International Standard for over 10 years
  • Language for specifying markup languages
  • Describes only the formal properties and interrelations of the components of a document
  • Document, Entities, Elements, Attributes

Uses of SGML
  • Formally structured documents
    • Technical manuals
  • Exchange documents
    • Product documentation
  • Data encoding
  • Interchange specification
  • Long-term storage of information that is independent of suppliers and of changes in hardware and software

SGML Example
  • Memo
      <to>All staff
      <from>Martin Bryan
      <date>5th November
      <subject>Cats and Dogs
      <text>Please remember to keep all cats and dogs indoors tonight.
  • DTD (Document Type Definition)
      <!DOCTYPE memo [
      <!ELEMENT memo O O ((to & from & date & subject?), text) >
      <!ELEMENT text - O (para+) >
      <!ELEMENT para O O (#PCDATA) >
      <!ELEMENT (to, from, date, subject) - O (#PCDATA) >
      ]>

Why isn’t SGML enough?
  • Specification is very long
  • Contains many options not needed for Web applications
  • Time consuming and high cost
    • Expensive tools
  • Too much for small applications
  • Bad reputation

XML vs. HTML
  • New tag and attribute definitions allowed
  • Document structures can be nested to any level of complexity
  • Structural validation is possible by describing the grammar

XML vs. SGML
  • XML is the minimum required subset of SGML for web use
  • Easier to implement and to create tools for
  • A new attempt at structured markup languages with a new “face”

XML Components
  • XML Style Language (XSL)
  • Cascading Style Sheets, level 2 (CSS2)
  • XML Document Object Model (DOM)
  • XML Linking Language (XLL)
  • XML Pointer Language (XPL)
  • XML Name Spaces
  • Synchronized Multimedia Integration Language (SMIL)
  • Resource Description Framework (RDF)
  • Mathematical Markup Language (MathML)

XML Components (cont.)
  • XML Style Language (XSL)
    • Defines a way to present the documents
    • Separates formatting from content
    • Has two steps:
      • Generate a result tree (associate patterns with templates)
      • Use XML Namespace (formatting vocabulary) to generate formatted output
    • Similar to DSSSL for SGML

XML Components (cont.)
  • Cascading Style Sheets, level 2 (CSS2)
    • Defines a way to present documents
    • Similar to XSL (not as strong)
    • Supported by most browsers

      <HTML>
        <TITLE>Bach's home page</TITLE>
        <STYLE type="text/css">
          H1 { color: blue }
        </STYLE>
        <BODY>
          <H1>Bach's home page</H1>
          <P>Johann Sebastian Bach was a prolific composer.
        </BODY>
      </HTML>

XML Components (cont.)
  • XML Document Object Model (DOM)
    • In-memory model for representing parsed XML documents
    • Designed to provide common structures in XML browsers
    • Intended to enable interoperable XML processing across browsers
    • Implemented by Internet Explorer and Netscape

XML Components (cont.)
  • XML Linking Language (XLL)
    • Links by reference rather than exact location
    • Provides hyperlinking elements
      • Simple links like HTML links
      • Extended
        • Multi-directional links
        • Links with multiple destinations
        • Placing content inline from a linked document
    • Requires use of XML Pointer Language

XML Components (cont.)
  • XML Name Spaces
    • Vocabulary of all elements and attribute types
      • Namespace prefix (mapped to Uniform Resource Identifier)
      • Local part
    • Allows use of names defined in other documents
    • Modularity and reuse of a markup
    • Mechanisms to establish name scope

XML Components (cont.)
  • Synchronized Multimedia Integration Language (SMIL)
    • Language for describing interactive synchronized multimedia distributed on the Web
    • Several components (images, video, audio) can be linked together to create a presentation on the web
  • Resource Description Framework (RDF)
    • Abstract mechanism for defining simple relationships among web resources
  • Mathematical Markup Language (MathML)
    • Language to describe mathematical expressions

XML DTD
  • Defines the hierarchy of all user-defined elements (tags) in the XML document
  • Declares the attributes and behaviour of each XML element
  • Each XML document calls a specific DTD file to validate its elements

XML DTD
    <?xml version="1.0" encoding="UTF-8"?>
    <!-- DTD for a simple program
         beginning of element declarations-->
    <!--the root tag of Language-->
    <!ELEMENT Language (FileTag*, Declaration*, Function_Call*)>
    <!ELEMENT FileTag (IncludeTag*, SourceTag*)>
    <!ELEMENT IncludeTag (#PCDATA)*>
    <!ELEMENT SourceTag (#PCDATA)*>

    <!ELEMENT Declaration (Type_Name|Identifier)*>
    <!ELEMENT Type_Name (#PCDATA)*>
    <!ELEMENT Identifier (#PCDATA)*>

    <!ELEMENT Function_Call (Return_Type*, Function_Name*, Argument*)>
    <!ELEMENT Return_Type (Return_Var*)>
    <!ELEMENT Return_Var (#PCDATA)>

    <!ELEMENT Function_Name (#PCDATA)>
    <!ELEMENT Argument (parameterName*)>
    <!ELEMENT parameterName (#PCDATA)>
    <!--We may want to have external calls or graphics in our document.
        Currently there is none, but we still have to declare them-->
    <!ELEMENT External_Call EMPTY>
    <!ELEMENT Graphics EMPTY>
    <!--end of element declarations-->

  • The Language declaration defines what other tags are within the <Language> tag
  • The IncludeTag declaration defines data types for contents within the <IncludeTag> tag
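
The parent/child rules in the DTD above can be approximated in a few lines of Python. This is only a sketch under stated assumptions: Python's standard library does not validate against DTDs, so the `ALLOWED_CHILDREN` table and the `find_violations` helper are hypothetical names, transcribed by hand from the element declarations, and they check only which children may appear, not their order or count.

```python
import xml.etree.ElementTree as ET

# Hand-transcribed from the slide DTD: element name -> allowed child elements.
# Elements declared as (#PCDATA) or EMPTY allow no child elements.
ALLOWED_CHILDREN = {
    "Language": {"FileTag", "Declaration", "Function_Call"},
    "FileTag": {"IncludeTag", "SourceTag"},
    "IncludeTag": set(),
    "SourceTag": set(),
    "Declaration": {"Type_Name", "Identifier"},
    "Type_Name": set(),
    "Identifier": set(),
    "Function_Call": {"Return_Type", "Function_Name", "Argument"},
    "Return_Type": {"Return_Var"},
    "Return_Var": set(),
    "Function_Name": set(),
    "Argument": {"parameterName"},
    "parameterName": set(),
    "External_Call": set(),
    "Graphics": set(),
}


def find_violations(element):
    """Return one message per child element the table does not allow."""
    violations = []
    allowed = ALLOWED_CHILDREN.get(element.tag, set())
    for child in element:
        if child.tag not in allowed:
            violations.append(f"<{child.tag}> is not allowed inside <{element.tag}>")
        else:
            # Only descend into children that were accepted at this level.
            violations.extend(find_violations(child))
    return violations


good = ET.fromstring(
    "<Language><Declaration><Type_Name>int</Type_Name>"
    "<Identifier>n</Identifier></Declaration></Language>"
)
bad = ET.fromstring("<Language><Declaration><Graphics/></Declaration></Language>")

print(find_violations(good))
print(find_violations(bad))
```

A validating parser driven by the DTD itself would replace the hand-written table; the sketch only illustrates what "defines the hierarchy" means in practice.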

XML Document (page 1 of 2)
    <?xml version="1.0"?>
    <?xml:stylesheet type="text/xsl" href="studentXSL1.xsl" ?>
    <!DOCTYPE Language SYSTEM "Student.dtd">
    <Language>
      <FileTag>
        <IncludeTag>include stdio.h: </IncludeTag>
      </FileTag>
      <FileTag>
        <IncludeTag>include math.h</IncludeTag>
      </FileTag>
      <FileTag>
        <SourceTag>code statement 3: </SourceTag>
      </FileTag>
      <FileTag>
        <SourceTag>code statement 2: </SourceTag>
      </FileTag>
      <Declaration>
        <Type_Name>char*</Type_Name>
        <Identifier>UW</Identifier>
      </Declaration>

  • The <?xml:stylesheet ...?> line calls an XSL style sheet
  • The <!DOCTYPE ...> line calls a DTD document

XML Document (page 2 of 2)
      <Declaration>
        <Type_Name>int</Type_Name>
        <Identifier>numOfstudents</Identifier>
      </Declaration>
      <Declaration>
        <Type_Name>char*</Type_Name>
        <Identifier>facultyName</Identifier>
      </Declaration>
      <Function_Call>
        <Return_Type>
          <Return_Var>student_profile</Return_Var>
        </Return_Type>
        <Function_Name>elec_eng</Function_Name>
        <Argument>
          <parameterName>name</parameterName>
        </Argument>
      </Function_Call>
    </Language>
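
To show how an application might read a document like this one, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The `SAMPLE` string is a shortened copy of the slide document (stylesheet and DOCTYPE lines left out); the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the sample document from the slides.
SAMPLE = """<Language>
  <FileTag><IncludeTag>include stdio.h</IncludeTag></FileTag>
  <Declaration><Type_Name>char*</Type_Name><Identifier>UW</Identifier></Declaration>
  <Declaration><Type_Name>int</Type_Name><Identifier>numOfstudents</Identifier></Declaration>
  <Function_Call>
    <Function_Name>elec_eng</Function_Name>
    <Argument><parameterName>name</parameterName></Argument>
  </Function_Call>
</Language>"""

root = ET.fromstring(SAMPLE)

# Collect (type, identifier) pairs from every <Declaration> child of the root.
declarations = [
    (decl.findtext("Type_Name"), decl.findtext("Identifier"))
    for decl in root.findall("Declaration")
]

# Collect the name of every function call anywhere in the document.
function_names = [call.findtext("Function_Name") for call in root.iter("Function_Call")]

print(declarations)
print(function_names)
```

Because the tags name the content (type, identifier, function name), the program can pick out each piece without parsing C syntax.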

XML Namespaces
  • Latest milestone for W3C's XML technology (14-January-1999)
  • W3C’s definition of XML Namespaces:
    • “XML namespaces provide a simple method for qualifying element and attribute names used in Extensible Markup Language documents by associating them with namespaces identified by URI references.”
  • Why use it?
    • Maintain tag meaningfulness and uniqueness
  • How does it solve the problem?
    • Add context to XML tags by using a prefix and a URI
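
A small sketch of the uniqueness problem and how a prefix plus URI resolves it, using Python's standard `xml.etree.ElementTree`. The `catalog` document and both `example.org` URIs are made up for illustration; they only need to be distinct identifiers.

```python
import xml.etree.ElementTree as ET

# Two vocabularies that both define a <title> element.
DOC = """<catalog xmlns:book="http://example.org/book"
                  xmlns:person="http://example.org/person">
  <book:title>Introduction to XML</book:title>
  <person:title>Professor</person:title>
</catalog>"""

root = ET.fromstring(DOC)

# ElementTree expands each prefix to its URI, giving "{uri}local-part" names.
tags = [child.tag for child in root]

# Look elements up through a prefix-to-URI mapping; the prefixes used here
# need not match the ones in the document, only the URIs must.
ns = {"b": "http://example.org/book", "p": "http://example.org/person"}
book_title = root.findtext("b:title", namespaces=ns)
person_title = root.findtext("p:title", namespaces=ns)

print(tags)
print(book_title, "/", person_title)
```

The two `title` elements stay distinct because the qualified name is the pair of URI and local part, not the prefix text.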

XSL Document (page 1 of 3)
    <?xml version="1.0"?>
    <DIV xmlns:xsl="http://www.w3.org/TR/WD-xsl">
    <html:html xmlns:html="http://www.w3.org/TR/REC-html40">
    <i>This page consists of XML, XSL, Namespace, HTML, and Java Applet</i>
    <html:head><html:title><H1>Sample C Code (hidden XML tag)</H1></html:title></html:head>

    <xsl:for-each select="Language">
      <TD STYLE="padding-left: 1em">
        <DIV><xsl:value-of select="/"/></DIV>
        <html:font color="red">The above command prints out all contents within tags without any formatting, ordering, linebreaks, etc.</html:font>
      </TD>
    </xsl:for-each>

    <xsl:for-each order-by="+ IncludeTag" select="Language/FileTag">
      <TD STYLE="padding-left: 1em">
        <html:BR></html:BR>
        <DIV><html:BR><xsl:value-of select="IncludeTag"/></html:BR></DIV>
      </TD>
    </xsl:for-each>
    <html:font color="red">End of IncludeTag, ascending sort on Include Tag Content</html:font>

  • xmlns:xsl declares the namespace for XSL
  • xmlns:html declares the namespace for HTML

XSL Document (page 2 of 3)
    <xsl:for-each order-by="+ SourceTag" select="Language/FileTag">
      <TD STYLE="padding-left: 1em">
        <html:BR></html:BR>
        <DIV><xsl:value-of page-break-after="SourceTag" select="SourceTag"/></DIV>
      </TD>
    </xsl:for-each>
    <html:font color="red">End of SourceTag, ascending sort on SourceTag Content</html:font>
    <html:BR></html:BR>

    <xsl:for-each order-by="+ Type_Name" select="Language/Declaration">
      <TD STYLE="padding-left: 1em">
        <html:BR></html:BR>
        <DIV><html:BR><xsl:value-of select="Type_Name"/></html:BR></DIV>
        <DIV><html:BR><xsl:value-of select="Identifier"/></html:BR></DIV>
      </TD>
    </xsl:for-each>
    <html:font color="red">End of Declaration, ascending sort on Type_Name</html:font>
    <DIV></DIV>

XSL Document (page 3 of 3)
    <xsl:for-each select="Language/Function_Call">
      <TD STYLE="padding-left: 1em">
        <html:BR><DIV><xsl:value-of select="Return_Type"/></DIV></html:BR>
        <html:font color="red">End of Return_Type</html:font>
        <html:BR><DIV><xsl:value-of select="Function_Name"/></DIV></html:BR>
        <html:font color="red">End of Function_Name</html:font>
        <html:BR></html:BR>
        <html:BR><DIV><xsl:value-of select="Argument"/></DIV></html:BR>
        <html:font color="red">End of Argument</html:font>
        <html:BR></html:BR>
      </TD>
    </xsl:for-each>
    <html:BR></html:BR>
    <html:APPLET code="AgentAction.class" width="400" height="200"></html:APPLET>
    <html:BR></html:BR>
    </html:html>
    </DIV>
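
What the style sheet's `xsl:for-each` loops with `order-by` do can be mimicked outside a browser. The sketch below is not XSL: it swaps in Python's `xml.etree.ElementTree` to select the `FileTag` elements and sort them ascending by `IncludeTag` content. The sample data and the `sorted_includes` name are illustrative assumptions, and how a missing sort key is ordered here is a choice of this sketch, not a statement about XSL.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<Language>
  <FileTag><IncludeTag>include stdio.h</IncludeTag></FileTag>
  <FileTag><IncludeTag>include math.h</IncludeTag></FileTag>
  <FileTag><SourceTag>code statement 3</SourceTag></FileTag>
</Language>"""


def sorted_includes(xml_text):
    """Mimic: for-each select="Language/FileTag" order-by="+ IncludeTag"."""
    root = ET.fromstring(xml_text)  # root is the <Language> element

    def include_text(file_tag):
        # In this sketch a FileTag without an IncludeTag gets an empty key.
        return file_tag.findtext("IncludeTag", default="")

    file_tags = sorted(root.findall("FileTag"), key=include_text)
    # Equivalent of <xsl:value-of select="IncludeTag"/> for each selected node.
    return [include_text(file_tag) for file_tag in file_tags]


includes = sorted_includes(SAMPLE)
print(includes)
```

The style sheet keeps this selection and ordering logic declarative, which is what lets the same XML be presented in several ways without touching the data.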

Applications that require XML
  • Information exchange between heterogeneous databases
    • Health care example
  • Distributed processing
    • Semiconductor industry example
  • Multiple views of the same data
  • “Intelligent” information agents

Using XML
  • XML for Storage
    • Compact syntax
    • Generalized and standardized
    • Product independent
  • XML for Searching
    • Use of content-specific markup enables robust searching
    • Search engines need to be XML aware
    • Can use current SGML search engines
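
The searching point can be illustrated with a small sketch: a plain-text search cannot tell a type name from a function name, while a markup-aware search can. The sample reuses the element names from the slides; the function name `int_to_str` is made up so that the text search has something misleading to hit.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<Language>
  <Declaration><Type_Name>char*</Type_Name><Identifier>UW</Identifier></Declaration>
  <Declaration><Type_Name>int</Type_Name><Identifier>numOfstudents</Identifier></Declaration>
  <Declaration><Type_Name>char*</Type_Name><Identifier>facultyName</Identifier></Declaration>
  <Function_Call><Function_Name>int_to_str</Function_Name></Function_Call>
</Language>"""

root = ET.fromstring(SAMPLE)

# A plain-text search for "int" also matches the function name "int_to_str".
text_hits = SAMPLE.count("int")

# A markup-aware search asks only for declarations whose type is exactly "int".
int_identifiers = [
    decl.findtext("Identifier")
    for decl in root.findall("Declaration")
    if decl.findtext("Type_Name") == "int"
]

print(text_hits)
print(int_identifiers)
```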

What is DOM?
  • A programming API for XML
  • Logical structure of document
  • Access and manipulation of documents

What is DOM?
  • As an object model, DOM identifies
    • Interfaces and objects used for the document
    • Behaviours and attributes
    • Relationships and collaborations of interfaces and objects

What is DOM?
  • 2 major components for DOM Level 1
    • DOM Core = basic functionalities for XML
    • DOM HTML = objects and methods specific to HTML
  • Level 2
    • DOM CSS, DOM Event, DOM Filters and Iterators, DOM Range

Advantages of using DOM
  • Easy to create, navigate, add, modify documents
  • DOM abstraction avoids implementation dependencies
  • DOM applications may use additional language bindings
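
As one example of a language binding, Python's standard `xml.dom.minidom` exposes the DOM Core interfaces. The sketch below navigates, adds to, and modifies a small document; the document content is borrowed from the earlier sample and the variable names are illustrative.

```python
from xml.dom.minidom import parseString

doc = parseString(
    "<Language><Declaration><Type_Name>int</Type_Name></Declaration></Language>"
)

# Navigate: from the root element down to the first <Declaration>.
declaration = doc.documentElement.getElementsByTagName("Declaration")[0]

# Add: create an <Identifier> element with a text child and append it.
identifier = doc.createElement("Identifier")
identifier.appendChild(doc.createTextNode("numOfstudents"))
declaration.appendChild(identifier)

# Modify: change the text inside <Type_Name>.
type_name = declaration.getElementsByTagName("Type_Name")[0]
type_name.firstChild.data = "long"

result = doc.documentElement.toxml()
print(result)
```

The same sequence of calls (`createElement`, `appendChild`, `getElementsByTagName`) exists in other DOM bindings, which is the implementation independence the slide refers to.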

A Typical DOM Structure
    <condition_statement>
      <if_tag> if </if_tag>
      <expression_tag> (b == c) </expression_tag>
      <statement_tag> {a += c} </statement_tag>
    </condition_statement>

A Typical DOM Structure (2)
    <condition_statement>
      <if_tag>
        if
      <expression_tag>
        (b==c)
      <statement_tag>
        {a+=c}

A Typical DOM Structure (3)
  • DOM abstraction is a tree or forest structure
  • Users have full flexibility to specify the structure
  • Structural isomorphism
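
The tree view of the condition statement can be reproduced with a short depth-first walk. This is a sketch using Python's `xml.dom.minidom`; `SOURCE` is a cleaned-up, whitespace-free copy of the fragment from the earlier slide, and `outline` is a hypothetical helper name.

```python
from xml.dom.minidom import parseString

SOURCE = (
    "<condition_statement>"
    "<if_tag>if</if_tag>"
    "<expression_tag>(b == c)</expression_tag>"
    "<statement_tag>{a += c}</statement_tag>"
    "</condition_statement>"
)


def outline(node, depth=0):
    """Return one indented line per element or text node, depth-first."""
    if node.nodeType == node.TEXT_NODE:
        label = node.data
    else:
        label = f"<{node.nodeName}>"
    lines = ["  " * depth + label]
    for child in node.childNodes:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(parseString(SOURCE).documentElement)
print("\n".join(tree_lines))
```

Each element becomes an interior node and each piece of text a leaf, so the printed outline has the same shape as the tree diagram.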

Some Key Objects
  • Node
    • Tree node of the document
    • Root node, parents and children
  • Element (is a Node object)
    • Elements of a document
    • Represents contents between the start tag and end tag
    • Attributes: defined by DTD
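
A brief sketch of the Node and Element relationships, again with Python's `xml.dom.minidom`. The input is an XML rendering of the earlier memo example with end tags added, which is an assumption of this sketch since the SGML original omits them.

```python
from xml.dom.minidom import parseString

doc = parseString("<memo><to>All staff</to><from>Martin Bryan</from></memo>")
memo = doc.documentElement          # an Element, which is also a Node

to = memo.firstChild                # first child node of <memo>
sender = to.nextSibling             # the node that follows <to>

is_element = memo.nodeType == memo.ELEMENT_NODE
child_names = [child.nodeName for child in memo.childNodes]
parent_name = sender.parentNode.nodeName
content = to.firstChild.data        # text between <to> and </to>

print(is_element, child_names, parent_name, content)
```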

Some Key Objects (2)
  • Document
    • Root node of a document
  • NodeIterator
    • Iterates over a set of nodes specified by a filter
  • AttributeList
    • Collection of Attribute objects, indexed by attribute name
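
The idea behind NodeIterator, visiting nodes in document order and keeping only those a filter accepts, can be sketched with a Python generator. Note the assumption: `xml.dom.minidom` does not provide a NodeIterator itself, so `iterate_nodes` and `is_element` below are hypothetical stand-ins, not the DOM interface.

```python
from xml.dom.minidom import parseString


def iterate_nodes(root, accept):
    """Yield root and its descendants, in document order, that pass accept()."""
    if accept(root):
        yield root
    for child in root.childNodes:
        yield from iterate_nodes(child, accept)


def is_element(node):
    """Filter: keep element nodes only, skipping text nodes."""
    return node.nodeType == node.ELEMENT_NODE


doc = parseString(
    "<Language><Declaration><Type_Name>int</Type_Name>"
    "<Identifier>numOfstudents</Identifier></Declaration></Language>"
)

element_names = [n.nodeName for n in iterate_nodes(doc.documentElement, is_element)]
print(element_names)
```

Swapping in a different filter function changes which nodes are visited without changing the traversal code.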

Some Key Objects (3)
  • Attribute
    • Attribute of an Element object
  • DocumentContext
    • Repository for metadata about a document
  • DOM
    • Provides instance-independent document operations
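
Attribute access can be sketched as follows. In Python's `xml.dom.minidom` the name-indexed collection of attributes is exposed as the element's `attributes` property (a `NamedNodeMap`). The `name` and `arity` attributes are invented for this example; the slide DTD declares no attributes.

```python
from xml.dom.minidom import parseString

doc = parseString('<Function_Call name="elec_eng" arity="1"/>')
call = doc.documentElement

attrs = call.attributes                 # collection of attribute nodes
by_name = attrs["name"].value           # look up an attribute by its name

# Walk the collection by index and gather (name, value) pairs.
all_pairs = sorted(
    (attrs.item(i).name, attrs.item(i).value) for i in range(attrs.length)
)

call.setAttribute("arity", "2")         # modify through the owning element
new_arity = call.getAttribute("arity")

print(by_name, all_pairs, new_arity)
```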

Memory Management for DOM
  • DOM APIs operate across a variety of memory implementation methods:
    • Language platforms that do not expose memory management to the user
    • Languages (Java) that provide constructors with garbage collection capability
    • Languages (C/C++) that require explicit memory allocation

Resources/Quirks
  • IE 5 and Navigator 5.0 implement different features:
    • IE 5.0 - XML/XSL; Navigator - XML/CSS
    • Navigator to support RDF
  • XML Resources:
    • http://www.swen.uwaterloo.ca/~group1

Using XML (cont.)
  • XML for Presentation
    • Convert to HTML at server
    • Use Java applications to render in browser
      • Slow
    • Use XSL or CSS to render in browser
      • Fast
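
The first option, converting to HTML at the server, might look like the sketch below. It assumes the `Language` document structure from the earlier slides; `declarations_to_html` is a hypothetical helper, and a real server would more likely apply a style sheet than hand-written string formatting.

```python
import xml.etree.ElementTree as ET
from html import escape


def declarations_to_html(xml_text):
    """Render each <Declaration> as one HTML list item."""
    root = ET.fromstring(xml_text)
    items = []
    for decl in root.findall("Declaration"):
        # Escape the text so characters like < or & cannot break the HTML.
        type_name = escape(decl.findtext("Type_Name", default=""))
        identifier = escape(decl.findtext("Identifier", default=""))
        items.append(f"<li><code>{type_name} {identifier}</code></li>")
    return "<ul>" + "".join(items) + "</ul>"


page = declarations_to_html(
    "<Language><Declaration><Type_Name>char*</Type_Name>"
    "<Identifier>UW</Identifier></Declaration></Language>"
)
print(page)
```

With this approach the browser needs no XML support at all, at the cost of doing the rendering work on the server for every request.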

XML in the industry
  • Explosive growth of XML tools and specifications
    • Tools: JADE, MSXML, JUMBO, ...
    • Specifications: CDF, CFML, EDI
    • Browsers: IE, Netscape

Thoughts on XML
  • Seems like a transition stage between HTML and SGML
    • Will we eventually end up using SGML?
  • XML follows basic principles of SE
    • Higher abstraction layer
    • Reuse
    • Modularity
