Introduction to XML
What is XML?
- Extensible Markup Language
- XML 1.0 (1998)
- Easier-to-use subset of SGML (Standard Generalized Markup Language)
- XML is a text-based markup language
- Standard for data interchange on the web
- Set of rules for designing semantic tags
- Meta-markup language to define other languages
- XML 1.0 Specification: http://www.w3.org/TR/REC-xml
School of Information Science and Technology, Prof. John Yen

HTML and XML
- HTML is an application of SGML
- XML is a subset of SGML
- XHTML is an application of XML

XML File Sample

    <?xml version="1.0"?>
    <dining-room>
        <manufacturer>The Wood Shop</manufacturer>
        <table type="round" wood="maple">
            <price>$199.99</price>
        </table>
        <chair wood="maple">
            <quantity>6</quantity>
            <price>$39.99</price>
        </chair>
    </dining-room>

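
To make the sample concrete, the sketch below reads it back with a parser. It is only an illustration: it uses Python's standard `xml.etree.ElementTree` module rather than the Java tools these slides cover, and the variable names and the "total price of the chairs" calculation are assumptions added for the example.

```python
import xml.etree.ElementTree as ET

# The dining-room sample from the slide, stored as a string.
DINING_ROOM = """<?xml version="1.0"?>
<dining-room>
  <manufacturer>The Wood Shop</manufacturer>
  <table type="round" wood="maple">
    <price>$199.99</price>
  </table>
  <chair wood="maple">
    <quantity>6</quantity>
    <price>$39.99</price>
  </chair>
</dining-room>"""

root = ET.fromstring(DINING_ROOM)

# Element content is reached by tag name, attributes by attribute name.
manufacturer = root.findtext("manufacturer")
table_attrs = dict(root.find("table").attrib)

# Hypothetical calculation: total price of all chairs.
chair_quantity = int(root.findtext("chair/quantity"))
chair_price = float(root.findtext("chair/price").lstrip("$"))
chairs_total = round(chair_quantity * chair_price, 2)

print(manufacturer, table_attrs, chairs_total)
```

Because the tags name what each value means, the program can ask for "the chair quantity" directly instead of relying on position or formatting.
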
XML describes Structure and Semantics, Not Formatting
HTML Example

    <DL>
        <DT>Mambo
        <DD>by Enrique Garcia
    </DL>
    <UL>
        <LI>Producer: Enrique Garcia
        <LI>Publisher: Sony Music Entertainment
        <LI>Length: 3:46
        <LI>Written: 1991
        <LI>Artist: Azucar Moreno
    </UL>

XML describes Structure and Semantics, Not Formatting (2)
XML Example

    <SONG>
        <TITLE>Mambo</TITLE>
        <COMPOSER>Enrique Garcia</COMPOSER>
        <PRODUCER>Enrique Garcia</PRODUCER>
        <PUBLISHER>Sony Music Entertainment</PUBLISHER>
        <LENGTH>3:46</LENGTH>
        <YEAR>1991</YEAR>
        <ARTIST>Azucar Moreno</ARTIST>
    </SONG>

What's So Great About XML?
Easy Data Exchange
- Growth of proprietary data formats
- Conversion programs (applications, versions, ...)
- Data and markup are stored as text
- Avoids storing simple data in huge files

What's So Great About XML? (2)
Customizing Markup Languages
- Banking Industry Technology Secretariat (BITS)
- Financial Exchange (IFX)
- Schools Interoperability Framework (SIF)
- Common Business Library (CBL)
- Electronic Business XML Initiative (ebXML)
- The Text Encoding Initiative (TEI)

What's So Great About XML? (3)
Self-Describing Data

    <?xml version="1.0" encoding="UTF-8"?>
    <DOCUMENT>
        <GREETING>Hello from XML</GREETING>
        <MESSAGE>Welcome to Programming XML in Java</MESSAGE>
    </DOCUMENT>

What's So Great About XML? (4)
Structured and Integrated Data

    <?xml version="1.0"?>
    <SCHOOL>
        <CLASS type="seminar">
            <CLASS_TITLE>XML In The Real World</CLASS_TITLE>
            <CLASS_NUMBER>6.031</CLASS_NUMBER>
            <SUBJECT>XML</SUBJECT>
            <START_DATE>6/1/2002</START_DATE>
            <STUDENTS>
                <STUDENT status="attending">
                    <FIRST_NAME>Edward</FIRST_NAME>
                    <LAST_NAME>Samson</LAST_NAME>
                </STUDENT>
                <STUDENT status="withdrawn">
                    <FIRST_NAME>Ernestine</FIRST_NAME>
                    <LAST_NAME>Johnson</LAST_NAME>
                </STUDENT>
            </STUDENTS>
        </CLASS>
    </SCHOOL>

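
One way to see why structured data is useful is to query it. The sketch below is an illustration in Python's standard `xml.etree.ElementTree` module (not the Java APIs covered later); it uses a shortened copy of the class roster with every STUDENT element closed, and the helper name `students_by_status` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the class roster; each STUDENT element is properly closed.
SCHOOL = """<SCHOOL>
  <CLASS type="seminar">
    <CLASS_TITLE>XML In The Real World</CLASS_TITLE>
    <STUDENTS>
      <STUDENT status="attending">
        <FIRST_NAME>Edward</FIRST_NAME>
        <LAST_NAME>Samson</LAST_NAME>
      </STUDENT>
      <STUDENT status="withdrawn">
        <FIRST_NAME>Ernestine</FIRST_NAME>
        <LAST_NAME>Johnson</LAST_NAME>
      </STUDENT>
    </STUDENTS>
  </CLASS>
</SCHOOL>"""


def students_by_status(xml_text, status):
    """Return 'First Last' names of students whose status attribute matches."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a small XPath subset, including attribute tests.
    matches = root.findall(f".//STUDENT[@status='{status}']")
    return [f"{s.findtext('FIRST_NAME')} {s.findtext('LAST_NAME')}" for s in matches]


print(students_by_status(SCHOOL, "attending"))
print(students_by_status(SCHOOL, "withdrawn"))
```

The nesting of STUDENT inside STUDENTS inside CLASS is what lets a single path expression pick out the right records.
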
Well-Formed XML Documents
- Follow the syntax rules set up for XML by the W3C in the XML 1.0 Specification (www.w3.org/TR/REC-xml)
- Contain one or more elements
- Have a single root element that contains all the other elements
- Nest each element properly inside any enclosing elements

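
The rules above can be checked mechanically: a conforming parser must reject any document that breaks them. The following is a minimal sketch in Python's standard library, assuming a hypothetical helper named `is_well_formed`; it checks well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# One root element containing properly nested children.
print(is_well_formed("<a><b>x</b></a>"))
# Elements overlap instead of nesting.
print(is_well_formed("<a><b>x</a></b>"))
# Two top-level elements, so no single root.
print(is_well_formed("<a/><b/>"))
```
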
Valid XML Documents
- Are associated with a Document Type Definition (DTD)
- Comply with that DTD
- A DTD has the same function as an IDoc type definition

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/css" href="first.css"?>
    <!DOCTYPE DOCUMENT [
        <!ELEMENT DOCUMENT (GREETING, MESSAGE)>
        <!ELEMENT GREETING (#PCDATA)>
        <!ELEMENT MESSAGE (#PCDATA)>
    ]>
    <DOCUMENT>
        <GREETING>Hello from XML</GREETING>
        <MESSAGE>Welcome to Programming XML in Java</MESSAGE>
    </DOCUMENT>

More on DTD: http://www.cs.rpi.edu/~puninj/XMLJ/classes/class3/Overview.html

Related Technologies
Hypertext Markup Language
- HTML is the most common output format for XML
- Web browsers: Internet Explorer 5.0, Netscape 6.0
- A different way to design a Web site

Related Technologies (2)
Cascading Style Sheets
- Define formatting properties
    - Font size
    - Font family
    - Font weight
    - Paragraph indentation
    - Paragraph alignment
- Multiple style sheets can be applied to a single document
- Multiple styles can be applied to a single element

Related Technologies (3)
The Unicode Character Set
- American Standard Code for Information Interchange (ASCII): 7-bit codes 0-127, with 8-bit extended sets covering 0-255; 'A' = 65
- XML provides full support for the Unicode character set, originally a two-byte set with codes 0-65,535 (http://www.unicode.org)
- XML documents can be written in:
    - ASCII
    - UTF-8: a variable-length encoding of Unicode built from 8-bit units (one byte for ASCII characters, more for others)

    <?xml version="1.0" encoding="UTF-8"?>

- XML defines character references to encode Unicode characters: &#xA9; (©), &#60; (<), &#x03C0; (π)
- Universal Character Set (UCS, ISO 10646)
    - Up to 4 bytes per symbol
    - UCS-2 and UCS-4 encodings

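
A short sketch can show both ideas on this slide: character references being resolved by the parser, and how many bytes UTF-8 spends per character. It is written in Python's standard library as an illustration; the SYMBOLS element name and the choice of sample characters are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# A document whose content is written entirely with character references.
doc = b'<?xml version="1.0" encoding="UTF-8"?><SYMBOLS>&#xA9; &#60; &#x03C0;</SYMBOLS>'

# The parser replaces each reference with the Unicode character it names.
text = ET.fromstring(doc).text

# UTF-8 is variable-length: ASCII characters take one byte, others take more.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in ("A", "\u00a9", "\u03c0", "\u20ac")}

print(text)
print(ord("A"), utf8_lengths)
```

The byte counts illustrate why UTF-8 is compact for mostly-ASCII documents while still covering the full Unicode range.
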
How Do I Use XML?
- The XML document is parsed
- The data is manipulated
- APIs are available in Java, C, C++, Perl, ...

Simple API for XML - SAX
- Event-based framework for parsing XML data
- Methods such as startDocument(), endElement()
- Set of errors and warnings
- http://www.megginson.com/SAX
- Several parsers can be plugged into the SAX API

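
The event-based model can be sketched as follows. This example uses Python's standard `xml.sax` module, whose `ContentHandler` callbacks carry the same names as the SAX methods mentioned above; it is not the Java API itself, and the `EventLogger` class is a hypothetical name for this illustration.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Record each SAX callback in the order the parser fires it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def startElement(self, name, attrs):
        self.events.append(f"startElement:{name}")

    def characters(self, content):
        # Skip whitespace-only chunks to keep the log short.
        if content.strip():
            self.events.append(f"characters:{content.strip()}")

    def endElement(self, name):
        self.events.append(f"endElement:{name}")

    def endDocument(self):
        self.events.append("endDocument")


handler = EventLogger()
xml.sax.parseString(b"<DOCUMENT><GREETING>Hello from XML</GREETING></DOCUMENT>", handler)

for event in handler.events:
    print(event)
```

The parser never builds a tree; the application sees a stream of callbacks and keeps only what it needs, which is why SAX suits large documents.
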
Document Object Model - DOM
- Manipulation of XML data
- Provides a representation of an XML document as a tree
- Reads the XML document into memory
- http://www.w3.org/DOM

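
In contrast to event-based parsing, a DOM parser hands back the whole document as a tree that can be walked and changed. The sketch below uses Python's standard `xml.dom.minidom`, which follows the W3C DOM method names; the shortened SONG document and the added ARTIST element are assumptions chosen for the illustration.

```python
from xml.dom import minidom

# Parse the whole document into an in-memory tree.
dom = minidom.parseString("<SONG><TITLE>Mambo</TITLE><YEAR>1991</YEAR></SONG>")
root = dom.documentElement

# Navigate the tree: find an element and read its text node.
title = root.getElementsByTagName("TITLE")[0].firstChild.data

# Manipulate the tree: build a new element and attach it to the root.
artist = dom.createElement("ARTIST")
artist.appendChild(dom.createTextNode("Azucar Moreno"))
root.appendChild(artist)

child_names = [n.tagName for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]
serialized = root.toxml()

print(title, child_names)
print(serialized)
```

Holding the full tree in memory makes random access and editing easy, at the cost of memory proportional to the document size.
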
Sun's Java API for XML Parsing - JAXP
- Provides cohesiveness to the SAX and DOM APIs
- Adds convenience methods for Java developers
- http://java.sun.com/xml

Java and XML: A Perfect Match
- Java is portable code, XML is portable data
- Applications are completely portable
    - Java Virtual Machine (JVM)
    - Standards-based data layer
- Java provides the most robust set of:
    - APIs - JAXP
    - Parsers - XP
    - Processors - Saxon
    - Publishing frameworks - Cocoon
    - Tools for XML - XML Pro

The Life of an XML Document

XML Editors
- Create XML documents
- Text editors: vi, emacs, notepad
- XML editors
    - Adobe FrameMaker, www.adobe.com
    - XML Pro, www.vervet.com
    - XML Writer, xmlwriter.net
    - XML Notepad, msdn.microsoft.com/xml/notepad/intro.asp
    - XMetal from SoftQuad, xmetal.com
    - XML Spy, www.xmlspy.com

XML Editors (XML Spy)

More XML Spy

XML Parsers
- Read the XML document
- Verify that the XML is well formed
- Verify that the XML is valid
- Available parsers
    - expat, parser written in C by James Clark (www.jclark.com)
    - XML for Java (XML4J), from IBM alphaWorks (www.alphaworks.ibm.com/tech/xml4j)
    - Lark, written in Java (www.textuality.com/Lark/)
    - Apache Xerces (www.apache.org)
    - XP by James Clark (www.jclark.com)
    - Oracle XML Parser (technet.oracle.com/tech/xml)
    - Sun Microsystems Project X (java.sun.com/products/xml)

XML Validators
- Verify that the XML is valid
- XML.com's validator, based on Lark (xml.com)
- Language Technology Group at the University of Edinburgh's validator, based on the RXP parser (www.ltg.ed.ac.uk/~richard/xmlcheck.html)
- Scholarly Technology Group at Brown University's validator (www.stg.brown.edu/service/xmlvalid/)

XML Browsers
- Display the data to the user
- Internet Explorer 5
    - Displays XML documents directly
    - Handles XML in scripting languages (JScript, VBScript)
    - Binds XML to ActiveX Data Object (ADO) database recordsets
    - XML is integrated into the Office 2000 suite of applications
- Netscape Navigator 6
    - Displays XML documents directly
    - Handles XML in scripting languages (JavaScript 1.5)
    - Supports the XML-based User Interface Language (XUL); XUL lets you configure the controls in the browser
- Jumbo
    - Displays XML
    - Uses CML to draw molecules

XML Resources
- XML at W3C (http://www.w3.org/xml/)
- XML.com (http://www.xml.com/)
- XML.org Registry (http://www.xml.org/)
- XML Cover Pages (http://xml.coverpages.org/)
- Java and XML (http://java.sun.com/xml/)

XML Applications
Languages based on XML
- Chemical Markup Language (CML)
- Mathematical Markup Language (MathML)
- Channel Definition Format (CDF)
- Synchronized Multimedia Integration Language (SMIL)
- XHTML
- Scalable Vector Graphics (SVG)
- MusicML
- VoxML

XML and IDoc Mapping
- XML DTD -> IDoc type
- XML tree structure -> IDoc tree structure
- XML parent element -> IDoc parent segment
- XML child element -> IDoc child segment or field
- XML document -> IDoc document

XML DTD and IDoc Type Mapping Example

    <!-- MATMAS01 Material Master -->
    <!ELEMENT MATMAS01 (IDOC+)>
    <!ELEMENT IDOC (EDI_DC40, E1MARAM+)>

    <!-- IDoc Control Record for Interface to External System -->
    <!ELEMENT EDI_DC40 (TABNAM, MANDT?, DOCNUM?, DOCREL?, STATUS?, DIRECT,
        OUTMOD?, EXPRSS?, TEST?, IDOCTYP, CIMTYP?, MESTYP, MESCOD?, MESFCT?,
        STDVRS?, STDMES?, SNDPOR, SNDPRT, SNDPFC?, SNDPRN, SNDSAD?, SNDLAD?,
        RCVPOR, RCVPRT, RCVPFC?, RCVPRN, RCVSAD?, RCVLAD?, CREDAT?, CRETIM?,
        REFINT?, REFGRP?, REFMES?, ARCKEY?, SERIAL?)>

    <!-- Segment E1MARAM : Master material general data (MARA) -->
    <!ELEMENT E1MARAM (MSGFN?, MATNR?, ERSDA?, ERNAM?, LAEDA?, AENAM?, PSTAT?,
        LVORM?, MTART?, MBRSH?, MATKL?, BISMT?, MEINS?, BSTME?, ZEINR?, ZEIAR?,
        ZEIVR?, ZEIFO?, AESZN?, BLATT?, BLANZ?, FERTH?, FORMT?, GROES?, WRKST?,
        NORMT?, LABOR?, EKWSL?, BRGEW?, NTGEW?, GEWEI?, VOLUM?, VOLEH?, BEHVO?,
        RAUBE?, TEMPB?, TRAGR?, STOFF?, SPART?, KUNNR?, WESCH?, BWVOR?, BWSCL?,
        SAISO?, ETIAR?, ETIFO?, EAN11?, NUMTP?, LAENG?, BREIT?, HOEHE?, MEABM?,
        PRDHA?, CADKZ?, ERGEW?, ERGEI?, ERVOL?, ERVOE?, GEWTO?, VOLTO?, VABME?,
        KZKFG?, XCHPF?, VHART?, FUELG?, STFAK?, MAGRV?, BEGRU?, QMPUR?, RBNRM?,
        MHDRZ?, MHDHB?, MHDLP?, VPSTA?, EXTWG?, MSTAE?, MSTAV?, MSTDE?, MSTDV?,
        KZUMW?, KOSCH?, NRFHG?, MFRPN?, MFRNR?, BMATN?, MPROF?, PROFL?, IHIVI?,
        ILOOS?, KZGVH?, XGCHP?, COMPL?, KZEFF?, RDMHD?, IPRKZ?, PRZUS?,
        MTPOS_MARA?, GEWTO_NEW?, VOLTO_NEW?, WRKST_NEW?,
        E1MAKTM+, E1MARCM*, E1MARMM*, E1MBEWM*, E1MLGNM*, E1MVKEM*, E1MLANM*,
        E1MTXHM*)>

    <!-- Segment E1MAKTM : Master material short texts (MAKT) -->
    <!-- Field MATNR in E1MARAM: Material number -->
    <!ELEMENT MATNR (#PCDATA)>

XML and IDoc Document Mapping Example

    <?xml version="1.0"?>
    <MATMAS01>
        <E1MARAM>
            <E1MAKTM>
                <MSGFN>005</MSGFN>
                <SPRAS>D</SPRAS>
                <MAKTX>Zwischenlage</MAKTX>
            </E1MAKTM>
        </E1MARAM>
    </MATMAS01>

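
The mapping rules (parent element to parent segment, child element to child segment or field) can be sketched as a small tree walk. This is an illustration only, written in Python's standard library: the function name `element_to_segment` and the dictionary layout are hypothetical and do not represent SAP's actual IDoc format or any SAP API. It assumes that an element with child elements is a segment and an element without them is a field, so an empty segment would be misread as a field.

```python
import xml.etree.ElementTree as ET

# The example document from the slide.
IDOC_XML = """<MATMAS01>
  <E1MARAM>
    <E1MAKTM>
      <MSGFN>005</MSGFN>
      <SPRAS>D</SPRAS>
      <MAKTX>Zwischenlage</MAKTX>
    </E1MAKTM>
  </E1MARAM>
</MATMAS01>"""


def element_to_segment(element):
    """Map an XML element to a hypothetical segment record (name, fields, children)."""
    segment = {"name": element.tag, "fields": {}, "children": []}
    for child in element:
        if len(child):
            # The child has its own child elements: treat it as a child segment.
            segment["children"].append(element_to_segment(child))
        else:
            # A leaf element: treat it as a field of the current segment.
            segment["fields"][child.tag] = (child.text or "").strip()
    return segment


idoc = element_to_segment(ET.fromstring(IDOC_XML))
short_text = idoc["children"][0]["children"][0]
print(short_text["name"], short_text["fields"])
```
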
Recommended Classes on XML
This presentation is modified from material on XML at http://www.cs.rpi.edu/~puninj/XMLJ/classes.html
PSU web-based training (WBT) also provides excellent XML courses:
- 13182 XML Technology Overview
    - Basic concepts such as XML, DTD, XSL
- 86031 XML Programming Part 1
    - Similar to the one above, and consistent with the second part
- 86032 XML Programming Part 2
    - Advanced topics include the XML programming interfaces DOM and SAX, and the XML translator XSLT
- 13182 is good for learning the concepts of XML. 86031/32 are good for learning how to code with XML in Java or C++.

In-Class Assignment (10 HW)
Develop an XML DTD for delivery notes that includes:
- Date
- Vendor (value is a vendor ID)
- Quantity
- Product (value is a product ID)
