CERN European Organization for Nuclear Research IT Department


CERN, European Organization for Nuclear Research
IT Department, e-Business Section

Practical Use of XML

Rostislav Titov
IT-AIS-EB (e-Business) Section
CERN, Geneva, Switzerland

XML: eXtensible Markup Language
• SGML (ISO standard, 1986): mainly for technical documentation
• XML (W3C Recommendation, 1998): a simplification of SGML, with a wide range of uses

Why Markup?

    <book lang="Hungarian">
      <chapter>
        Introduction
        <section>Text</section>
        <section>Markup</section>
      </chapter>
      <chapter>
        More document markup
        <section>Reserved attributes</section>
        <section>Processing instructions</section>
      </chapter>
    </book>

Markup adds information about the structure of the data.

XML: Rules
• Header
• One root element
• Tag hierarchy
• Attributes
• Text elements
• Empty elements

    <?xml version="1.0" encoding="UTF-8"?>
    <presentation>
      <author>
        <firstname>Rostislav</firstname>
        <lastname>Titov</lastname>
      </author>
      <chapter number="1" title="What is XML">
        XML (Extensible Markup Language) is …
      </chapter>
      <conclusion/>
    </presentation>

Some rules
• Element names are case-sensitive
• Every opening tag must have a closing tag
• Tags must nest properly and cannot overlap (<a><b></a></b> is not allowed)
• Attribute values must be enclosed in double quotes or apostrophes
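The rules above can be checked mechanically: any XML parser rejects a document that breaks them. The following is a minimal sketch using the standard Java JAXP parser; the class name WellFormedCheck and the sample strings are illustrative assumptions, not part of the original slides.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative helper (hypothetical name): reports whether a string is well-formed XML.
class WellFormedCheck {

    /** Returns true if the text parses as well-formed XML, false otherwise. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // In this sketch any parse failure is treated as "not well-formed".
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b></b></a>"));  // properly nested tags
        System.out.println(isWellFormed("<a><b></a></b>"));  // overlapping tags
        System.out.println(isWellFormed("<Title></title>")); // names differ in case
        System.out.println(isWellFormed("<a x=1/>"));        // unquoted attribute value
        System.out.println(isWellFormed("<a/><b/>"));        // two root elements
    }
}
```

Only the first sample satisfies all the rules; each of the others violates exactly one of them.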

XML: Data Transfer
• Platform and language independent
• Easy to write, easy to process
• Understandable for humans and computers
• Open standard
  - Many libraries exist
  - Lots of literature available
  - Specialized XML editors
• The document structure can be checked

XML: Data Transfer (2)
Example: EDH Transport Request

[Diagram: an external program sends XML to EDH]

• Automatic form generation from external programs
• XML as the data transfer format
• Schema validation as a guarantee of data consistency

Web Services
• Data transfer between programs over the Internet
• Open standard
• Platform and language independent (Java, .NET, …)

[Diagram: a web service exchanges XML messages via SOAP and is described in XML via WSDL]

WSDL: Web Services Description Language
SOAP: Simple Object Access Protocol

XML: Data Storage
• The data structure is kept together with the data
• An object-style "add-on" to a relational DBMS
• Structure validation
• Supported by many modern RDBMS (Microsoft SQL Server 2005, Oracle 9i+)
  - XML data type
  - XML indexes
  - XML queries (XQuery etc.)
  - Data output in XML format

XML: Data Storage (2)
Example: EDH Search System

Problem: efficient search on an arbitrary number of criteria is difficult.

Our solution:
• All documents are stored in XML
• Context-specific XML search (Oracle interMedia)

Example: «Find documents created by Slava»:

    SELECT DOC_ID FROM DOC_XML
    WHERE CONTAINS(XML, 'Slava WITHIN creator') > 0;

XML: Data Transformations
• XML can be transformed into HTML, text, PDF, ...
  - No need for special program solutions
  - Commercial visual editors exist
  - Platform independent

XML-based Standards
• The structure can be defined formally
• Platform and language independent
• Understandable for humans and computers
• XML technologies can be used (XSLT transformations, XQuery queries, …)

Examples:
  - WSDL (Web Services Description Language)
  - SOAP (Simple Object Access Protocol)
  - XHTML (HTML that complies with XML rules)
  - SVG (Scalable Vector Graphics)
  - ebXML (XML for e-Business)
  - …

Formal Structure Definition
• There are ways to define XML structure formally
  - DTD (Document Type Definition): obsolete, not for new development
  - XML Schema

XML Schema: Possibilities
• Check the presence of elements and their order
• Sequences and choices
• Number of repetitions for elements and groups
• Attributes and their presence
• Types of elements and attributes
• Restrictions on elements and attributes
• Default values
• Unique constraints
• ...
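Several of these checks (element order, repetitions, required attributes, attribute types) can be tried out with the validation API that ships with Java. This is a minimal sketch: the schema below is a hypothetical one written for the <presentation> example, and the class and constant names are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Illustrative helper (hypothetical name): validates a document against a fixed schema.
class SchemaCheck {

    // Hypothetical schema: author, then one or more chapters, then an optional conclusion.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='presentation'><xs:complexType><xs:sequence>"
      + "<xs:element name='author'><xs:complexType><xs:sequence>"
      + "<xs:element name='firstname' type='xs:string'/>"
      + "<xs:element name='lastname' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "<xs:element name='chapter' maxOccurs='unbounded'>"
      + "<xs:complexType><xs:simpleContent><xs:extension base='xs:string'>"
      + "<xs:attribute name='number' type='xs:positiveInteger' use='required'/>"
      + "<xs:attribute name='title' type='xs:string' use='required'/>"
      + "</xs:extension></xs:simpleContent></xs:complexType></xs:element>"
      + "<xs:element name='conclusion' minOccurs='0'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    static final String GOOD =
        "<presentation><author><firstname>Rostislav</firstname>"
      + "<lastname>Titov</lastname></author>"
      + "<chapter number='1' title='What is XML'>XML is a markup language.</chapter>"
      + "<conclusion/></presentation>";

    // Same document, but the chapter number is not an integer.
    static final String BAD_NUMBER = GOOD.replace("number='1'", "number='one'");

    // The required <author> element is missing.
    static final String NO_AUTHOR =
        "<presentation><chapter number='1' title='t'>x</chapter></presentation>";

    /** Returns true if the document is valid against XSD. */
    static boolean isValid(String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("The schema itself is broken", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document violates the schema or is not well-formed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(GOOD));
        System.out.println(isValid(BAD_NUMBER));
        System.out.println(isValid(NO_AUTHOR));
    }
}
```

Keeping the schema separate from the checking code means the same isValid routine can serve both the authors of the XML and the programs that consume it.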

XML Schema: When Is It Needed?
• Formal structure definition for future reference
• Programmers may rely on data consistency
• Authors may check XML validity in advance

XML Schema: When Is It NOT Needed?
• When we know in advance that the XML is valid
• When we do not care about document validity
• When maximum processing speed is required
• Small "throw-away" projects

XPath: XML Navigation
• Access to XML elements, similar to a file-system path:
  C:\presentation\author\firstname corresponds to /presentation/author/firstname
• The result of an XPath expression can be:
  - XML node
  - Node set
  - Boolean
  - String
  - Number
  - Empty set

XPath: Examples

    <cern>
      <dg><person>R. Aymar</person></dg>
      <department name="PH">
        <dh><person>W-D. Schlatter</person></dh>
      </department>
      <department name="IT">
        <dh><person>W. von Rueden</person></dh>
        <group name="IT-AIS">
          <gl><person>R. Martens</person></gl>
        </group>
        <group name="IT-CO">
          <gl><person>D. Myers</person></gl>
        </group>
        <group name="IT-IS">
          <gl><person>A. Pace</person></gl>
        </group>
      </department>
    </cern>

• Find the DG's name: /cern/dg/person/text()
• Find all department names: /cern/department/@name
• Find all people: //person
• Find the name of the department head (DH) of IT: /cern/department[@name="IT"]/dh/person/text()
• Count the groups in the department where R. Martens works: count(//gl/person[starts-with(., 'R. Martens')]/../../../group)
  (from the person, three steps up lead through gl and group to the department)
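These expressions can be tried against the sample document with the XPath API bundled with Java (javax.xml.xpath). The following is a sketch only; the class CernXPath and its helper methods are hypothetical names. The group count climbs three levels from the person (gl, group, department) before selecting the department's groups.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Illustrative helper (hypothetical name): evaluates XPath against the CERN sample.
class CernXPath {

    static final String XML =
        "<cern>"
      + "<dg><person>R. Aymar</person></dg>"
      + "<department name='PH'><dh><person>W-D. Schlatter</person></dh></department>"
      + "<department name='IT'><dh><person>W. von Rueden</person></dh>"
      + "<group name='IT-AIS'><gl><person>R. Martens</person></gl></group>"
      + "<group name='IT-CO'><gl><person>D. Myers</person></gl></group>"
      + "<group name='IT-IS'><gl><person>A. Pace</person></gl></group>"
      + "</department>"
      + "</cern>";

    /** Evaluates the expression and returns its string value. */
    static String string(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    /** Evaluates a numeric expression such as count(...) and returns it as an int. */
    static int number(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Double value = (Double) xpath.evaluate(
                    expr, new InputSource(new StringReader(XML)), XPathConstants.NUMBER);
            return value.intValue();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        // DG's name: R. Aymar
        System.out.println(string("/cern/dg/person/text()"));
        // Department head of IT: W. von Rueden
        System.out.println(string("/cern/department[@name='IT']/dh/person/text()"));
        // All people in the document: 6
        System.out.println(number("count(//person)"));
        // Groups in the department where R. Martens works: 3
        System.out.println(number(
            "count(//gl/person[starts-with(., 'R. Martens')]/../../../group)"));
    }
}
```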

XPath: Examples (8)
Example: Event Handling System

[Diagram: events (XML) are checked against subscriptions (XPath); matching events produce notifications (XML)]

«I want to see all documents for more than 600 CHF»

    /document[amount > 600]
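Checking an event against a subscription amounts to evaluating the subscription's XPath as a boolean: the event matches if the expression selects at least one node. A minimal sketch with the standard Java XPath API follows; the class name EventFilter and the shape of the sample events (a <document> with an <amount> child holding the value in CHF) are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Illustrative helper (hypothetical name): matches XML events against XPath subscriptions.
class EventFilter {

    /** Returns true if the subscription's XPath selects at least one node in the event. */
    static boolean matches(String subscriptionXPath, String eventXml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Boolean) xpath.evaluate(
                    subscriptionXPath,
                    new InputSource(new StringReader(eventXml)),
                    XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath or event XML", e);
        }
    }

    public static void main(String[] args) {
        String subscription = "/document[amount > 600]";
        // Hypothetical events; the amount is assumed to be in CHF.
        String expensive = "<document><amount>750</amount></document>";
        String cheap = "<document><amount>500</amount></document>";

        System.out.println(matches(subscription, expensive)); // subscriber is notified
        System.out.println(matches(subscription, cheap));     // event is ignored
    }
}
```

Because subscriptions are plain strings, new filtering rules can be stored as data and added without changing the program.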

XPath: Program Use

XPath

    System.out.println(((XMLDocument) xml).selectSingleNode(
        "/config/report[@name='Slava']/title/text()").getNodeValue());

DOM model

    Element root = xml.getDocumentElement();
    Node child;
    // Find the <report name="Slava"> element among the root's children.
    for (child = root.getFirstChild(); child != null; child = child.getNextSibling())
        if (child.getNodeName().equals("report")
                && ((Element) child).getAttribute("name").equals("Slava"))
            break;
    // Walk that report's children looking for <title>.
    // (child is null at this point if no such report exists.)
    for (child = ((Element) child).getFirstChild(); child != null; child = child.getNextSibling()) {
        if (child.getNodeName().equals("title")) {
            // Print every text node inside <title>.
            for (Node child2 = child.getFirstChild(); child2 != null; child2 = child2.getNextSibling())
                if (child2 instanceof Text)
                    System.out.println(((Text) child2).getData().trim());
        }
    }

XQuery: XML Query Language
• XQuery is "SQL for XML"
  - Database independent
  - Easy to use
• Supported by popular RDBMS (Microsoft SQL Server 2005, Oracle 9i and 10g)
• Based on XPath, supports document sets

XSLT: XML Transformations
• Transforms XML to HTML, text or other XML
• XSLT 1.0 (current), XSLT 2.0 (draft)
• XSLT is a "human interface" to XML
• Supported by web browsers

XSLT: Simplified Structure
• An XSLT stylesheet is an XML file
• Makes active use of XPath expressions

An xsl:stylesheet contains xsl:template elements. Inside a template, literal output such as <html> <body> … </body> </html> is mixed with instructions:
  - xsl:template: apply a template to the given element
  - xsl:value-of: evaluate an XPath expression and print its value
  - xsl:apply-templates: apply templates to other elements

XSLT: Possibilities
• Conditions (<xsl:if>)
• Loops (<xsl:for-each>)
• Variables (<xsl:variable>)
• Sorting (<xsl:sort>)
• Numbering, e.g. 1., 1.1.a, 2. (<xsl:number>)
• Number formatting (format-number())
• Multiple-step processing (mode)
• String manipulation (via XPath)

XSLT 2.0 (draft)
• XPath 2.0
• Custom functions
• Regular expressions
• Date and time formatting
• Grouping

XSLT: Example

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes"/>

      <xsl:template match="presentation">
        <html>
          <body bgcolor="#FFCCFF">
            <h1><font color="darkblue"><xsl:value-of select="title"/></font></h1>
            <h4><font color="green"><i>Author: <xsl:value-of select="author"/></i></font></h4>
            <b>Table of Contents</b><br/>
            <xsl:apply-templates select="chapter" mode="contents"/>
            <br/>
            <xsl:apply-templates select="chapter" mode="normal"/>
          </body>
        </html>
      </xsl:template>

      <xsl:template match="chapter" mode="normal">
        <b>Chapter <xsl:value-of select="@number"/>. <xsl:value-of select="@title"/></b><br/>
        <i><xsl:value-of select="text()"/></i><br/>
      </xsl:template>

      <xsl:template match="chapter" mode="contents">
        <xsl:value-of select="@number"/>. <xsl:value-of select="@title"/><br/>
      </xsl:template>
    </xsl:stylesheet>
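A stylesheet like this can be applied from a program with the transformation API that ships with Java (javax.xml.transform). The sketch below uses a reduced, hypothetical variant of the stylesheet that keeps only the "contents" mode and emits plain text, so the result is easy to inspect; the class name, the sample input and the second chapter are assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Illustrative helper (hypothetical name): applies an XSLT stylesheet to an XML string.
class TocTransform {

    // Reduced variant of the stylesheet: text output, table of contents only.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='presentation'>"
      + "<xsl:apply-templates select='chapter' mode='contents'/>"
      + "</xsl:template>"
      + "<xsl:template match='chapter' mode='contents'>"
      + "<xsl:value-of select='@number'/>. <xsl:value-of select='@title'/>"
      + "<xsl:text>&#10;</xsl:text>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical input in the shape the stylesheet expects.
    static final String XML =
        "<presentation><title>Practical Use of XML</title>"
      + "<chapter number='1' title='What is XML'>XML is a markup language.</chapter>"
      + "<chapter number='2' title='XPath'>XPath navigates XML.</chapter>"
      + "</presentation>";

    /** Applies the stylesheet to the document and returns the output as a string. */
    static String transform(String xslt, String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Should print one table-of-contents line per chapter.
        System.out.print(transform(XSLT, XML));
    }
}
```

The same transform method would accept the full HTML stylesheet unchanged; only the stylesheet string decides what the output looks like.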

XSLT: Web "Skins"

    <aissearchscreen>
      <head><title>Person Search</title></head>
      <body>
        <input type="hidden" name="isAdvanced" value="false"/>
        <input show="always" type="text" label="Keyword" value="titov"/>
        <input type="checkbox" label="Fuzzy search" value="No"/>
        <result>
          <header>
            <tablecell>Full Name</tablecell>
            …
          </header>
          <row>
            <tablecell>Maksym TITOV</tablecell>
            <tablecell>71169</tablecell>
            <tablecell>40-3-C08</tablecell>
            …
          </row>
          <row>
            <tablecell>Oleg TITOV</tablecell>
            <tablecell>EXT</tablecell>
            …
          </row>
          …
          <rowcount>4</rowcount>
        </result>
      </body>
    </aissearchscreen>

XSLT: Web "Skins" (2)

[Figure: XSLT illustration, image not preserved]

XSLT: User Interfaces
CERN Stores Catalog
• Data loaded through XML
• Data stored in XML
• XSLT for data output
• 150,000 items
• 10,000+ users
• ~15-20 K of XML for each page
• Custom formatting (through XSLT redefinition)

XSLT: XML to Text
Example: automatic code generation

XML description:

    <document>
      <input type="person" name="A"/>
      <input type="number" name="B"/>
      …
    </document>

[Diagram: the XML description is transformed into the program interface, business logic, ..., SQL]

Did you know… that one EDH document is:
• At least 20 source files (code, HTML templates, resources, SQL, …)
• About 250 K of source code

XSLT: XML to XML
• Generate XML from another XML source
• "Configuration files update"
• XSL-FO

XSL-FO: Formatting Objects
• FO: an XML description of the document layout
• XSL-FO: an XSLT transformation of an XML document into an FO document
• FO processor: a program that converts the FO definition into a printable format (PDF, PS, ...)

Pipeline: XML document → XSL-FO transformation → FO document → FO processor → PDF document

XML document:

    <?xml version="1.0"?>
    <presentation>
      <title> XXX </title>
    </presentation>

FO document:

    <fo:root>
      <fo:page-sequence>
        <fo:flow> ... </fo:flow>
      </fo:page-sequence>
    </fo:root>

XSL-FO: Formatting Objects
FO has all the capabilities of modern text editors:
• Fonts
• Pagination
• Headers and footers
• Page numbering
• Odd/even page distinction
• Margins and spacing
• Keeping paragraphs together
• Widow and orphan line control
• Tables
• Graphics
• …

FO processor: Apache FOP

XSL-FO: Example
e-MAPS

[Diagram: XML → XSLT → web interface; the same XML → XSL-FO → FOP processor → printable version]

• No extra code required
• RTF to XSL-FO converters are good
• Can be written by a student
• Output format independent

XML Editors
• Specially designed for XML editing
• XML well-formedness and validity checks
• Visual editing of DTDs and schemas
• XML generation according to a DTD or schema
• Creation and debugging of XSLT and XSL-FO
• Visual XSLT editing

Example: Altova XML Spy (www.xmlspy.com)
  - Available from NICE
  - License can be obtained from the SDT service

[Screenshot: XMLSpy 2005]

XML: Program Handling
• DOM (Document Object Model)
  - Tree building
• SAX
  - Event handling
  - startElement()
  - endElement()
• SAX is much faster, DOM is more versatile

Java, C++:
  - Apache Xalan
  - Oracle XML Parser
  - ...
Perl, .NET:
  - Built-in support
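The event-driven style of SAX can be illustrated with the parser bundled with Java: instead of building a tree, the parser calls back into a handler as each tag is read. The following is a minimal sketch; the class name SaxTrace and the sample input are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative helper (hypothetical name): records SAX start/end events in order.
class SaxTrace {

    /** Parses the document and returns the sequence of element events. */
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                log.add("start " + qName); // called when an opening tag is read
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end " + qName);   // called when a closing tag is read
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the document", e);
        }
        return log;
    }

    public static void main(String[] args) {
        // Prints the events in document order, one per line.
        for (String event : events("<presentation><author/><conclusion/></presentation>")) {
            System.out.println(event);
        }
    }
}
```

Nothing is kept in memory beyond what the handler chooses to store, which is the main reason SAX suits very large documents, while DOM is more convenient when the program needs to move around the tree.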

New Technologies
• InfoPath 2003
  - Corporate system for electronic form handling
  - XML-based
  - Business rules defined by XML Schema
  - Data validation using XML schemas
• Adobe Intelligent Document Platform
  - Similar ideas

Conclusion

«XML is one of the biggest inventions in the IT area in the last few years. There are a lot of XML applications around the world today, and their number will grow every year»

W3C Consortium Web Site: http://www.w3c.org
Questions: Rostislav.Titov@cern.ch