Web Databases and XML
CSE 6331 © Leonidas Fegaras

Traditional DB Applications
• Typically business oriented
• Large amount of data
• Data is well-structured, normalized, with predefined schema
• Large number of concurrent users (transactions)
• Simple data, simple queries, and simple updates
• Typically update intensive
• Small transactions
• High performance, high availability, scalability
• Data integrity and security are of major importance
• Good administrative support, nice GUIs

Internet Applications
Challenges:
• Use heterogeneous, complex, hierarchical, fast-evolving, unstructured/semistructured data
• Access mostly read-only data
• Need 100% availability
• Manage millions of users world-wide
• Have high-performance requirements
• Are concerned with security (encryption)
• Like to customize data in a personalized manner
• Expect to gain the user's trust for business-to-consumer transactions
Internet users choose speed and availability over correctness.

Electronic Commerce
• Currently, mostly business-to-business (B2B) rather than business-to-consumer (B2C) interactions
• Focus on selling and buying:
  – Order management
  – Product catalogs
  – Product configuration
• Sales and marketing
• Education and training
• Service
• Communities

Other Web Applications
• Web integration
  – Heterogeneous data sources and types
  – Thousands of web-accessible data sources
  – Dynamic data
  – Data warehouses
• Web publishing
  – Access different types of content from browsers (e.g., email, PDF, HTML, XML)
  – Structured, dynamic, customized/personalized content
  – Integration with application
  – Accessible via major gateways and search engines
• Application integration
  – Transformation between different application data formats (e.g., XML, HTML)
  – Integration of multiple applications

Current Internet Application Architectures
Architecture:
• Server-Tier: relational databases and gateways to diverse data sources, such as files and OLE/DB. Uses enterprise servers
• Middle-Tier: provides data integration and distribution, querying, etc. Consists of a web server and an application server
• Client-Tier: mostly a web browser; may use CGI scripts or Java
Characteristics:
• Customization is achieved at the server site (customer data in a database) with some data at the client site (cookies)
• Load balancing is typically hardware based (multiple servers, DNS routers)

XML
XML (eXtensible Markup Language) is a textual language for representing and exchanging data on the web. It is designed to improve the functionality of the Web by providing more flexible and adaptable information identification.
• It is based on SGML and was developed around 1996.
• It is called extensible because it is not a fixed format like HTML (a single, predefined markup language). Instead, XML is a metalanguage (a language for describing other languages), which lets you design your own customized markup languages for an unlimited variety of document types.
• XML can be untyped (semistructured), but there are now standards for schema conformance (DTD and XML Schema).
• Without a schema, an XML document is well-formed if it satisfies simple syntactic constraints: proper nesting of start and end tags.

XML Syntax
• XML documents conform to the following grammar:
    XMLdocument ::= Pi* Element Pi*
    Element     ::= Stag (char | Pi | Element)* Etag
    Stag        ::= '<' Name Atts '>'
    Etag        ::= '</' Name '>'
    Pi          ::= '<?' char* '?>'
    Atts        ::= ( Name '=' String )*
    String      ::= '"' char* '"'
• XML consists of tags and text.
• Tags come in pairs, such as <date>8/25/2001</date>, and must be properly nested:
    <person> <name> ... </name> </person>   --- valid nesting
    <person> <name> ... </person> ... </name>   --- invalid nesting
• Text is bounded by tags. PCDATA: parsed character data. For example,
    <title> The Big Sleep </title>
    <year> 1935 </year>

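The nesting rule above can be tried out with a non-validating parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed is hypothetical and not part of the slides. It only tests well-formedness, not conformance to any schema.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if text parses as a single well-formed XML element."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for mismatched or unclosed tags, among other syntax errors.
        return False

# The two nesting examples from the slide, with "..." as placeholder text.
valid = "<person><name>...</name></person>"
invalid = "<person><name>...</person>...</name>"

print(is_well_formed(valid))
print(is_well_formed(invalid))
```

The parser rejects the second string because the end tag </person> arrives while <name> is still open.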
Representing Data Using XML
• Nested tags can be used to express various structures, such as a record:
    <person>
      <name> Ramez Elmasri </name>
      <tel> (817) 272-2348 </tel>
      <email> elmasri@cse.uta.edu </email>
    </person>
• We can represent a list by using the same tag repeatedly:
    <addresses>
      <person> ... </person>
      <person> ... </person>
      ...
    </addresses>
• An opening tag may contain attributes. These are typically used to describe the content of an element:
    <author id="2787901">Philip A. Bernstein</author>

XML structure
XML:
    <person>
      <name> Ramez Elmasri </name>
      <tel> (817) 272-2348 </tel>
      <email> elmasri@cse.uta.edu </email>
    </person>
is Lisp-like:
    (person (name "Ramez Elmasri")
            (tel "(817) 272-2348")
            (email "elmasri@cse.uta.edu"))
and tree-like:
    person
      name:  Ramez Elmasri
      tel:   (817) 272-2348
      email: elmasri@cse.uta.edu

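The correspondence between the XML form and the Lisp-like form can be sketched as a short recursive traversal of the element tree. This is only an illustration, assuming Python's xml.etree.ElementTree; the function name to_sexp is hypothetical, and the sketch deliberately ignores attributes and any text that follows a child element.

```python
import xml.etree.ElementTree as ET

def to_sexp(elem) -> str:
    """Render an element as (tag "text" child ...), ignoring attributes."""
    parts = [elem.tag]
    text = (elem.text or "").strip()
    if text:
        parts.append('"' + text + '"')
    for child in elem:
        # Each child element becomes a nested parenthesized expression.
        parts.append(to_sexp(child))
    return "(" + " ".join(parts) + ")"

person = ET.fromstring(
    "<person>"
    "<name> Ramez Elmasri </name>"
    "<tel> (817) 272-2348 </tel>"
    "<email> elmasri@cse.uta.edu </email>"
    "</person>"
)
print(to_sexp(person))
```

The recursion mirrors the tree view: every element is a node, and its text and child elements are its children.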
Complete Example
    <?xml version="1.0"?>
    <!DOCTYPE bib SYSTEM "bib.dtd">
    <bib>
      <vendor id="id0_1">
        <name>Amazon</name>
        <email>webmaster@amazon.com</email>
        <phone>1-800-555-9999</phone>
        <book>
          <title>Unix Network Programming</title>
          <publisher>Addison Wesley</publisher>
          <year>1995</year>
          <author>
            <firstname>Richard</firstname>
            <lastname>Stevens</lastname>
          </author>
          <price>38.68</price>
        </book>
        <book>
          <title>An Introduction to Object-Oriented Design</title>
          <publisher>Addison Wesley</publisher>
          <year>1996</year>
          <author>
            <firstname>Jo</firstname>
            <lastname>Levin</lastname>
          </author>
          <author>
            <firstname>Harold</firstname>
            <lastname>Perry</lastname>
          </author>
          <price>11.55</price>
        </book>
        …
      </vendor>
    </bib>

DTD: Document Type Definition
    <?xml encoding="ISO-8859-1"?>
    <!ELEMENT bib (vendor)*>
    <!ELEMENT vendor (name, email, book*)>
    <!ATTLIST vendor id ID #REQUIRED>
    <!ELEMENT book (title, publisher?, year?, author+, price)>
    <!ELEMENT author (firstname?, lastname)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT email (#PCDATA)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT publisher (#PCDATA)>
    <!ELEMENT year (#PCDATA)>
    <!ELEMENT firstname (#PCDATA)>
    <!ELEMENT lastname (#PCDATA)>
    <!ELEMENT price (#PCDATA)>

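A DTD content model is essentially a regular expression over child element names. The sketch below illustrates that idea for the book declaration only. It is not a DTD validator: ElementTree does not validate against a DTD, the regular expression BOOK_MODEL is a hand translation made for this example, and the helper children_match is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation (an assumption of this sketch) of the content model
#   book (title, publisher?, year?, author+, price)
# into a regex over child tag names, each followed by one space.
BOOK_MODEL = re.compile(r"title (publisher )?(year )?(author )+price ")

def children_match(elem, model) -> bool:
    """Check the sequence of child tag names against a content-model regex."""
    names = "".join(child.tag + " " for child in elem)
    return model.fullmatch(names) is not None

good = ET.fromstring(
    "<book><title>Unix Network Programming</title><year>1995</year>"
    "<author><lastname>Stevens</lastname></author><price>38.68</price></book>"
)
# Missing the required author+ part of the model.
bad = ET.fromstring("<book><title>Untitled</title><price>1.00</price></book>")

print(children_match(good, BOOK_MODEL))
print(children_match(bad, BOOK_MODEL))
```

The operators carry over directly: "?" marks an optional child, "+" one or more, and the comma in the DTD becomes plain sequencing in the regex.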
Referencing Elements Using IDs/IDrefs
    <family>
      <person id="jane" mother="mary" father="john">
        <name> Jane Doe </name>
      </person>
      <person id="john" children="jane jack">
        <name> John Doe </name>
        <mother/>
      </person>
      <person id="mary" children="jane jack">
        <name> Mary Doe </name>
      </person>
      <person id="jack" mother="mary" father="john">
        <name> Jack Doe </name>
      </person>
    </family>

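An application typically follows such references by first indexing the elements by their id attribute. The following is a minimal sketch, assuming Python's xml.etree.ElementTree. Because no DTD is loaded, the parser treats id, mother, father, and children as plain strings; the helpers name_of and deref are hypothetical names introduced for this example.

```python
import xml.etree.ElementTree as ET

FAMILY = """
<family>
  <person id="jane" mother="mary" father="john"><name> Jane Doe </name></person>
  <person id="john" children="jane jack"><name> John Doe </name><mother/></person>
  <person id="mary" children="jane jack"><name> Mary Doe </name></person>
  <person id="jack" mother="mary" father="john"><name> Jack Doe </name></person>
</family>
"""

root = ET.fromstring(FAMILY)
# Index every person by its id so references can be followed in one lookup.
by_id = {p.get("id"): p for p in root.iter("person")}

def name_of(person) -> str:
    return person.findtext("name").strip()

def deref(person, attr):
    """Follow an IDREF or IDREFS attribute; returns the referenced elements."""
    return [by_id[ref] for ref in person.get(attr, "").split()]

print([name_of(p) for p in deref(by_id["jane"], "mother")])
print([name_of(p) for p in deref(by_id["john"], "children")])
```

An IDREFS value such as "jane jack" is a whitespace-separated list, which is why deref splits the attribute before looking up each reference.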
OODB Schema
    class Movie ( extent Movies, key title )
    {
      attribute string title;
      attribute string director;
      relationship set<Actor> casts inverse Actor::acted_In;
      attribute int budget;
    };

    class Actor ( extent Actors, key name )
    {
      attribute string name;
      relationship set<Movie> acted_In inverse Movie::casts;
      attribute int age;
      attribute set<string> directed;
    };

In XML …
    <db>
      <movie id="m1">
        <title>Waking Ned Divine</title>
        <director>Kirk Jones III</director>
        <cast idrefs="a1 a3"></cast>
        <budget>100,000</budget>
      </movie>
      <movie id="m2">
        <title>Dragonheart</title>
        <director>Rob Cohen</director>
        <cast idrefs="a2 a9 a21"></cast>
        <budget>110,000</budget>
      </movie>
      <movie id="m3">
        <title>Moondance</title>
        <director>Dagmar Hirtz</director>
        <cast idrefs="a1 a8"></cast>
        <budget>90,000</budget>
      </movie>
      <actor id="a1">
        <name>David Kelly</name>
        <acted_In idrefs="m1 m3 m78"></acted_In>
      </actor>
      <actor id="a2">
        <name>Sean Connery</name>
        <acted_In idrefs="m2 m9 m11"></acted_In>
        <age>68</age>
      </actor>
      <actor id="a3">
        <name>Ian Bannen</name>
        <acted_In idrefs="m1 m35"></acted_In>
      </actor>
      :
    </db>

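XML itself does not enforce the inverse declarations of the OODB schema (casts versus acted_In), so an application has to check them. The sketch below shows one way this could be done, assuming Python's xml.etree.ElementTree and a shortened copy of the data above. The function inverse_violations is hypothetical, and it only checks pairs whose movie and actor are both present, since the slide elides the rest of the database.

```python
import xml.etree.ElementTree as ET

DB = """
<db>
  <movie id="m1"><title>Waking Ned Divine</title><cast idrefs="a1 a3"/></movie>
  <movie id="m3"><title>Moondance</title><cast idrefs="a1 a8"/></movie>
  <actor id="a1"><name>David Kelly</name><acted_In idrefs="m1 m3 m78"/></actor>
  <actor id="a3"><name>Ian Bannen</name><acted_In idrefs="m1 m35"/></actor>
</db>
"""

root = ET.fromstring(DB)
# Map each movie id to its cast, and each actor id to the movies acted in.
cast = {m.get("id"): set(m.find("cast").get("idrefs").split())
        for m in root.iter("movie")}
acted = {a.get("id"): set(a.find("acted_In").get("idrefs").split())
         for a in root.iter("actor")}

def inverse_violations(cast, acted):
    """Return (movie, actor) pairs recorded on one side but not the other."""
    bad = []
    for m, actors in cast.items():
        for a in actors:
            if a in acted and m not in acted[a]:
                bad.append((m, a))
    for a, movies in acted.items():
        for m in movies:
            if m in cast and a not in cast[m]:
                bad.append((m, a))
    return sorted(bad)

print(inverse_violations(cast, acted))
```

References to elements outside the fragment (such as m78 or a8) are skipped rather than reported, which is a design choice of this sketch and not something the slides prescribe.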