Integrating a Heterogeneous Environment using XML

Sandeepan Banerjee
Director, Oracle Server Technologies

The Problem
• Multiple domain-specific applications
  – Manufacturing, Inventory, Supply Chain, Financial, …
• Information is trapped within these applications
[Diagram: Synthesized Information drawn from a Financial Application, a CRM Application, Excel Files on Disk, E-Mail and Document Repositories, and a Contract Management Application]
How does an organization get a consolidated view of its information, in real time?

Technical Challenges
• Domain-specific information
  – Replication does not make sense
• Independent operation of applications
  – Access to information has to be in real time
• Different access method for each application
  – Each application has its own protocol and access method
What architecture can best accommodate my present and future needs?
  – Complexity: Avoid the n by m matrix
  – Flexibility: Add new sources easily
  – Time to market: Within days, not months
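The "n by m matrix" can be made concrete with a little arithmetic. The sketch below assumes the usual reading of that phrase: with point-to-point integration, each of n consuming applications needs its own adapter to each of m sources, whereas a shared XML layer needs only one connector per participant. The function names and the sample counts are illustrative and do not come from the slides.

```python
def point_to_point_adapters(consumers: int, sources: int) -> int:
    """Every consumer talks to every source directly: n * m adapters."""
    return consumers * sources


def shared_layer_adapters(consumers: int, sources: int) -> int:
    """Each participant connects once to a common XML layer: n + m connectors."""
    return consumers + sources


if __name__ == "__main__":
    n, m = 5, 8  # illustrative counts only
    print(point_to_point_adapters(n, m), shared_layer_adapters(n, m))
```

Under this reading, adding one more source costs n new adapters in the point-to-point case but only one in the shared-layer case, which is what "add new sources easily" relies on.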

What is XML-based Data Integration?
XML-based Data Integration, or Enterprise Information Integration (EII):
• Create aggregated views using XQuery
• Get information from diverse sources in XML
• Consume synthesized information

XML Data Integration Example
[Diagram: an XQuery Engine produces Synthesized Information from three sources]
• Order Tracking EIS, accessed through J2EE™ CA
• Parts Inventory Database, accessed through JDBC
• Shipment Tracking Web Service, accessed through HTTP
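To make the idea of a synthesized view concrete, here is a minimal sketch in Python's standard library rather than XQuery. It assumes each of the three sources has already been exposed as an XML document; the element names, attribute names, and sample values are hypothetical and are not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML payloads standing in for the three sources in the diagram.
ORDERS_XML = '<orders><order id="1001" part="P-7" qty="3"/></orders>'
INVENTORY_XML = '<inventory><part id="P-7" onhand="12"/></inventory>'
SHIPMENTS_XML = '<shipments><shipment order="1001" status="in transit"/></shipments>'


def synthesize(orders_xml: str, inventory_xml: str, shipments_xml: str) -> ET.Element:
    """Join the three documents on part id and order id into one XML view."""
    orders = ET.fromstring(orders_xml)
    stock = {p.get("id"): p.get("onhand")
             for p in ET.fromstring(inventory_xml).iter("part")}
    status = {s.get("order"): s.get("status")
              for s in ET.fromstring(shipments_xml).iter("shipment")}

    view = ET.Element("order_status")
    for order in orders.iter("order"):
        row = ET.SubElement(view, "order", id=order.get("id"))
        ET.SubElement(row, "part").text = order.get("part")
        # Fall back to a placeholder when a source has no matching record.
        ET.SubElement(row, "onhand").text = stock.get(order.get("part"), "unknown")
        ET.SubElement(row, "shipment").text = status.get(order.get("id"), "not shipped")
    return view


if __name__ == "__main__":
    print(ET.tostring(synthesize(ORDERS_XML, INVENTORY_XML, SHIPMENTS_XML), encoding="unicode"))
```

An XQuery engine would express the same join declaratively; the sketch only shows the shape of the result a consumer would receive.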

Technologies Involved
• Why XML?
  – Different data formats
• Why XQuery?
  – Declarative way to query XML documents
• Why J2EE™?
  – Standards-based infrastructure platform
• Why XML Database?
  – Native XML storage
  – XML data management
  – Performance optimizations
[Diagram: XQuery Engine, XML Database, and J2EE™ Platform together support XML-based Data Integration]

Comparison with Existing Technologies
• Application Integration
  – Difference: More about data pumping and synchronization among systems
  – Similarity: Involves data adapters, data translation and transformation
• Data Warehousing
  – Difference: Explicit ETL steps required; large data volume; batch loading, not real-time
  – Similarity: Optional cache pre-population step similar to the “loading” step
• Traditional Report Generation
  – Difference: SQL & relation-based vs. XQuery & XML-based; cannot handle non-relational sources
  – Similarity: Query-based

XQuery Status
• XQuery is emerging as the consensus ‘native’ query language for XML
• Expected W3C Recommendation in late 2005
• Oracle 10gR2 is the first mainstream commercial database release to support XQuery
• Plan to release under an event in 10gR2

XQuery Example
Assume a document, emp.xml:
  <empset>
    <emp empno="21" ename="SCOTT" salary="120000"/>
    <emp empno="22" ename="JONES" salary="344000"/>
  </empset>
To get the names of employees with salary > 200000:
  for $i in doc('emp.xml')/empset/emp
  let $j := 200000
  where $i/@salary > $j
  return $i/@ename
Result (attribute node): JONES
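For readers without an XQuery engine at hand, the same selection can be sketched in Python's standard library. This is an imperative stand-in rather than XQuery: it iterates over the emp elements, filters on the salary attribute, and returns the ename values. The function name is an illustrative choice.

```python
import xml.etree.ElementTree as ET
from typing import List

EMP_XML = """
<empset>
  <emp empno="21" ename="SCOTT" salary="120000"/>
  <emp empno="22" ename="JONES" salary="344000"/>
</empset>
"""


def names_with_salary_above(xml_text: str, threshold: int) -> List[str]:
    """Return ename for each emp whose salary attribute exceeds threshold."""
    root = ET.fromstring(xml_text)  # root is the <empset> element
    return [
        emp.get("ename")
        for emp in root.findall("emp")          # iterate over each emp
        if int(emp.get("salary")) > threshold   # keep salaries above the limit
    ]


if __name__ == "__main__":
    print(names_with_salary_above(EMP_XML, 200000))
```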

Another Example
  Customer/Address/Zip
  ShipTo/Address/Zip
  Supplier/Location/AddressForTaxCalculation/Zip
  Customer/Address/Work/City/Zip
  Customer/Address/Home/City/Zip
  Shipper/DropOffLocation/US/California/SFO/Zip
  + any other zip nested to any unanticipated level
  count(for $i in doc("contacts.xml")/Contact
        where $i//Zip = 94065
        return $i)
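The point of this example is that a descendant search finds the zip code wherever it is nested. A rough Python equivalent is sketched below. The sample document is hypothetical: it wraps several Contact elements in a Contacts root so that there is something to count, and the element name Zip is taken from the paths listed on the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical contacts document; element names follow the paths on the slide.
CONTACTS_XML = """
<Contacts>
  <Contact><Customer><Address><Zip>94065</Zip></Address></Customer></Contact>
  <Contact><Customer><Address><Work><City><Zip>94065</Zip></City></Work></Address></Customer></Contact>
  <Contact><ShipTo><Address><Zip>10001</Zip></Address></ShipTo></Contact>
</Contacts>
"""


def count_contacts_with_zip(xml_text: str, zip_code: str) -> int:
    """Count Contact elements that contain a matching Zip at any depth."""
    root = ET.fromstring(xml_text)
    return sum(
        1
        for contact in root.iter("Contact")
        # iter("Zip") walks all descendants, so nesting depth does not matter.
        if any(z.text and z.text.strip() == zip_code for z in contact.iter("Zip"))
    )


if __name__ == "__main__":
    print(count_contacts_with_zip(CONTACTS_XML, "94065"))
```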

Example III: Auction/Bids
• Say we have 3 heterogeneous data sources that yield their contents as XML
  – Item Description comes from a Filesystem
  – User Information comes from LDAP/DB
  – Bid Information comes from an App Server
  (: For all bicycles, list the item number, description, highest bid (if any), ordered by item no. :)
  for $i in doc("items.xml")//item_tuple
  let $b := doc("bids.xml")//bid_tuple[itemno = $i/itemno]
  where contains($i/description, "Bicycle")
  order by $i/itemno
  return
    <item_tuple>
      { $i/itemno }
      { $i/description }
      <high_bid>{ max($b/bid) }</high_bid>
    </item_tuple>
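The query joins two documents, filters, sorts, and constructs new elements. The Python sketch below walks through the same steps imperatively. The sample items and bids are made up for illustration, and two choices are assumptions: item numbers are sorted numerically, and bids are parsed as floats.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; only the element names come from the query.
ITEMS_XML = """
<items>
  <item_tuple><itemno>1003</itemno><description>Old Bicycle</description></item_tuple>
  <item_tuple><itemno>1001</itemno><description>Red Bicycle</description></item_tuple>
  <item_tuple><itemno>1002</itemno><description>Motorcycle</description></item_tuple>
</items>
"""
BIDS_XML = """
<bids>
  <bid_tuple><itemno>1001</itemno><bid>35</bid></bid_tuple>
  <bid_tuple><itemno>1001</itemno><bid>55</bid></bid_tuple>
  <bid_tuple><itemno>1002</itemno><bid>400</bid></bid_tuple>
</bids>
"""


def bicycle_high_bids(items_xml: str, bids_xml: str) -> ET.Element:
    """Build one item_tuple per bicycle, with its highest bid if any."""
    items = ET.fromstring(items_xml)
    bids = ET.fromstring(bids_xml)
    result = ET.Element("result")

    # Filter: keep items whose description contains "Bicycle" (case-sensitive).
    bicycles = [i for i in items.iter("item_tuple")
                if "Bicycle" in (i.findtext("description") or "")]

    # Sort by item number, then join each item to its bids.
    for item in sorted(bicycles, key=lambda i: int(i.findtext("itemno"))):
        itemno = item.findtext("itemno")
        amounts = [float(b.findtext("bid")) for b in bids.iter("bid_tuple")
                   if b.findtext("itemno") == itemno]
        row = ET.SubElement(result, "item_tuple")
        ET.SubElement(row, "itemno").text = itemno
        ET.SubElement(row, "description").text = item.findtext("description")
        high = ET.SubElement(row, "high_bid")
        if amounts:  # leave high_bid empty when there are no bids
            high.text = str(max(amounts))
    return result


if __name__ == "__main__":
    print(ET.tostring(bicycle_high_bids(ITEMS_XML, BIDS_XML), encoding="unicode"))
```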

XQuery and SQL
• Existing relational applications will continue to use SQL, and Oracle will remain the industry’s best implementation of SQL
• New applications based on XML will use XQuery, and Oracle will be the industry’s best implementation of XQuery
• Oracle will support XQuery both in the database and in the mid-tier
  – Use the mid-tier engine when you want to query non-database sources
• Intelligent ‘query pushdown’ from mid-tier to db when possible
• Can mix-and-match XQuery and SQL in same query

XQuery is different from SQL
• Navigation-oriented (using XPath expressions)
• Different type system (XML Schema based simple types)
• Identity-based (XML node identities and document order)
• Namespace-aware name resolution (functions, variables, element creation)
• XML-item based vs. row-based
• Results are heterogeneous sequences
• Does not have all SQL extensions (e.g. data warehousing)

XQuery Mid-tier Architecture
[Diagram labels: Other Data sources; XQueryX; XPath; XQJ API; Driver; XQuery Java Engine; DB Drivers; JDBC Driver; Java XMLType; SQL + XQuery or XQueryX]

XQuery DB Architecture
[Diagram labels: XQJ API; XQuery Aware SQL Engine; Mid-Tier XQuery Engine; User; SQL Compiler; XQuery Compiler; XMLQuery, XMLTable; XQuery Type check; Normalization; SQLX/XPath; XQuery Parser; Optimizer, Execution Engine]

DEMONSTRATION: XML Query

Oracle XML Database (XML DB)
• Native XML storage
  – Available since Oracle Database Release 9.2
• Inherits RDBMS features: Security, Transaction, …
• XML-specific features
  – XML indexing, XPath & XSLT support, XML schema validation, XML partial update
• Supports SQL/XML
  – Allows blending relational and XML data operations

Leveraging Oracle XML DB
• XML DB can be an XQuery source
  – Can define XML views of relational data
  – XQuery engine can rewrite query into SQL/XML
• XML DB could also be used for caching
  – Efficient storage & indexing for large data sets
• Can leverage security framework of XML DB
  – For both source and cache

Datasources
• Databases
  – Relational + XML Views, Object-Relational, CLOB, Compact XML (future)
• Mid-tier sources
  – files, cache, JCA datasources
• Bind (an existing DOM)
• xmldatasrc
  – Oracle language addition
• Datasource API
  – initialize
  – describe
  – execute
  – fetch
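The slide names four Datasource API operations without giving signatures. The sketch below shows one plausible shape for such an interface, written in Python for brevity. Every signature, class name, and behavior here is an assumption made for illustration, not Oracle's actual API.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod
from typing import List, Optional


class DataSource(ABC):
    """Hypothetical interface named after the four operations on the slide."""

    @abstractmethod
    def initialize(self, config: dict) -> None:
        """Connect to, or load, the underlying source."""

    @abstractmethod
    def describe(self) -> List[str]:
        """Report what the source can return."""

    @abstractmethod
    def execute(self, request: str) -> None:
        """Start a request against the source."""

    @abstractmethod
    def fetch(self) -> Optional[ET.Element]:
        """Return the next XML item, or None when exhausted."""


class InMemoryXmlSource(DataSource):
    """Toy adapter that serves elements from an XML string."""

    def initialize(self, config: dict) -> None:
        self._root = ET.fromstring(config["xml"])
        self._cursor = iter(())  # nothing to fetch until execute() is called

    def describe(self) -> List[str]:
        return sorted({child.tag for child in self._root})

    def execute(self, request: str) -> None:
        # Here a request is just an element name; a real adapter would take a query.
        self._cursor = self._root.iter(request)

    def fetch(self) -> Optional[ET.Element]:
        return next(self._cursor, None)
```

A caller would initialize the source once, then alternate execute and repeated fetch calls until fetch returns None, which lets an engine pull results from any adapter item by item.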

iAS XDS Architecture
[Architecture diagram. Recoverable labels: applications using XDS (e.g. Portals, Reports); XDS client APIs (JSP Tags, EJB, Web Service); Query Builder; XML Data Synthesis Tool; J2EE Security Framework; XDS Security; XDS Repository (meta-data, stored queries, XMLDataSource modules); XQuery Subsystem (XQuery Engine, XQ4J/JXQI, Caching Service); XDS Cache and XQuery Result cache (in-memory, JCache, file system); cached XML data sources; Oracle Enterprise Manager; XML data source adaptors (RDBMS, Web Services, Files, HTTP, Web Cache, CCI-XML, Java Functions, J2CA, EAI, JMS); sources including Oracle Apps, SAP, and XML DB]

Example – XDS usage
• User registers webservice as datasource
• XDS creates an XQuery module automatically
• User Query for querying webservice
  import module namespace wss="datasrc/stockws";
  for $i in wss:getcompanies()
  return wss:get_stock_price($i/name)
• wss: namespace prefix for the loaded module
• JNDI lookup to get datasource implementation
• XDS adapters implement datasource
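The register, look up, and call flow above can be sketched with a small in-process registry. This is only an analogy in Python: a dictionary plays the role of the JNDI lookup, lambdas stand in for the web service operations, and the company names and prices are invented. Only the namespace string and the two function names come from the slide.

```python
from typing import Callable, Dict, List

# Hypothetical registry playing the role of the JNDI lookup on the slide.
_REGISTRY: Dict[str, Dict[str, Callable]] = {}


def register_datasource(namespace: str, functions: Dict[str, Callable]) -> None:
    """Expose a source's operations as a module under the given namespace."""
    _REGISTRY[namespace] = dict(functions)


def load_module(namespace: str) -> Dict[str, Callable]:
    """Resolve a namespace to its registered module, like 'import module'."""
    try:
        return _REGISTRY[namespace]
    except KeyError:
        raise LookupError(f"no datasource registered for {namespace!r}") from None


# Stand-in for a stock web service; the data is made up for illustration.
_PRICES = {"ACME": 12.5, "GLOBEX": 40.0}
register_datasource("datasrc/stockws", {
    "getcompanies": lambda: sorted(_PRICES),
    "get_stock_price": lambda name: _PRICES[name],
})


def all_stock_prices() -> List[float]:
    """Mirror the query: for each company, return its stock price."""
    wss = load_module("datasrc/stockws")
    return [wss["get_stock_price"](name) for name in wss["getcompanies"]()]


if __name__ == "__main__":
    print(all_stock_prices())
```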

All Your Searches!
• Search with any language: SQL/XQuery/XSL
• Search anywhere: mid tier or backend
• Search anything: any XML/relational content
• Search everything
  – XML visualization of all data (backend)
  – XML based adapters provide XML content for all data in midtier (XDS)
• Search any form: text based/structured
• Search any size: Scalable solution
• Search any time: Unbreakable solution

More Information
• XML in general
  – http://www.oracle.com/technology/tech/xml/index.html
• XML Query
  – http://www.oracle.com/technology/tech/xml/xquery/index.html
• XML DB
  – http://www.oracle.com/technology/tech/xmldb/index.html

Q&A
