Processing XML data using a relational database Schema

Processing XML data using a relational database: Schema-Based XML Storage
By Khang Nguyen
Based on the paper of Rajasekar Krishnamurthy

Three main points on the query translation problem
- Developing query translation algorithms for the case when the XML schema and/or the XML query may be recursive.
- Designing algorithms that make better use of the XML-to-Relational mapping information during the query translation process.
- Studying the interaction between the two sub problems: choosing a good relational decomposition for storing the XML data and choosing a query translation algorithm.

Recursive Schemas and Recursive Queries
- There has been a lot of work on alternative relational decompositions for XML data, but not much on query translation algorithms.
- [Choi 02]: out of 60 XML schemas analyzed, 35 were recursive, so recursive XML schemas are important.
- The descendant operator (//) specifies ancestor-descendant relationships; i.e., the query //section/title is a recursive query.
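To make the notion of a recursive query concrete, here is a minimal sketch that evaluates a path such as /book//section/title with a SQL recursive common table expression, run through Python's sqlite3 module. The Book and Section tables, the sample rows, and the convention that parentcode = 1 means "parent is a book" and parentcode = 2 means "parent is a section" are assumptions made for illustration; the slides do not fix a particular mapping for the recursive case.

```python
import sqlite3

# Hypothetical shredding of a recursive schema: a section may contain sections.
# Assumed convention: parentcode = 1 -> parent is a book, 2 -> parent is a section.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Book (id INTEGER PRIMARY KEY);
CREATE TABLE Section (
    id INTEGER PRIMARY KEY,
    parentid INTEGER,
    parentcode INTEGER,
    title TEXT
);
INSERT INTO Book VALUES (1);
INSERT INTO Section VALUES
    (10, 1, 1, 'Introduction'),
    (11, 10, 2, 'Motivation'),
    (12, 11, 2, 'Running example'),
    (13, 1, 1, 'Conclusion');
""")

# Base case: sections directly under a book.
# Recursive case: sections whose parent is an already reached section.
RECURSIVE_TITLES = """
WITH RECURSIVE Reachable(id, title) AS (
    SELECT S.id, S.title
    FROM Book B, Section S
    WHERE B.id = S.parentid AND S.parentcode = 1
    UNION ALL
    SELECT C.id, C.title
    FROM Reachable R, Section C
    WHERE C.parentid = R.id AND C.parentcode = 2
)
SELECT title FROM Reachable ORDER BY id
"""

titles = [row[0] for row in conn.execute(RECURSIVE_TITLES)]
print(titles)
```

The nesting depth of sections is not known in advance, so a fixed number of joins cannot cover every level; the recursive step keeps joining until no new sections are reached.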

Recursive Schemas and Recursive Queries (Cont.): Interesting Issues
- How do we translate path expression queries over arbitrary XML-to-Relational mappings into equivalent SQL queries?
- Is the support for recursion in SQL3 sufficient for supporting path expression queries over arbitrary XML-to-Relational mappings?
- Are there any issues in the translation process when the XML schema is non-recursive?
- Does XPath semantics introduce any interesting challenges?

Mapping-aware Query Translation Algorithm

Mapping-aware Query Translation Algorithm (Cont.)
- Query: retrieve all the top-level section titles.
- XQuery:
    for $title in document(*)/book/section/title
- SQL query:
    Select S.title
    From Book B, Section S
    Where B.id = S.parentid and S.parentcode = 1
- Mapping-aware algorithm query:
    Select title
    From Section
    Where parentcode = 1
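The two SQL queries above can be compared on a small example. The sketch below loads a few hypothetical rows into SQLite and runs both forms. It assumes, as the mapping implies, that every Section row with parentcode = 1 has a matching Book row; the table contents and the variable names are illustrative and not taken from the paper.

```python
import sqlite3

# Hypothetical data: two books, two top-level sections, one nested section.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Book (id INTEGER PRIMARY KEY);
CREATE TABLE Section (
    id INTEGER PRIMARY KEY,
    parentid INTEGER,
    parentcode INTEGER,
    title TEXT
);
INSERT INTO Book VALUES (1), (2);
INSERT INTO Section VALUES
    (10, 1, 1, 'Introduction'),
    (11, 10, 2, 'Motivation'),
    (12, 2, 1, 'Overview');
""")

# Join-based translation: follows the path book/section step by step.
JOIN_SQL = """
SELECT S.title
FROM Book B, Section S
WHERE B.id = S.parentid AND S.parentcode = 1
"""

# Mapping-aware translation: parentcode = 1 already implies a book parent,
# so the join with Book is dropped.
MAPPING_AWARE_SQL = "SELECT title FROM Section WHERE parentcode = 1"

join_titles = sorted(row[0] for row in conn.execute(JOIN_SQL))
aware_titles = sorted(row[0] for row in conn.execute(MAPPING_AWARE_SQL))
print(join_titles, aware_titles)
```

Both queries return the same titles here, and the second one does so without touching the Book table. The equivalence holds only under the stated assumption that the mapping guarantees a book parent for every row with parentcode = 1.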

Are the two sub problems independent?
- One sub problem is to pick a good relational decomposition; the other is to translate queries over this XML-to-Relational mapping.
- The two sub problems can't be solved in isolation.
- There exist query translation algorithms T1 and T2 and relational decompositions D1 and D2 such that D1 is better than D2 if we use T1, while D2 is better than D1 if we use T2.

Yes, the two sub problems are dependent

Yes, the two sub problems are dependent (Cont.)
- On the 100 MB XMark dataset [11], we noticed that XQ2fg was about three times faster than XQ2fp.
- So, for query Q, the fully partitioned strategy is better with algorithm NaiveTranslation, whereas the fully grouped strategy is better with algorithm MultipleScan.
- As a result, the quality of a decomposition is closely related to the query translation algorithm used.