Rewriting Nested XML Queries Using Nested Views
Nicola Onose
joint work with Alin Deutsch, Yannis Papakonstantinou, Emiran Curtmola
University of California, San Diego

INTRO The problem
Can we answer Q using only view access paths?
[Diagram: the query Q runs over the input XML data and produces the query result; the rewriting query R runs over docV1, …, docVn and should produce the same result.]
• views defined by queries V1, …, Vn and materialized as docV1, …, docVn
• is there a query R such that R(V1(Input), …, Vn(Input)) = Q(Input)?

INTRO Motivation: caching & indexes
[Diagram: the rewriting query R reads the materialized views docV1, …, docVn, which are faster to access than the original input XML data read by the query Q; both produce the query result.]
• caching: answer new queries using results of previously answered ones
• (partial) indexes: materialized references to frequently accessed parts of the data

INTRO Motivation: security views
[Diagram: same setting, with V1, …, Vn acting as security views (permitted queries).]
• security problem = checking the existence of R: allow only queries that can be expressed in terms of certain permitted queries, the security views

INTRO Motivation: data integration
[Diagram: the query Q is posed against a virtual global DB; the rewriting query R runs over source 1, …, source n; local/global mappings are expressed as views.]
• data integration: given a query expressed in global terms, rewrite it using the descriptions of the particular sources

INTRO Rewritings enabled by pattern matching
• Previous literature: find parts of the query that are precomputed by the views.
• How to decide that: match the patterns of the views into the query
  – In the relational case, patterns were: tableaux, conjunctive queries
  – For XPath: tree patterns
• Matching XML queries?
  – (until recently) no pattern-based description of XQuery semantics
  – Nested XML Tableaux (NEXT) come to fill the gap
The NEXT Logical Framework for XQuery, A. Deutsch et al., VLDB'04

INTRO Scope of Our Approach
• Tree patterns cover XPath.
• Nested XML Tableaux (NEXT) extend previous work on tree patterns with:
  - nested for-loops
  - joins
  - element construction etc.
• NEXT+ extends NEXT to the whole XQuery language, including:
  - function calls
  - universal quantification
  - disjunction, negation etc.
• soundness guarantee: if a rewriting is found, it is equivalent to the original query
• completeness guarantee (for queries that can be translated into NEXT, without order): if a rewriting exists, we will find one

INTRO Rewriting using views example
The source document bib.xml contains book elements with title (e.g. "Data on the Web") and author children. The result of the view is cached and has faster access time than getting the data directly from the source.

View V: group authors by title (for each book, output its title and the list of authors)

    for $b1 in $doc//book,
        $t1 in $b1/title
    return <authorlist> {$t1, $b1/author} </authorlist>

Query Q: group titles by author (for each distinct author, output the titles of his/her books)

    for $a in distinct-values($doc//book[title]/author)
    return <bibentry>
             { $a,
               for $b in $doc//book,
                   $t in $b/title
               where some $a1 in $b/author satisfies $a1 eq $a
               return $t }
           </bibentry>

Rewriting R: scan the view and create an entry for each distinct author in the view output; add to it all the titles of the respective author. $docV is bound to the root of the view output, and R navigates inside the view output.

    for $a3 in distinct-values($docV/authorlist[title]/author)
    return <bibentry>
             { $a3,
               for $p in $docV/authorlist,
                   $t3 in $p/title
               where some $a4 in $p/author satisfies $a4 eq $a3
               return $t3 }
           </bibentry>

Previous work captures:
- XPath navigation

NEXT captures:
- XPath navigation
- nested for loops
- joins
- element construction etc.
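
The example can be checked on toy data. The following is a minimal Python sketch, not the authors' system: it models books as dictionaries, ignores XML node identity and element construction, and uses hypothetical names (`BOOKS`, `view_v`, `query_q`, `rewriting_r`). The second book is illustrative data that does not appear in the slides. It only shows that, on this input, R applied to the materialized view returns the same answer as Q applied to the source.

```python
# Toy stand-in for bib.xml: each book has titles and authors (values only).
BOOKS = [
    {"titles": ["Data on the Web"], "authors": ["Abiteboul", "Buneman", "Suciu"]},
    {"titles": ["Foundations of Databases"], "authors": ["Abiteboul", "Hull", "Vianu"]},
]

def distinct(values):
    # Order-preserving duplicate elimination, standing in for distinct-values().
    return list(dict.fromkeys(values))

def view_v(books):
    # View V: one authorlist per (book, title) pair, carrying all authors of the book.
    return [{"title": t, "authors": list(b["authors"])}
            for b in books for t in b["titles"]]

def query_q(books):
    # Query Q: for each distinct author of a book that has a title,
    # list the titles of the books that have this author.
    authors = distinct(a for b in books if b["titles"] for a in b["authors"])
    return [(a, [t for b in books for t in b["titles"] if a in b["authors"]])
            for a in authors]

def rewriting_r(doc_v):
    # Rewriting R: same grouping, but navigating only inside the view output.
    authors = distinct(a for p in doc_v for a in p["authors"])
    return [(a, [p["title"] for p in doc_v if a in p["authors"]])
            for a in authors]

doc_v = view_v(BOOKS)                      # materialize the view once
same = query_q(BOOKS) == rewriting_r(doc_v)
```

Agreement on one input is of course not a proof of equivalence; that is what the equivalence check of the rewriting algorithm is for.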

Outline
• NEXT (NEsted XML Tableaux)
• Rewriting Algorithm and Extensions
• Experiments
• Previous Work
• Conclusions

NEXT Architecture of the NEXT framework
XQuery query and views → Normalization → Nested XML Tableaux (NEXT) patterns (VLDB'04) → Logical Optimization: Minimization, Rewriting Using Views (presented at this conference) → Nested XML Tableaux (NEXT) → either Translate to XQuery (to any XQuery processor) or Logical Plan (to the plan execution engine)

NEXT The need for normalization / Normalization into NEXT
XQuery query and views → Normalization → Nested XML Tableaux (NEXT)

Original query:

    for $a in distinct-values($doc//book[title]/author)
    return <bibentry>
             { $a,
               for $b in $doc//book,
                   $t in $b/title
               where some $a1 in $b/author satisfies $a1 eq $a
               return $t }
           </bibentry>

First step: the "some … satisfies" quantifier becomes an ordinary for binding plus a where condition. Cardinality? The extra binding could change how many times $t is returned.

    for $a in distinct-values($doc//book[title]/author)
    return <bibentry>
             { $a,
               for $b in $doc//book,
                   $a1 in $b/author,
                   $t in $b/title
               where $a1 eq $a
               return $t }
           </bibentry>

Second step: a groupby clause on [$b], [$t] fixes the cardinality.

    for $a in distinct-values($doc//book[title]/author)
    return <bibentry>
             { $a,
               for $b in $doc//book,
                   $a1 in $b/author,
                   $t in $b/title
               where $a1 eq $a
               groupby [$b], [$t]
               return $t }
           </bibentry>

… and so on until the query is in NEXT.
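
One plausible reading of the "cardinality?" callout can be sketched in Python. This is an illustration under assumptions, not the normalization procedure itself: books are hypothetical `(id, title, authors)` tuples, and the book that lists the same author value twice is contrived to make the effect visible.

```python
# Hypothetical data: book b1 lists the same author value twice.
BOOKS = [
    ("b1", "T1", ["Ann", "Ann"]),
    ("b2", "T2", ["Bob"]),
]

def titles_some(books, a):
    # where some $a1 in $b/author satisfies $a1 eq $a
    return [t for (_, t, authors) in books if any(a1 == a for a1 in authors)]

def titles_join(books, a):
    # for $b ..., $a1 in $b/author, $t ... where $a1 eq $a   (no groupby)
    return [t for (_, t, authors) in books for a1 in authors if a1 == a]

def titles_join_groupby(books, a):
    # Same join, then groupby [$b], [$t]: one output per distinct (book, title) pair.
    groups = dict.fromkeys((b, t) for (b, t, authors) in books
                           for a1 in authors if a1 == a)
    return [t for (_, t) in groups]
```

Here `titles_some` and `titles_join_groupby` agree, while the plain join returns the title once per matching author binding.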

NEXT Patterns
• alternative way of defining the XQuery semantics (but equivalent to the standard), given by matching patterns

View V (blocks B1(V) and B2(V)):

    for $b1 in $doc//book,
        $t1 in $b1/title
    groupby [$b1], [$t1]
    return <authorlist>
             { $t1,
               for $a2 in $b1/author
               groupby [$a2]
               return $a2 }
           </authorlist>

Query Q (blocks B1(Q) and B2(Q)):

    for $b0 in $doc//book,
        $t0 in $b0/title,
        $a in $b0/author
    groupby $a
    return <bibentry>
             { $a,
               for $b in $doc//book,
                   $a1 in $b/author,
                   $t in $b/title
               where $a1 eq $a
               groupby [$b], [$t]
               return $t }
           </bibentry>

• graphical representation of NEXT: nested patterns. Each block consists of a forest of tree patterns (with descendant navigation "//" and child navigation "/"), a list of groupby variables, and a return function:
  – B1(V): patterns $doc//book($b1)/title($t1); groupby [$b1], [$t1]; return <authorlist> $t1, B2(V) </authorlist>
  – B2(V): patterns book($b1)/author($a2); groupby [$a2]; return $a2
  – B1(Q): patterns $doc//book($b0) with children title($t0) and author($a); groupby $a; return <bibentry> $a, B2(Q) </bibentry>
  – B2(Q): patterns $doc//book($b) with children title($t) and author($a1); groupby [$b], [$t]; return $t
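
To make the block structure concrete, here is a small Python sketch of how the view's two nested blocks might be held in memory. The `Block` class and its field names are assumptions made for illustration; the slides do not describe the authors' internal representation.

```python
from dataclasses import dataclass, field

@dataclass
class Block:
    name: str
    bindings: list          # (variable, source variable, axis, label) per pattern node
    groupby: list           # groupby variables
    ret: str                # return function; nested blocks are referenced by name
    children: list = field(default_factory=list)

# View V from the slide: B2(V) is nested inside the return function of B1(V).
B2_V = Block("B2(V)", [("$a2", "$b1", "/", "author")], ["$a2"], "$a2")
B1_V = Block(
    "B1(V)",
    [("$b1", "$doc", "//", "book"), ("$t1", "$b1", "/", "title")],
    ["$b1", "$t1"],
    "<authorlist> $t1, B2(V) </authorlist>",
    [B2_V],
)

def depth(block):
    # Number of nested levels below and including this block.
    return 1 + max((depth(c) for c in block.children), default=0)

def free_variables(block):
    # Variables a block navigates from but does not bind itself ($doc excluded).
    bound = {b[0] for b in block.bindings}
    sources = {b[1] for b in block.bindings}
    return sources - bound - {"$doc"}
```

`free_variables(B2_V)` yields `{"$b1"}`: the inner block depends on a variable bound in its parent block, which is why the blocks form a tree rather than independent patterns.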

NEXT Architecture of the NEXT framework
XQuery query and views → Normalization → Nested XML Tableaux (NEXT) → Logical Optimization: Minimization, Rewriting Using Views → Nested XML Tableaux (NEXT) → either Translate to XQuery (independent XQuery processor) or Logical Plan (plan execution engine)
The rewriting algorithm is the Rewriting Using Views stage.

REWRITING ALGORITHM Overview of the Rewriting Algorithm
Input: query Q, views V
1. detect alternative access paths towards the variable bindings through the views (result: query Q extended with access paths through V)
2. build a candidate rewriting R that uses only the access paths from phase 1
3. check that R is equivalent to Q

REWRITING ALGORITHM Step 1: Detect View Access Paths
• access paths: ways of accessing data using the view
• identify matching subqueries (extended tree pattern matching)
• find a mapping and add navigation from the view return
• and another one…
• computing all such mappings gives a query extension that uses only view access paths

View: patterns $doc//book($b1)/title($t1) and book($b1)/author($a2); return <authorlist> $t1, B2(V) </authorlist>, where B2(V) returns $a2.
Query body: $doc//book($b0) with children title($t0) and author($a); $doc//book($b) with children title($t) and author($a1).
Extended query (query body plus query extension):
  – first mapping: adds $docV/authorlist($p0)/title($t2) next to title($t0)
  – another one: adds authorlist($p0)/author($a3) next to author($a)
  – all remaining mappings: add $docV/authorlist($p) with children title($t3) and author($a4) next to title($t) and author($a1)
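
The mapping step can be illustrated with a brute-force Python sketch that enumerates the ways a view tree pattern maps into a query tree pattern. It is a simplification under stated assumptions: patterns are plain dictionaries, the query's two blocks are flattened into one pattern, and nesting, equality conditions and value-versus-node identity are all ignored. Function names are hypothetical and this is not the authors' implementation.

```python
from itertools import product

# node -> (label, parent, axis); axis is "/" (child) or "//" (descendant).
VIEW = {
    "doc": ("$doc", None, None),
    "b1": ("book", "doc", "//"),
    "t1": ("title", "b1", "/"),
    "a2": ("author", "b1", "/"),
}
QUERY = {
    "doc": ("$doc", None, None),
    "b0": ("book", "doc", "//"),
    "t0": ("title", "b0", "/"),
    "a": ("author", "b0", "/"),
    "b": ("book", "doc", "//"),
    "t": ("title", "b", "/"),
    "a1": ("author", "b", "/"),
}

def is_descendant(pat, anc, node):
    # True if anc is a proper ancestor of node in the pattern tree.
    cur = pat[node][1]
    while cur is not None:
        if cur == anc:
            return True
        cur = pat[cur][1]
    return False

def edge_ok(view, query, h, v):
    # Check that the edge entering view node v is preserved by the mapping h.
    _, parent, axis = view[v]
    if parent is None:
        return query[h[v]][1] is None          # root maps to root
    if axis == "/":
        return query[h[v]][1] == h[parent] and query[h[v]][2] == "/"
    return is_descendant(query, h[parent], h[v])

def mappings(view, query):
    # Enumerate label-preserving assignments and keep those preserving all edges.
    nodes = list(view)
    cands = [[q for q in query if query[q][0] == view[v][0]] for v in nodes]
    return [h for image in product(*cands)
            for h in [dict(zip(nodes, image))]
            if all(edge_ok(view, query, h, v) for v in nodes)]

found = mappings(VIEW, QUERY)
```

On these patterns the sketch finds two mappings, one sending book($b1) to book($b0) and one sending it to book($b), which correspond to the two authorlist access paths added to the extended query.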

REWRITING ALGORITHM Step 2: Candidate Rewriting
• same return function as the initial query, but with other variable bindings

Original query, extended with the view access paths:
  – B1(Q): $doc//book($b0) with title($t0), author($a); $docV/authorlist($p0) with title($t2), author($a3); groupby $a; return <bibentry> $a, B2(Q) </bibentry>
  – B2(Q): $doc//book($b) with title($t), author($a1); $docV/authorlist($p) with title($t3), author($a4); groupby [$b], [$t]; return $t

Candidate rewriting:
  – B1(R): $docV/authorlist($p0) with title($t2), author($a3); groupby $a3; return <bibentry> $a3, B2(R) </bibentry>
  – B2(R): $docV/authorlist($p) with title($t3), author($a4); groupby [$t3]; return $t3

REWRITING ALGORITHM Step 3: Equivalence Check
• check that R ≡ Q: containment mappings defined on the tree of query blocks
• and then (optional step) translate back to XQuery

Rewriting R as NEXT blocks:
  – B1(R): $docV/authorlist($p0) with title($t2), author($a3); groupby $a3; return <bibentry> $a3, B2(R) </bibentry>
  – B2(R): $docV/authorlist($p) with title($t3), author($a4); groupby [$t3]; return $t3

Rewriting R as XQuery:

    for $a3 in distinct-values($docV/authorlist[title]/author)
    return <bibentry>
             { $a3,
               for $p in $docV/authorlist,
                   $t3 in $p/title
               where some $a4 in $p/author satisfies $a4 eq $a3
               return $t3 }
           </bibentry>
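
The slides apply containment mappings to a tree of query blocks. As a simpler illustration, the Python sketch below shows the classical flat case for conjunctive queries (the relational tableaux mentioned in the introduction): two queries are equivalent when containment mappings exist in both directions. The query encoding and the names `containment_mapping` and `equivalent` are assumptions made for this sketch. It assumes every head variable also occurs in the body, and it does not handle nesting, order or grouping.

```python
from itertools import product

def containment_mapping(q_from, q_to):
    # A query is (head variables, list of body atoms); an atom is (relation, term, ...).
    # Search for a mapping of q_from's variables to q_to's terms that sends the head
    # to the head and every body atom of q_from to some body atom of q_to.
    head_f, body_f = q_from
    head_t, body_t = q_to
    vars_f = sorted({x for atom in body_f for x in atom[1:]})
    terms_t = sorted({x for atom in body_t for x in atom[1:]})
    for image in product(terms_t, repeat=len(vars_f)):
        h = dict(zip(vars_f, image))
        if tuple(h[x] for x in head_f) != tuple(head_t):
            continue
        if all((atom[0],) + tuple(h[x] for x in atom[1:]) in body_t
               for atom in body_f):
            return h
    return None

def equivalent(q1, q2):
    # Equivalence = containment mappings in both directions.
    return (containment_mapping(q1, q2) is not None
            and containment_mapping(q2, q1) is not None)

# Titles of books that have an author, written with and without a redundant atom.
Q_A = (("t",), [("title", "b", "t"), ("author", "b", "a")])
Q_B = (("t",), [("title", "b", "t"), ("author", "b", "a"), ("author", "b", "a2")])
# Titles of all books: not equivalent, since the author condition is missing.
Q_C = (("t",), [("title", "b", "t")])
```

`equivalent(Q_A, Q_B)` holds because the redundant author atom of `Q_B` can be folded onto the first one, while `equivalent(Q_A, Q_C)` fails because no atom of `Q_C` can receive the author atom.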

REWRITING ALGORITHM Under the Hood
• two types of equality: by value and by node id
  – mappings must take this into account
  – so must the groupby clause
• XQuery results have order. We consider rewritings that:
  – do not respect order (for DB-centric applications)
  – respect order (for text-centric applications)
• for rewritings that respect order: look for an ordering of the view access paths that preserves the original query order (details in the paper)

REWRITING ALGORITHM Extensions to NEXT
• Extended NEXT to NEXT+:
  – extend the pattern-based representation to the whole XQuery
  – functions and other expressions (negation, disjunction, aggregates etc.) modeled as uninterpreted functions
• Extended the algorithm to use NEXT+: need to identify maximal subparts that are pure NEXT blocks.

    for $x in $doc/book
    where count( for $a in $x/author
                 where $x/price eq 60
                 groupby [$a]
                 return $a )
          eq count( … )
    groupby $x
    return $x

  – rewrite the outer block, disregarding function calls
  – rewrite the blocks inside function arguments, with free variables bound in upper blocks

REWRITING ALGORITHM Formal Guarantees
• The rewriting algorithm is sound
• and complete for a large fragment of XQuery (the one that can be translated into NEXT), without order
  – Completeness means that if there are any rewritings, we are guaranteed to find at least one.
• There is no hope for completeness for
  – ordered rewritings: equivalence is undecidable
  – expressions beyond NEXT: negation and universal quantification also lead to undecidability
In these cases, our algorithm is a best-effort approach, with guaranteed soundness.

REWRITING ALGORITHM Implementation (considerations)
• completeness guarantees come at a price: we must compute mappings between view and query patterns
• in general this is NP-complete, but PTIME if the patterns are trees (no equality conditions): based on M. Yannakakis, Algorithms for acyclic database schemes, 1981
• our goal: design an implementation whose running time is polynomial for pure tree patterns and degrades gradually with the number of added joins

REWRITING ALGORITHM Implementation in practice
[Diagram: V and Q are compiled, one into a select-project-join (SPJ) query plan and the other into an XML instance; evaluating the plan over the instance yields the mappings.]
• when computing the query plan, apply techniques from the Yannakakis algorithm: push projections & selections
• performance degrades with the number of equalities: the problem is NP-complete in the width of the view pattern (see the paper) and in PTIME when there are no join equalities.
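
Why pure tree patterns stay polynomial can be sketched with a bottom-up pruning pass in Python. This is a standard technique for tree patterns written in the spirit of the semijoin reductions the slides attribute to Yannakakis; it is an assumption-laden illustration (dictionary-encoded patterns, hypothetical function names, no join equalities), not the authors' query-plan implementation.

```python
# node -> (label, parent, axis); axis is "/" (child) or "//" (descendant).
VIEW = {
    "doc": ("$doc", None, None),
    "b1": ("book", "doc", "//"),
    "t1": ("title", "b1", "/"),
    "a2": ("author", "b1", "/"),
}
QUERY = {
    "doc": ("$doc", None, None),
    "b0": ("book", "doc", "//"),
    "t0": ("title", "b0", "/"),
    "a": ("author", "b0", "/"),
}

def children(pat, node):
    return [n for n, (_, parent, _) in pat.items() if parent == node]

def ancestors(pat, node):
    out, cur = [], pat[node][1]
    while cur is not None:
        out.append(cur)
        cur = pat[cur][1]
    return out

def candidates(view, query, v):
    # Query nodes q such that the view subtree rooted at v maps into the query with v -> q.
    # Each child subtree is solved once and then only filters the parent's candidates,
    # so no combination of child choices is ever enumerated.
    kid_cands = {c: candidates(view, query, c) for c in children(view, v)}
    result = set()
    for q in query:
        if query[q][0] != view[v][0]:
            continue
        ok = True
        for c, cands in kid_cands.items():
            if view[c][2] == "/":
                good = any(query[qc][1] == q and query[qc][2] == "/" for qc in cands)
            else:
                good = any(q in ancestors(query, qc) for qc in cands)
            if not good:
                ok = False
                break
        if ok:
            result.add(q)
    return result

def embeds(view, query):
    # The view pattern maps into the query pattern iff its root can map to the query root.
    roots = [n for n in query if query[n][1] is None]
    return any(r in candidates(view, query, "doc") for r in roots)

# A view that also requires a price child has no mapping into QUERY.
VIEW_WITH_PRICE = dict(VIEW, p1=("price", "b1", "/"))
```

Because sibling subtrees of a tree pattern can be matched independently, this filtering is exact for trees; once equality conditions link different branches, the independence is lost, which is consistent with the slide's remark that performance degrades with the number of equalities.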

EXPERIMENTS Experiments: Design
• The running time of the algorithm increases with:
  – number of nested levels: mappings are block by block
  – size of the pattern: # of mapped and target nodes increases
  – number of views: more patterns to match
• Our experiments measured how the algorithm scales with these parameters.
• We designed a configuration where we generated queries and views of increasing size and nesting depth.

EXPERIMENTS Experiments: Implementation
Queries & views with similar basic patterns, in a vertical chain of blocks (block Bk, block Bk+1, …).
[Diagram: each block holds basic patterns rooted at $doc, with nodes labeled mk, a, c1, c2, …, ci.]
Irrelevant views don't matter (they can be quickly discarded). We create only relevant views (with mappings into the query):
  – split the query recursively into fragments = views
  – make them overlap on basic patterns

EXPERIMENTS Experiments: Good Scalability
d = depth (# of nested levels in a query)
b = breadth (# of basic patterns in a block)
1.25 s for d=16, b=16 and 128 views

Previous work
• rewriting XPath queries using XPath views: Rewriting XPath Queries Using Materialized Views, W. Xu et al., VLDB 2005
• rewriting XQuery using XPath views: A Framework for Using Materialized XPath Views in XML Query Processing, A. Balmin et al., VLDB 2004
• rewriting an XQuery with only one XQuery view that has to contain the query: ACE-XQ: A CachE-aware XQuery Answering System, L. Chen et al., WebDB 2002
• caching common XQuery subexpressions: Implementing Memoization in a Streaming XQuery Processor, Y. Diao et al., XSym 2004

Conclusions
• NEXT is a pattern-based representation that describes what the query result is and not how it is computed → more opportunities for semantic optimizations
• extensible to all of XQuery, using NEXT+
• rewriting using views algorithm
  – sound for the whole language
  – complete for a large fragment of XQuery
  – good scalability
  – independent of the underlying algebra of the query processor

Online Demo
http://db.ucsd.edu/reform