Flexible Search in XML Data
- Slides: 101
Outline: Partial queries, Query processing, Query evaluation, Query containment, Experiments, Conclusion
Difficulties on Querying XML Data
[Figure: XML trees of hotel data built from the labels Hotels, City, Island, Location, Athens, Creta, Center, Poros, Chania, Heraklio, arranged in different structures]
Search problem
- Name: Xiaoying Wu
- Place: Athens Center, Heraklio
- Purpose: Sightseeing (Parthenon, 438 BC; Phaistos' Disk, 1700 BC)
- Problem: structural difference
Search problem
- Name: Theodore Dalamagas
- Place: Islands
- Purpose: Sea sports (windsurf, jet ski)
- Problem: structural inconsistency
Search problem
- Name: Dimitri Theodoratos
- Place: Heraklio
- Purpose: HDMS Conference (HDMS 2008)
- Problem: unknown structure
Search problem
- Name: Stefanos Souldatos
- Place: Any island
- Purpose: Escape from Ph.D!
- Problem: multiple sources (1400 islands; the.Hotel.gr, hotels.gr, holidays.gr)
Open questions
- Can we use existing query languages (XPath, XQuery) to express our queries?
- Can we use existing techniques to evaluate our queries?
Partial Queries in XPath
[Figure: five example queries over Hotels, City, Athens and Island, placed on a scale from 0% to 100% specified structure; queries 1 to 3 are drawn as path queries, queries 4 and 5 as tree-pattern queries]
1. //Hotels[descendant-or-self::*[ancestor-or-self::City][ancestor-or-self::Athens]]
2. //Hotels[/City[descendant-or-self::*[ancestor-or-self::Athens]]]
3. //Hotels[/City//Athens]
4. //Hotels[/City[descendant-or-self::*[ancestor-or-self::Athens]]][//City[descendant-or-self::*[ancestor-or-self::Island]]]
5. //Hotels[/City//Athens][/City//Island]
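To make the "0% structure" end of the scale concrete, here is a minimal sketch of what such a query asks for: all labelled nodes must lie on one root-to-leaf path, in any order unless an edge says otherwise. The function names, the sample document and the brute-force strategy are assumptions made for illustration; this is a naive reference check, not one of the evaluation algorithms presented later.

```python
import xml.etree.ElementTree as ET
from itertools import product

def leaf_paths(elem, prefix=()):
    """Yield every root-to-leaf path of the document as a tuple of elements."""
    prefix = prefix + (elem,)
    kids = list(elem)
    if not kids:
        yield prefix
    for kid in kids:
        yield from leaf_paths(kid, prefix)

def match_partial_path(doc_root, labels, child=(), desc=()):
    """Return the distinct matches of a partial path query (naive, exponential).

    labels: query node id -> element name
    child:  pairs (x, y) meaning y must be a child of x
    desc:   pairs (x, y) meaning y must be a descendant of x
    """
    names = sorted(labels)
    found = {}
    for path in leaf_paths(doc_root):
        # Positions on this path where each query node could be placed.
        options = [[i for i, e in enumerate(path) if e.tag == labels[n]]
                   for n in names]
        for combo in product(*options):
            pos = dict(zip(names, combo))
            if (all(pos[x] < pos[y] for x, y in desc)
                    and all(pos[y] == pos[x] + 1 for x, y in child)):
                match = tuple(path[pos[n]] for n in names)
                found[tuple(id(e) for e in match)] = match
    return list(found.values())

# Hypothetical document, loosely modelled on the hotel example.
DOC = ET.fromstring(
    "<Hotels><City><Location><Athens/></Location></City>"
    "<Island><Creta/></Island></Hotels>"
)
# In the style of query 1: Hotels, City and Athens on one path, no order given.
no_structure = match_partial_path(DOC, {"h": "Hotels", "c": "City", "a": "Athens"})
# Island and Athens never share a path in this document.
with_island = match_partial_path(DOC, {"h": "Hotels", "i": "Island", "a": "Athens"})
```

Each match is a tuple of elements ordered by query node id, so `no_structure` holds one tuple (Athens, City, Hotels) and `with_island` is empty.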
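Partial Queries
[Figure: example partial query over nodes r, a, b, c, d, with two nodes labelled a and two labelled c]
Legend
- r: root node (optional)
- a: query node labelled by "a"
- Edge types: child relationship, descendant relationship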
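One possible in-memory representation of such a query is a set of labelled nodes plus two edge sets, one per relationship. The class and method names below are hypothetical and the example query is made up; the slides do not prescribe a data structure.

```python
from dataclasses import dataclass, field

@dataclass
class PartialQuery:
    """Labelled nodes plus child (/) and descendant (//) edges."""
    labels: dict = field(default_factory=dict)  # node id -> label
    child: set = field(default_factory=set)     # (x, y) means x/y
    desc: set = field(default_factory=set)      # (x, y) means x//y

    def add(self, node_id, label):
        self.labels[node_id] = label
        return node_id

    def edges(self):
        """All edges, regardless of their type."""
        return self.child | self.desc

    def leaves(self):
        """Nodes with no outgoing edge."""
        sources = {x for x, _ in self.edges()}
        return sorted(n for n in self.labels if n not in sources)

# Hypothetical query: r/a, a//c and b//c; b has no incoming edge,
# so its position relative to r and a is left unspecified.
q = PartialQuery()
for node_id, label in [("r", "r"), ("a1", "a"), ("b1", "b"), ("c1", "c")]:
    q.add(node_id, label)
q.child.add(("r", "a1"))
q.desc.update({("a1", "c1"), ("b1", "c1")})
```

Keeping the two edge types in separate sets makes it easy to apply rules that treat them differently, such as the inference rules used during query processing.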
Conclusions (up to now)
- Need for queries with partial structure
- We introduce partial queries
- Partial queries can be expressed in XPath
Query Processing
[Figure: query processing turns a partial path query into a partial path query in canonical form, which is then passed to query evaluation]
Steps
1. Full form
2. Satisfiability
3. Redundant nodes
4. Canonical form
Query Processing, step 1: Full form
The full form is the query extended with the edges derived by the inference rules.
INFERENCE RULES
(IR1) |- r//ai
(IR2) x/y |- x//y
(IR3) x//y, y//z |- x//z
(IR4) x/ai, x//bj |- ai//bj
(IR5) ai/x, bj//x |- bj//ai
(IR6) x/y, y/w, x//z, z//w |- x/z
(IR7) x/y, x//z, w//y |- x/z
(IR8) x/y, y/w, x/z |- z/w
(IR9) x//y, y//w, x/z |- z//w
(IR10) x/y, w/z |- x/z
(IR11) x//y, w//z |- x//z
(IR12) x/y, y/w, z/w |- x/z
(IR13) x//y, y//w, z/w |- x//z
x, y, z, w: query nodes; ai, bj: nodes labelled by a and b
[Figure: IR1 applied to the example query over nodes r, a, b, c, d]
[Figure: the example continues with IR4 (two frames), then IR6 and IR8, and ends with the full form of the example query]
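As an illustration of how edges are derived, the following sketch applies only IR2 and IR3 until nothing new can be added. The remaining rules are deliberately left out, and the function name and example edges are assumptions.

```python
def descendant_closure(child, desc):
    """Apply IR2 (x/y |- x//y) and IR3 (x//y, y//z |- x//z) to a fixpoint."""
    closure = set(desc) | set(child)          # IR2: every child edge is a descendant edge
    changed = True
    while changed:
        changed = False
        snapshot = list(closure)
        for (x, y) in snapshot:
            for (y2, z) in snapshot:
                if y == y2 and (x, z) not in closure:
                    closure.add((x, z))       # IR3: transitivity
                    changed = True
    return closure

# Hypothetical query: r/a, a//c, c//d
full_desc = descendant_closure(child={("r", "a")}, desc={("a", "c"), ("c", "d")})
```

For this input the closure contains six descendant edges, including the derived r//c, a//d and r//d.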
Query Processing, step 2: Satisfiability
A query is unsatisfiable if its full form contains a trivial cycle between two nodes x and y.
[Figure: trivial cycle between x and y]
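A cycle check of this kind can be sketched with the standard library's topological sorter. The sketch assumes it receives the edges of the full form (child and descendant edges together) and treats any cycle among them as making the query unsatisfiable; the function name is hypothetical.

```python
from graphlib import TopologicalSorter, CycleError

def is_satisfiable(edges):
    """Return False if the given edges (x, y), read as 'x above y', form a cycle."""
    preds = {}
    for x, y in edges:
        preds.setdefault(y, set()).add(x)   # y depends on x
        preds.setdefault(x, set())
    try:
        TopologicalSorter(preds).prepare()  # raises CycleError on any cycle
    except CycleError:
        return False
    return True

ok = is_satisfiable({("r", "a"), ("a", "b")})
bad = is_satisfiable({("r", "a"), ("a", "b"), ("b", "a")})
```

Here `ok` is true, while `bad` is false because a//b and b//a cannot both hold in a tree.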
Query Processing, step 3: Redundant nodes
A node y is redundant if one of the following patterns occurs:
[Figure: patterns a), b) and c) over nodes x, y and z, illustrated on the example query]
Query Processing, step 4: Canonical form
Canonical form of a satisfiable query = full form, minus the edges derivable by IR2 and IR3, minus the redundant nodes
[Figure: canonical form of the example query]
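The edge-removal part of this step can be sketched as follows, assuming the descendant edges of the full form are already closed under IR2 and IR3 and contain no cycle. Removal of redundant nodes is not covered, and the function name and input are hypothetical.

```python
def canonical_desc_edges(child, full_desc):
    """Keep only the descendant edges that IR2 and IR3 cannot re-derive."""
    nodes = {n for edge in full_desc for n in edge}
    kept = set()
    for (x, y) in full_desc:
        by_ir2 = (x, y) in child
        by_ir3 = any((x, z) in full_desc and (z, y) in full_desc for z in nodes)
        if not (by_ir2 or by_ir3):
            kept.add((x, y))
    return kept

# Full form of the hypothetical query r/a, a//c, c//d.
child = {("r", "a")}
full_desc = {("r", "a"), ("a", "c"), ("c", "d"), ("r", "c"), ("a", "d"), ("r", "d")}
canonical = canonical_desc_edges(child, full_desc)
```

The result keeps a//c and c//d only; the child edge r/a is stored separately and stays in the query.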
Canonical Form
- Partial path query: directed acyclic graph with a same-path constraint
- Partial tree-pattern query: directed acyclic graph with same-path constraints
[Figure: example dags over nodes r, b, c, d, e]
Conclusions (up to now)
- Need for queries with partial structure
- We introduce partial queries
- Partial queries can be expressed in XPath
- We can process any partial query dag
Evaluation Algorithms
Partial Path Queries
- PQGen: produce path queries
- PathJoin: decompose into paths
- PartialMJ: decompose into spanning-tree paths
- PartialPathStack: novel holistic algorithm
Partial Tree-Pattern Queries
- TPQGen: produce TPQs
- PPJoin: decompose into PPs
- PartialTreeStack: novel holistic algorithm
[Figure: example partial path query and partial tree-pattern query over nodes r, b, c, d, e]
Partial Path Queries: PQGen
Producing all possible path queries…
1. Produce all possible path queries
2. Evaluate paths using existing algorithms
3. Keep all results
[Figure: path queries generated from the example query over nodes r, b, c, d, e]
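Step 1 can be pictured as enumerating every total order of the query nodes that respects the query's edges. The brute-force sketch below does this with permutations; the example edges are a guess at the query in the figure and the function name is hypothetical.

```python
from itertools import permutations

def path_queries(nodes, child, desc, root="r"):
    """Yield every ordering of the nodes that satisfies all edges of the query."""
    rest = [n for n in nodes if n != root]
    for perm in permutations(rest):
        order = (root,) + perm
        pos = {n: i for i, n in enumerate(order)}
        respects_desc = all(pos[x] < pos[y] for x, y in desc)
        respects_child = all(pos[y] == pos[x] + 1 for x, y in child)
        if respects_desc and respects_child:
            yield order

# Assumed example: r//b, r//d, b//c, d//c, d//e (no child edges).
orders = list(path_queries(
    nodes=["r", "b", "c", "d", "e"],
    child=set(),
    desc={("r", "b"), ("r", "d"), ("b", "c"), ("d", "c"), ("d", "e")},
))
```

Even this small query yields five path queries, which hints at why generating and evaluating all of them becomes expensive as queries grow.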
Partial Path Queries: PathJoin
Decomposing into root-to-leaf paths…
1. Decompose into root-to-leaf paths
2. Evaluate paths using existing algorithms
3. Join conditions (identity, path)
[Figure: the example query over nodes r, b, c, d, e and its root-to-leaf paths]
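The decomposition in step 1 can be sketched as a depth-first walk over the query dag. The edge set is again an assumed reading of the example figure, and the function name is hypothetical.

```python
def root_to_leaf_paths(edges, root="r"):
    """List every path from the root to a node without outgoing edges."""
    succ = {}
    for x, y in edges:
        succ.setdefault(x, []).append(y)
    paths = []

    def walk(node, prefix):
        prefix = prefix + [node]
        children = sorted(succ.get(node, []))
        if not children:
            paths.append(prefix)
        for nxt in children:
            walk(nxt, prefix)

    walk(root, [])
    return paths

# Assumed example dag: r-b, r-d, b-c, d-c, d-e
paths = root_to_leaf_paths({("r", "b"), ("r", "d"), ("b", "c"), ("d", "c"), ("d", "e")})
```

Node c appears in two of the three paths, so results for those paths have to agree on c; that is the kind of situation the identity join condition in step 3 addresses.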
Partial Path Queries: PartialMJ
Using a spanning tree…
1. Create a spanning tree of the query
2. Decompose into root-to-leaf paths
3. Evaluate paths using an extension of PathStack
4. Join conditions (identity, structural, path)
[Figure: the example query over nodes r, b, c, d, e, one of its spanning trees, and the resulting root-to-leaf paths]
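A spanning tree of the query dag can be obtained with a breadth-first traversal, as in the sketch below. Treating the edges left out of the tree as the ones that later need join conditions is an assumption, as are the function name and the example edges.

```python
from collections import deque

def spanning_tree(edges, root="r"):
    """Split the dag's edges into a BFS spanning tree and the leftover edges."""
    succ = {}
    for x, y in sorted(edges):
        succ.setdefault(x, []).append(y)
    tree, seen, queue = [], {root}, deque([root])
    while queue:
        x = queue.popleft()
        for y in succ.get(x, []):
            if y not in seen:            # first time y is reached: tree edge
                seen.add(y)
                tree.append((x, y))
                queue.append(y)
    extra = sorted(set(edges) - set(tree))
    return tree, extra

# Assumed example dag: r-b, r-d, b-c, d-c, d-e
tree, extra = spanning_tree({("r", "b"), ("r", "d"), ("b", "c"), ("d", "c"), ("d", "e")})
```

With this traversal order, c is reached through b first, so the edge from d to c is the one left outside the tree.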
Partial Path Queries: PartialPathStack
[Animation: PathStack and PartialPathStack run side by side on an XML tree with nodes r, b1, d1, c1, e1, d2, c2, e2. Each algorithm keeps one stack per query node (Sr, Sb, Sd, Sc, Se). The path query has a single leaf node, the partial path query has several leaf nodes.]
Results on the final frame
- One panel: r a1 b1 d1 c1 e1, r a1 b1 d1 c1 e2
- Other panel: r a1 b1 d1 c1 e1, r a1 b1 d1 c2 e1, r a1 b1 d1 c1 e2
Partial Path Queries: PartialPathStack
- PathStack [Bruno et al., 2002]: optimal for path queries, O(input + output)
- PartialPathStack [Souldatos et al., 2007]: optimal for partial path queries, O(input*indegree + output*outdegree)
Partial Path Queries: Comparison
- Algorithms: PQGen (path queries), PathJoin (decomposition into paths), PartialMJ (spanning tree), PartialPathStack
- Problems: many queries to evaluate, path overlaps, intermediate results
[Table: which problems affect which algorithm]
Partial Tree-Pattern Queries: TPQGen
Producing all possible tree-pattern queries…
1. Produce all possible tree-pattern queries
2. Evaluate queries using existing algorithms
3. Keep all results
[Figure: tree-pattern queries generated from the example query over nodes r, b, c, d, e]
Partial Tree-Pattern Queries: PartialPathJoin
Decomposing into partial paths…
1. Decompose into partial paths
2. Evaluate partial paths using PartialPathStack
3. Join conditions (identity)
[Figure: the example query over nodes r, b, c, d, e and its partial paths]
Partial Tree-Pattern Queries: PartialTreeStack
[Animation: TwigStack and PartialTreeStack run side by side on an XML tree with nodes r, b1, d1, c1, e1, d2, c2, e2. Each algorithm keeps one stack per query node (Sr, Sb, Sd, Sc, Se).]
Intermediate solutions shown across the frames
- One panel: rb1d1c1, rb1d1c2, rb1d2c2 and rb1d1e1, rb1d1e2, rb1d2e2
- Other panel: rd1b1c1, rd1b1c2, rd2b1c2 and rd1e1, rd1e2, rd2e2
Final results: rb1d1c1e1, rb1d1c1e2, rb1d1c2e1, rb1d1c2e2, rb1d2c2e2
Partial Tree-Pattern Queries: PartialTreeStack
- TwigStack: O(input + output), optimal for tree-pattern queries
- PartialTreeStack: O(input*|Q|*|PP| + output*N), optimal for "small" partial tree-pattern queries
where |Q| = nodes + edges, |PP| = number of PPs, N = nodes
Partial Tree-Pattern Queries: Comparison
- Algorithms: TPQGen (TPQs), PartialPathJoin (decomposition into PPs), PartialTreeStack
- Problems: many queries to evaluate, path overlaps, intermediate results
[Table: which problems affect which algorithm]
Conclusions (up to now)
- Need for queries with partial structure
- We introduce partial queries
- Partial queries can be expressed in XPath
- We can process any partial query dag
- We proposed algorithms for their evaluation
Absolute Query Containment
Q1 ⊆ Q2: each result of Q1 is a result of Q2.
Check: look for a homomorphism from Q2 to the full form of Q1.
=> Checking absolute query containment is very fast (homomorphism)
[Figure: example queries Q1 and Q2 over nodes r, a, b, c]
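A brute-force version of the homomorphism test might look like the sketch below. It assumes each query is given as a triple (labels, child edges, descendant edges), that the target's descendant edges are already in full form, and that a homomorphism must preserve labels, map child edges to child edges and map descendant edges to descendant edges. All names and the two example queries are hypothetical.

```python
from itertools import product

def has_homomorphism(source, target_full):
    """Try every label-preserving mapping from source nodes to target nodes."""
    labels_s, child_s, desc_s = source
    labels_t, child_t, desc_t = target_full
    nodes_s = sorted(labels_s)
    candidates = [[m for m in labels_t if labels_t[m] == labels_s[n]]
                  for n in nodes_s]
    for combo in product(*candidates):
        h = dict(zip(nodes_s, combo))
        if (all((h[x], h[y]) in child_t for x, y in child_s)
                and all((h[x], h[y]) in desc_t for x, y in desc_s)):
            return True
    return False

# Q1: r/a, a/b, b//c, with descendant edges closed under IR2 and IR3.
q1_full = (
    {"r": "r", "a": "a", "b": "b", "c": "c"},
    {("r", "a"), ("a", "b")},
    {("r", "a"), ("a", "b"), ("b", "c"), ("r", "b"), ("a", "c"), ("r", "c")},
)
# Q2: r//a, a//c, also closed under IR3.
q2_full = (
    {"r": "r", "a": "a", "c": "c"},
    set(),
    {("r", "a"), ("a", "c"), ("r", "c")},
)
q1_in_q2 = has_homomorphism(q2_full, q1_full)   # checks Q1 ⊆ Q2
q2_in_q1 = has_homomorphism(q1_full, q2_full)   # checks Q2 ⊆ Q1
```

The first check succeeds because Q2 only asks for what Q1 already requires; the second fails because Q2 has no node labelled b to map onto.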
Relative Query Containment
Some important stuff first:
1. Dimension graphs: summarize the structure of an XML tree
   [Figure: an XML tree and its dimension graph]
2. Dimension trees: equivalent to a query in a specific dimension graph
   [Figure: Q1 + dimension graph = DT1.1; Q2 + dimension graph = DT2.1 and DT2.2]
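As a rough illustration of a structural summary, the sketch below collects the distinct parent/child element-name pairs of an XML tree. Using one graph node per element name is a simplifying assumption made here; the slides do not define how dimensions are formed, and the function name and sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

def structure_summary(root):
    """Return the set of (parent name, child name) pairs found in the tree."""
    edges = set()
    stack = [root]
    while stack:
        elem = stack.pop()
        for kid in elem:
            edges.add((elem.tag, kid.tag))
            stack.append(kid)
    return edges

doc = ET.fromstring(
    "<Hotels><City><Athens/></City><City><Heraklio/></City>"
    "<Island><Creta/></Island></Hotels>"
)
summary = structure_summary(doc)
```

The two City elements collapse into a single City node in the summary, which is what makes the summary smaller than the tree.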
Relative Query Containment
Q1 ⊆G Q2: each result of Q1 in G is a result of Q2 in G.
Check: look for a homomorphism from the dimension trees of Q2 to the dimension trees of Q1.
=> Checking relative query containment can be very slow (#dimension trees)
[Figure: Q1, Q2, the dimension graph G, and the dimension trees DT1.1, DT2.1 and DT2.2]
Heuristic for Relative Containment
1. Extract info from the dimension graph G
2. Add it to Q1
3. Check Q1 ⊆ Q2
[Figure: Q1, Q2 and the dimension graph G; after the extracted information is added to Q1, the check succeeds (OK)]
Queries Used in the Experiments
[Figure: four query shapes over nodes r, a, b, c, d, e, f, labelled Q1/Q5, Q2/Q6, Q3/Q7 and Q4/Q8]
Query Evaluation
Execution time on Treebank (2.5 million nodes)
[Chart: execution time per query and algorithm; annotations mark "path queries" and "too many results"]
Execution time on synthetic data (2.5 million nodes, IBM AlphaWorks XML generator)
[Chart: execution time per query and algorithm]
Execution time varying the size of the XML tree
[Charts: PartialMJ and PartialPathStack on Q2, Q3 and Q7]
Query Containment
Execution time varying the graph size
[Chart: time (sec) against number of graph paths for relative containment, the on-the-fly heuristic and the precomputed heuristics; heuristic accuracy annotations: > 98%, > 90%, > 78%, > 60%]
Execution time varying the query size
[Chart: time (sec) against number of nodes per query path for relative containment, the on-the-fly heuristic and the precomputed heuristics; heuristic accuracy annotations: > 98%, > 79%, > 32%]
Conclusions
- Need for queries with partial structure
- We introduce partial queries
- Partial queries can be expressed in XPath
- We can process any partial query dag
- We proposed algorithms for their evaluation
- We showed that our algorithms for evaluation and containment outperform other techniques
Contribution
- Evaluation (partial path queries, partial tree-pattern queries): CIKM '07, WWW '08, EDBT '09 (??)
- Containment: SSDBM '06, VLDB Journal '08
- Heuristics for containment: CIKM '06, CIKM '08
Publications
QUERY EVALUATION
- Stefanos Souldatos, Xiaoying Wu, Dimitri Theodoratos, Theodore Dalamagas, Timos Sellis. Evaluation of Partial Path Queries on XML Data. 16th CIKM Conference, Lisboa, Portugal, 2007.
- Xiaoying Wu, Stefanos Souldatos, Dimitri Theodoratos, Theodore Dalamagas, Timos Sellis. Efficient Evaluation of Generalized Path Pattern Queries on XML Data. 17th WWW Conference, Beijing, China, 2008.
QUERY CONTAINMENT
- Dimitri Theodoratos, Theodore Dalamagas, Pawel Placek, Stefanos Souldatos, Timos Sellis. Containment of Partially Specified Tree-Pattern Queries. 18th SSDBM Conference, Vienna, Austria, 2006.
- Dimitri Theodoratos, Pawel Placek, Theodore Dalamagas, Stefanos Souldatos, Timos Sellis. Containment of Partially Specified Tree-Pattern Queries in the Presence of Dimension Graphs. VLDB Journal, 2008.
HEURISTICS FOR CONTAINMENT
- Dimitri Theodoratos, Stefanos Souldatos, Theodore Dalamagas, Pawel Placek, Timos Sellis. Heuristic Containment Check of Partial Tree-Pattern Queries in the Presence of Index Graphs. 15th CIKM Conference, Arlington, USA, 2006.
- Pawel Placek, Dimitri Theodoratos, Stefanos Souldatos, Theodore Dalamagas, Timos Sellis. Heuristic Approaches for Checking Containment of Generalized Tree-Pattern Queries. 17th CIKM Conference, Napa Valley, California, USA, 2008.
WEB SEARCH PERSONALIZATION
- Stefanos Souldatos, Theodore Dalamagas, Timos Sellis. Sailing the Web with Captain Nemo: a Personalized Metasearch Engine. Learning in Web Search Workshop, 22nd ICML Conference, Bonn, Germany, 2005.
- Stefanos Souldatos, Theodore Dalamagas, Timos Sellis. Captain Nemo: A Metasearch Engine with Personalized Hierarchical Search Space. Informatica Journal, 2006.
- Stefanos Souldatos, Theodore Dalamagas, Timos Sellis. Sailing the Web with Captain Nemo: a Personalized Metasearch Engine. Internet Search Engines (book), ICFAI University (Institute of Chartered Financial Analysts of India), 2007. Reprint of the publication in Learning in Web Search Workshop.
Questions?