Semantic Adaptation of Schema Mappings when Schemas Evolve


Semantic Adaptation of Schema Mappings when Schemas Evolve
Cong Yu, University of Michigan
Lucian Popa, IBM Almaden Research Center
VLDB'05, Trondheim, Norway, Sep 2, 2005

Schema Mappings

[Figure: Schema S (instance I) and Schema T (instance J)]

• Schema mappings are logical, declarative assertions that describe relationships between schemas.
  - They carry enough semantics to guide run-time, instance-level transformation.
  - E.g., GLAV mappings (or tuple-generating dependencies).
• They are key elements in two main areas of information integration:
  - Data exchange/translation
  - Query answering/rewriting (or federation)

Schema Evolution and Mapping Adaptation

• Schemas evolve over time, and mappings may become invalid!
• A lot of effort goes into establishing mappings. How do we reuse them?
• Mapping Adaptation Problem [VMP'03]
  - Given:
      * a mapping M from S to T, and
      * changes/evolution of S to S', or T to T', or both,
  - Derive a "best" mapping M' that:
      * is valid with respect to the new schemas, and
      * reflects the original mapping as much as possible.

Prior Solution: Incremental Method

[Figure: M maps S to T; atomic changes (move elem, add elem, delete constraint, rename elem, ...) produce T1, T2, T3, ... with adapted mappings M1, M2, M3, ...]

• [VMP'03] incrementally adapts the mapping after each atomic change in the schemas (source and/or target).
• Efficient and intuitive for one or a few changes.
• However, for non-incremental evolution, there are drawbacks ...

[Figure: M maps S to T; different evolution paths lead from T to Tn, with adapted mapping Mn]

• The new schema may be radically different.
  - The list of changes may not be known.
  - The evolution path must be discovered, and it is not necessarily unique.
• The method will ultimately be inefficient:
  - The algorithm must be applied at each atomic change.
• As we shall see, the resulting mapping may not be the expected one.

Our Approach: Composition-Based

[Figure: M maps S to T; E maps T to T'; M' = M ° E]

• Evolution itself is described as a schema mapping.
  - Concise, declarative, and expressive description of evolution.
  - Enables efficiency and can deal with arbitrary evolution.
  - Can use schema mapping tools (e.g., Clio) to construct E.
• The adapted mapping is then obtained via composition.
  - Formal semantics of adaptation.
• At a high level, this is part of the model management vision [Ber03].

Main Contributions

• We study the interplay between schema evolution and mapping composition.
  - Interesting in terms of both semantics and implementation.
• We show that the composition-based approach for mapping adaptation can be practical.

Outline of the Rest of the Talk

• Incremental Approach vs. Composition Approach
  - Example (showing why composition is important)
• Composition: Semantics and Algorithm
  - Transformation semantics: specialized, more suitable for schema evolution, also more challenging
• Optimization and Experiments
  - Compose only when necessary (some mapping formulas are unaffected by the change)
• Conclusion

Simplified Example

Source':  LineItem(li, s, p, o, qty)
Source:   SuppPart(s, p), PartOrder(p, o)
Target:   PotentialSupp(s, o)

m: SuppPart(s, p) ∧ PartOrder(p, o) -> PotentialSupp(s, o)
(GLAV mapping [Halevy01], or source-to-target tgd [FKMP03])

• The mapping m "exports" orders o and all their potential suppliers s.
• Schema evolution scenario:
  - Data arrives in "long" tuples, each relating an order, a part, and an available supplier.
• The mapping m must be adapted to use the new schema Source'.

Incremental Approach [VMP'03]

Source':  LineItem(li, s, p, o, qty)
Source:   SuppPart(s, p), PartOrder(p, o)
Target:   PotentialSupp(s, o)

m: SuppPart(s, p) ∧ PartOrder(p, o) -> PotentialSupp(s, o)

• Pick a list of changes from Source to Source' and rewrite the mapping after each change.
• (1) Move element SuppPart/s to PartOrder/s:
      SuppPart(p) ∧ PartOrder(s, p, o) -> PotentialSupp(s, o)
• (2) Delete SuppPart/p and (3) delete SuppPart.
• (4) Rename PartOrder to LineItem, (5) add LineItem/li, and (6) add LineItem/qty:
      m': LineItem(li, s, p, o, qty) -> PotentialSupp(s, o)

• Although small, our example already needs 6 schema changes.
  - For large schemas, this can become challenging.
• Furthermore, and somewhat surprisingly, the semantics of the adapted mapping may not be the "expected" one!

Loss of Semantics

Source':  LineItem(li, s, p, o, qty)
Source:   SuppPart(s, p), PartOrder(p, o)
Target:   PotentialSupp(s, o)

m:  SuppPart(s, p) ∧ PartOrder(p, o) -> PotentialSupp(s, o)
m': LineItem(li, s, p, o, qty) -> PotentialSupp(s, o)

• The original mapping m joins orders with suppliers.
• However, m' loses relevant suppliers.
  - It only pairs an order with a supplier provided they appear in the same LineItem tuple.
• To retain the original semantics, we must look in different tuples!

m'': LineItem(li, s, p, o, qty) ∧ LineItem(li', s', p, o', qty') -> PotentialSupp(s', o)
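The difference between m' and m'' can be checked on a tiny instance. The sketch below is only an illustration: the two LineItem tuples are made-up data, and the function names m_prime and m_double_prime are hypothetical, not taken from the talk.

```python
# Hypothetical instance of Source' (all values invented for illustration).
# Columns: (li, s, p, o, qty)
line_item = [
    (1, "s1", "p1", "o1", 10),
    (2, "s2", "p1", "o2", 5),
]


def m_prime(rows):
    """m': LineItem(li, s, p, o, qty) -> PotentialSupp(s, o)."""
    return {(s, o) for (_li, s, _p, o, _qty) in rows}


def m_double_prime(rows):
    """m'': LineItem(li, s, p, o, qty) AND LineItem(li', s', p, o', qty')
    -> PotentialSupp(s', o). The two atoms share the part p (a self-join)."""
    return {
        (s2, o1)
        for (_l1, _s1, p1, o1, _q1) in rows
        for (_l2, s2, p2, _o2, _q2) in rows
        if p1 == p2
    }


direct = m_prime(line_item)          # same-tuple pairs only
joined = m_double_prime(line_item)   # pairs across tuples sharing a part
missing = joined - direct            # suppliers that m' fails to export
```

Both suppliers sell part p1, so each is a potential supplier for both orders; m' only reports the pairs that happen to share a tuple, while m'' recovers the cross-tuple pairs as well.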

• The incremental approach is a "mechanical" procedure that makes local changes to the mapping.
• A sequence of good local changes may not necessarily yield the best global adaptation ...

Mapping Composition Approach

[Figure: e1 and e2 link Source' LineItem(li, s, p, o, qty) to Source SuppPart(s, p) and PartOrder(p, o); m links Source to Target PotentialSupp(s, o)]

• We look at the evolution globally:
  - Describe evolution through a schema mapping:
      e1: LineItem(l, s, p, o, q) -> SuppPart(s, p)
      e2: LineItem(l, s, p, o, q) -> PartOrder(p, o)
  - Define the adapted mapping to be a mapping from Source' to Target that is equivalent (e.g., same data movement) to the sequence of the evolution mapping and the original mapping.
  - The previous m'' satisfies the conditions for {e1, e2} and {m}.
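The "same data movement" condition can be spot-checked on a small instance: run e1 and e2 to materialize the old Source, apply m, and compare the result with applying m'' directly to Source'. This is a minimal sketch on invented data with hypothetical function names; agreement on one instance illustrates the equivalence but does not prove it.

```python
# Hypothetical Source' instance; columns are (li, s, p, o, qty).
line_item = [
    (1, "s1", "p1", "o1", 10),
    (2, "s2", "p1", "o2", 5),
    (3, "s3", "p2", "o3", 7),
]


def evolve(rows):
    """Evolution mapping {e1, e2}: rebuild the old Source from LineItem."""
    supp_part = {(s, p) for (_l, s, p, _o, _q) in rows}    # e1
    part_order = {(p, o) for (_l, _s, p, o, _q) in rows}   # e2
    return supp_part, part_order


def m(supp_part, part_order):
    """m: SuppPart(s, p) AND PartOrder(p, o) -> PotentialSupp(s, o)."""
    return {(s, o) for (s, p) in supp_part for (p2, o) in part_order if p == p2}


def m_double_prime(rows):
    """m'': self-join of LineItem on the part p, exporting (s', o)."""
    return {
        (s2, o1)
        for (_l1, _s1, p1, o1, _q1) in rows
        for (_l2, s2, p2, _o2, _q2) in rows
        if p1 == p2
    }


via_middle = m(*evolve(line_item))     # Source' -> Source -> Target
adapted = m_double_prime(line_item)    # Source' -> Target directly
```

On this instance both routes yield the same PotentialSupp facts, which is what the adapted mapping is required to guarantee without materializing the middle schema.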

• The composition approach is more systematic, with precise semantics, guaranteed to behave the "right" way in all situations.
• Although it may appear simple in the previous example, mapping composition poses challenges ...

Challenges in Composition Approach

• Mapping language:
  - Must handle nesting and complex types (as in XML Schema)
      * (details in the paper)
  - Furthermore, the usual mapping languages (GLAV, tuple-generating dependencies) are not closed under composition!
      * Recent extension that ensures composability: second-order tgds [FKPT04].
      * Main idea: add functions to gain the needed expressive power.
• Semantics and Algorithm
• Efficiency/Scalability

Next ...

Composition: Semantics and Algorithm

Composition: Semantics

• In mapping composition, we want to replace a sequence of schema mappings with one that is "equivalent" and avoids the middle schema.
• What does "equivalent" mean?
• There are two semantics that we considered:
  - Relationship semantics
      * More general
  - Transformation semantics
      * More suitable, specialized

Relationship Semantics

• Mappings can be viewed as describing relationships between instances over the two schemas:
      Rel(M12) = { (I1, I2) | (I1, I2) satisfies M12 }
• Composition of relationships:
      Rel(M12) ◦ Rel(M23) = { (I1, I3) | there is I2 such that (I1, I2) satisfies M12 and (I2, I3) satisfies M23 }
• [FKPT04, Melnik04] A mapping M13 is equivalent to the sequence of M12 and M23, under the relationship semantics, if:
      Rel(M13) = Rel(M12) ◦ Rel(M23)
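The composition of relationships is ordinary composition of binary relations. A minimal sketch, assuming each Rel(...) is given as a finite set of pairs and instances are stood in for by opaque labels (real instance spaces are infinite, so this only illustrates the definition; the names rel12, rel23, and compose are hypothetical):

```python
def compose(rel12, rel23):
    """Return {(i1, i3) | there is i2 with (i1, i2) in rel12 and (i2, i3) in rel23}."""
    return {(i1, i3) for (i1, i2) in rel12 for (j2, i3) in rel23 if i2 == j2}


# Hypothetical labels standing in for instances over schemas 1, 2, and 3.
rel12 = {("I1", "I2a"), ("I1", "I2b"), ("I1x", "I2c")}
rel23 = {("I2b", "I3"), ("I2c", "I3y")}

rel13 = compose(rel12, rel23)
```

Here I1 is related to I3 because the witness I2b exists in the middle; I2a contributes nothing because no pair in rel23 starts from it.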

Example: Semantics and Algorithm

S1:  Takes(name, course)
S2:  Student(sid, name), Enrolls(sid, course)
S3:  Takes'(sid, name, course)

M: Takes(n, c) -> Student(F(n), n) ∧ Enrolls(F(n), c)
   (second-order tgd [FKPT04]; F(n) is an "unknown" student id)
E: Student(s, n) ∧ Enrolls(s, c) -> Takes'(s, n, c)

1. Substitution
M13: Takes(n, c') ∧ Takes(n', c) ∧ F(n) = F(n') -> Takes'(F(n), n, c)

• M13 correctly captures the equivalent relationship between instances of S1 and S3.
  - Instances (and function F) can exist a priori.
  - A student n must be paired with a course c even when c is listed under a different student name n', provided the student id is the same: F(n) = F(n').

• However, if we assume that the function F is one-to-one, an important simplification can be made ...

M13: Takes(n, c') ∧ Takes(n', c) ∧ F(n) = F(n') -> Takes'(F(n), n, c)     (equivalent relationship)

2. Reduction: F(n) = F(n') implies n = n'
M'13: Takes(n, c') ∧ Takes(n, c) -> Takes'(F(n), n, c)

3. Minimization
M''13: Takes(n, c) -> Takes'(F(n), n, c)     (equivalent transformation)

• We can always make this assumption if mappings are meant to describe transformations (i.e., generation of a target instance).
  - F is a Skolem function assigning unique student ids: F(n)

Transformation Semantics

• A mapping is a process (in the spirit of data exchange [FKMP03]): I2 = M12(I1)
  - Each mapping formula is a "generator" of target facts.
  - Functions are one-to-one value generators.
• Theorem. Our composition algorithm produces the schema mapping with the equivalent transformation semantics:
      M13(I1) = M23(M12(I1))     (up to the renaming of nulls)
• Advantage of transformation semantics in adaptation: simpler and more intuitive formulas!
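The identity M13(I1) = M23(M12(I1)) can be illustrated with the Takes/Student/Enrolls example, using the minimized formula Takes(n, c) -> Takes'(F(n), n, c) as M13. In this sketch the Skolem term F(n) is modeled as a tagged tuple, which is one-to-one by construction; the sample data and all function names are assumptions made for illustration. A constant function is included to show why the one-to-one assumption matters.

```python
def skolem(name):
    """One-to-one value generator: the term F(name), modeled as a tagged tuple."""
    return ("F", name)


def constant(_name):
    """A deliberately non-injective F, used only as a counterexample."""
    return ("F",)


def apply_M(takes, f):
    """M: Takes(n, c) -> Student(F(n), n) AND Enrolls(F(n), c)."""
    student = {(f(n), n) for (n, _c) in takes}
    enrolls = {(f(n), c) for (n, c) in takes}
    return student, enrolls


def apply_E(student, enrolls):
    """E: Student(s, n) AND Enrolls(s, c) -> Takes'(s, n, c)."""
    return {(s, n, c) for (s, n) in student for (s2, c) in enrolls if s == s2}


def apply_M13(takes, f):
    """Minimized composition: Takes(n, c) -> Takes'(F(n), n, c)."""
    return {(f(n), n, c) for (n, c) in takes}


# Hypothetical S1 instance.
takes = {("Alice", "DB"), ("Alice", "OS"), ("Bob", "DB")}

sequential = apply_E(*apply_M(takes, skolem))   # M23(M12(I1))
composed = apply_M13(takes, skolem)             # M13(I1)

# With a non-injective F, the two-step route joins unrelated students.
sequential_bad = apply_E(*apply_M(takes, constant))
composed_bad = apply_M13(takes, constant)
```

With the injective Skolem function the two routes agree; with the constant function the sequential route also pairs Bob with OS, which the minimized formula does not produce, so the reduction step is only sound under the one-to-one assumption.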

Composition Algorithm: Further Details

• The substitution step is more complex than shown:
  - Must handle nesting
  - Generate parameterized rules for set types in the middle schema
  - Reuse some of the mapping-based query rewriting techniques [YP04]
• Minimization:
  - Good: it simplifies formulas and generates intuitive mappings (all this is enabled by the transformation semantics).
  - Bad: it can be expensive (same as tableau minimization) ...

Optimization and Experimental Results

Full Adaptation

• Compose "whole" schema mappings (compose all the formulas in the original mapping with all the formulas in the evolution mapping).
  - Inevitable when the schema evolution is drastic and affects most of the original mapping (non-incremental evolution)
  - Inefficient when the changes are small and localized (incremental evolution)

Compose Only When Necessary

Mapping Pruning:
1. Detect those parts (formulas) M'o of the original mapping Mo that are affected by evolution.
   - Only M'o needs to be adapted.
2. Only a subset M'e of the formulas in the evolution mapping Me plays a role in the composition with M'o.
   - The rest are redundant (PTIME containment-like analysis, see paper).
3. Compose M'o with M'e.

Big performance gain for incremental evolution and overall.
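The three pruning steps can be sketched at the level of relation names. This is a simplified syntactic approximation, not the containment-like analysis from the paper: it assumes the set of changed relations is given, treats a formula of Mo as affected if it reads a changed relation, and keeps a formula of Me if it produces a relation that an affected formula reads. The Formula type, the prune function, and the Customer/Client relations are hypothetical.

```python
from typing import NamedTuple


class Formula(NamedTuple):
    name: str
    reads: frozenset    # relations on the left-hand side
    writes: frozenset   # relations on the right-hand side


def prune(original, evolution, changed):
    """Return (M'o, M'e): the formulas that actually need to be composed."""
    # Step 1: formulas of Mo that read a changed relation are affected.
    affected = [f for f in original if f.reads & changed]
    # Step 2: keep only evolution formulas feeding a relation M'o reads.
    needed = frozenset().union(*(f.reads for f in affected))
    relevant = [e for e in evolution if e.writes & needed]
    return affected, relevant


# Hypothetical mappings (Customer/Client are invented for illustration).
Mo = [
    Formula("m", frozenset({"SuppPart", "PartOrder"}), frozenset({"PotentialSupp"})),
    Formula("m2", frozenset({"Customer"}), frozenset({"Client"})),
]
Me = [
    Formula("e1", frozenset({"LineItem"}), frozenset({"SuppPart"})),
    Formula("e2", frozenset({"LineItem"}), frozenset({"PartOrder"})),
    Formula("e3", frozenset({"Customer"}), frozenset({"Customer"})),
]

affected, relevant = prune(Mo, Me, changed=frozenset({"SuppPart", "PartOrder"}))
# Step 3 would compose only `affected` with `relevant`; m2 is kept as-is.
```

In this sketch only m is recomposed, against e1 and e2, while m2 and e3 are left out of the composition entirely, which is the source of the savings for localized changes.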

Analysis of Evolution Scenarios

[Figure: results based on Clio]

Benefits = 1 - adapted mappings / (blank-sheet mappings + missed mappings)

• We also have synthetic scenarios that show scalability of Mapping Pruning with increasing schema and mapping complexity.

Conclusion

• We studied:
  - Mapping composition techniques for mapping adaptation
  - Transformation semantics in the context of schema evolution
• Designed and implemented a practical adaptation system
  - Mapping pruning (schema evolution specific)
• To do:
  - Optimization of composition in general
  - Improve performance of minimization