Semantic Matching Pavel Shvaiko Paper with Fausto Giunchiglia


Semantic Matching
Pavel Shvaiko
Paper with Fausto Giunchiglia
Research group (alphabetically ordered): Fausto Giunchiglia, Pavel Shvaiko, Mikalai Yatskevich, Ilya Zaihrayeu
Stanford University, October 31, 2003

2 Outline
  • Matching
  • Syntactic Matching
  • Semantic Matching
  • On Implementing Semantic Matching
  • Conclusions

3 MATCHING

4 Application Domains
  • Generic Model Management
  • Schema integration
  • Data warehouses
  • E-commerce
  • Data Coordination in P2P systems, Semantic Web

5 Example of Matching
[Figure: fragments of the www.yahoo.com and www.google.com directories. Node labels appearing in the figure: Arts&Humanities, Art History, Music, Organizations, History, Design Art, Architecture, Baroque. Example mappings are annotated with a semantic relation Sr and a similarity coefficient Sc (Sc = 1.0 and Sc = 0.9).]

6 Matching
Match is an operator that takes two graph-like structures (e.g., database schemas or ontologies) and produces a mapping between the elements of the two graphs that correspond semantically to each other.

7 Matching
The problem of matching can be decomposed into two steps:
  1. Extract graphs from the data and conceptual models
  2. Match the resulting graphs (generic matching)

8 Matching
  • A mapping element is a 4-tuple $\langle m_{ID}, N_i^1, N_j^2, R \rangle$, where:
    - $m_{ID}$ is a unique identifier of the given mapping element;
    - $N_i^1$ is the i-th node of the first graph;
    - $N_j^2$ is the j-th node of the second graph;
    - $R$ specifies a similarity relation of the given nodes.
  • A mapping is a set of mapping elements.
  • Matching is the process of discovering mappings between two graphs through the application of a matching algorithm.
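
The 4-tuple above can be written down as a small data structure. This is only a sketch: the class name `MappingElement`, its field names, and the sample values are illustrative assumptions, not names taken from the talk.

```python
from dataclasses import dataclass
from typing import Set, Union


@dataclass(frozen=True)
class MappingElement:
    """One mapping element <m_ID, N_i^1, N_j^2, R> (field names are hypothetical)."""
    m_id: str                    # unique identifier of the mapping element
    node1: str                   # i-th node of the first graph
    node2: str                   # j-th node of the second graph
    relation: Union[float, str]  # coefficient in [0, 1] or a semantic relation


# A mapping is a set of mapping elements; frozen dataclasses are hashable.
mapping: Set[MappingElement] = {
    MappingElement("m1", "phone", "telephone", 0.7),  # syntactic-style R
    MappingElement("m2", "phone", "telephone", "="),  # semantic-style R
}
```

Keeping `relation` as either a number or a symbol lets the same container hold the results of both kinds of matching discussed in the following slides.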

9 Matching: Syntactic AND Semantic Matching
  • Syntactic Matching: R is computed between labels at nodes; $R = \{x \in [0, 1]\}$
  • Semantic Matching: R is computed between concepts at nodes; $R = \{=, \cap, \bot, \supseteq, \subseteq\}$

10 SYNTACTIC MATCHING

11 Syntactic Matching
  • A mapping element is a 4-tuple $\langle m_{ID}, L_i^1, L_j^2, R \rangle$, where:
    - $L_i^1$ is the label at the i-th node of the first graph;
    - $L_j^2$ is the label at the j-th node of the second graph;
    - $R$ specifies a similarity relation in the form of a coefficient, which measures the similarity between the labels of the given nodes.
  • Example: R is a similarity coefficient in [0, 1]: R = <m21, telephone, 0.7>
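
To make the idea of a label-level coefficient concrete, here is a minimal sketch that scores two labels with Python's standard `difflib.SequenceMatcher`. This measure is a stand-in chosen for illustration; it is not the measure used by Cupid or by any of the systems named in the talk, and the function name `label_similarity` is an assumption.

```python
from difflib import SequenceMatcher


def label_similarity(label1: str, label2: str) -> float:
    """Return a similarity coefficient in [0, 1] computed from the labels only."""
    # ratio() is 2*M/T, where M is the number of matched characters and
    # T is the total length of both strings.
    return SequenceMatcher(None, label1.lower(), label2.lower()).ratio()


coefficient = label_similarity("phone", "telephone")
```

The coefficient depends only on the characters of the two labels, which is exactly what makes this kind of matching syntactic.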

12 The State of the Art
  • Cupid is a hybrid matching prototype. It exploits linguistic and structural schema matching heuristics, and computes similarity coefficients between nodes of the trees.
  • Similarity Flooding is a hybrid matching prototype. It uses fix-point computation to determine correspondences between nodes of the graphs.
  • COMA is a composite matching prototype. It provides an extensible library of different matchers which manipulate DAGs and supports various ways of combining final results.
As far as we know, so far only syntactic matching…

13 SEMANTIC MATCHING

14 Semantic Matching
  • A mapping element is a 4-tuple $\langle m_{ID}, C_i^1, C_j^2, R \rangle$, where:
    - $C_i^1$ is the concept of the i-th node of the first graph;
    - $C_j^2$ is the concept of the j-th node of the second graph;
    - $R$ specifies a similarity relation in the form of a semantic relation between the extensions of the concepts at the given nodes.
  • Possible R's: equality $\{=\}$, overlapping $\{\cap\}$, mismatch $\{\bot\}$, more general/specific $\{\supseteq, \subseteq\}$
  • Example: R = <m21, telephone, {=}>
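
Since R is defined between the extensions of two concepts, the five relations can be illustrated with ordinary sets. The sketch below assumes extensions are given explicitly as finite sets, which is a simplification made for illustration; the enum and function names are hypothetical.

```python
from enum import Enum
from typing import AbstractSet, Hashable


class Relation(Enum):
    EQUAL = "equality"
    MORE_GENERAL = "more general"
    MORE_SPECIFIC = "more specific"
    OVERLAP = "overlapping"
    MISMATCH = "mismatch"


def semantic_relation(ext1: AbstractSet[Hashable],
                      ext2: AbstractSet[Hashable]) -> Relation:
    """Return the strongest set relation holding between two extensions."""
    if ext1 == ext2:
        return Relation.EQUAL
    if ext1 > ext2:   # proper superset: the first concept is more general
        return Relation.MORE_GENERAL
    if ext1 < ext2:   # proper subset: the first concept is more specific
        return Relation.MORE_SPECIFIC
    if ext1 & ext2:   # some shared instances, neither contains the other
        return Relation.OVERLAP
    return Relation.MISMATCH
```

In practice extensions are rarely available as explicit sets, which is why the talk goes on to reason about concepts with propositional formulas instead.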

15 Examples: Analysis of Ancestors. Case 1
  • Suppose that we want to match nodes $5^1$ and $1^2$ (node 5 of the first graph and node 1 of the second).
  • Cupid does not find a similarity coefficient between the nodes under consideration, due to the significant differences in the structure of the given graphs.
  • Semantic matching: the concept denoted by the label at node $5^1$ is $C_C^1$, while the concept at node $5^1$ is $C_5^1 = C_A^1 \cap C_C^1$. The concept at node $1^2$ is $C_1^2 = C_C^2$. Thus, given $C_C^1 = C_C^2$, we have $C_5^1 \subseteq C_1^2$.

16 Examples: Analysis of Ancestors. Case 2
  • Suppose that we want to match nodes $5^1$ and $5^2$ (node 5 of the first graph and node 5 of the second).
  • Cupid: R = 0.86. This is because of the identity of the labels: $A^1 = A^2$, $C^1 = C^2$.
  • Semantic matching: the concept at node $5^1$ is $C_5^1 = C_A^1 \cap C_C^1$, while the concept at node $5^2$ is $C_5^2 = C_A^2 \cap C_C^2$. Since $C_A^1 = C_A^2$ and $C_C^1 = C_C^2$, we have $C_5^1 = C_5^2$.
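
Both cases compute the concept at a node as the intersection of the concepts of the labels on the path to that node. The sketch below replays the two cases with toy extensions. The document ids, the assumption that label A lies above label C in the graphs where both occur, and the helper name `concept_at_node` are all illustrative and not taken from the talk's figures.

```python
from functools import reduce
from typing import Dict, FrozenSet, Sequence


def concept_at_node(path_labels: Sequence[str],
                    label_concepts: Dict[str, FrozenSet[int]]) -> FrozenSet[int]:
    """Intersect the extensions of all labels on the path from the root to the node."""
    extensions = [label_concepts[label] for label in path_labels]
    return reduce(frozenset.intersection, extensions)


# Toy extensions (hypothetical document ids); element-level matching is
# assumed to have found the same concepts for labels A and C in both graphs.
graph1 = {"A": frozenset({1, 2, 3, 4}), "C": frozenset({3, 4, 5})}
graph2 = dict(graph1)

c5_1 = concept_at_node(["A", "C"], graph1)  # node 5 of graph 1: C under A
c1_2 = concept_at_node(["C"], graph2)       # node 1 of graph 2: C at the root
c5_2 = concept_at_node(["A", "C"], graph2)  # node 5 of graph 2: C under A

case1_more_specific = c5_1 <= c1_2  # Case 1: subset relation
case2_equal = c5_1 == c5_2          # Case 2: equality
```

Case 1 holds for any choice of extensions, because intersecting with the concept of A can only shrink the concept of C.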

17 ON IMPLEMENTING SEMANTIC MATCHING

18 On Implementation
Semantic Matching
  • Element-level
    - Weak Semantics Techniques
    - Strong Semantics Techniques
  • Structure-level

19 Element-level Semantic Matching
  • Weak Semantics Techniques
    - Analysis of strings {=}: <phone, telephone, {=}>
    - Analysis of data types {=, , }: <string, integer, { }>, <integer, real, { }>
    - Analysis of soundex {=}: <Fausto, Phausto, {=}>
  • Strong Semantics Techniques
    - Precompiled thesaurus (syn key): <Discount, Rebate, {=}>
    - WordNet: <Art_#1, Humanities_#1, { }>, where #1 is sense number 1 of the word Art according to WordNet

20 Element-level Semantic Matching (cont.)
Semantic Relations via WordNet
  • Equality: one concept is equal to another if there is at least one sense of the first concept which is a synonym of the second
  • Overlapping: one concept overlaps with the other if they have some senses in common
  • Mismatch: two concepts are mismatched if they have no sense in common
  • More general: one concept is more general than the other iff there exists at least one sense of the first concept that has a sense of the other as a hyponym or meronym
  • Less general: one concept is less general than the other iff there exists at least one sense of the first concept that has a sense of the other concept as a hypernym or holonym
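
The rules above can be tried out on a tiny hand-made sense inventory. The sketch below does not call WordNet: the dictionaries `SENSES` and `HYPONYMS`, the sense identifiers, and the function name are hypothetical toy data. It covers only the equality, more general, and less general rules, uses direct hyponym links only, and returns `None` when none of these three rules fires.

```python
from typing import Dict, Optional, Set

# Toy sense inventory (hypothetical): word -> set of sense identifiers.
SENSES: Dict[str, Set[str]] = {
    "discount": {"rebate#1"},
    "rebate": {"rebate#1"},
    "animal": {"animal#1"},
    "dog": {"dog#1"},
}

# Toy lexical links (hypothetical): sense -> its direct hyponym or meronym senses.
HYPONYMS: Dict[str, Set[str]] = {"animal#1": {"dog#1"}}


def element_relation(word1: str, word2: str) -> Optional[str]:
    """Apply the equality / more general / less general rules to two words."""
    senses1, senses2 = SENSES[word1], SENSES[word2]
    if senses1 & senses2:
        return "="  # a shared sense makes the words synonyms
    if any(HYPONYMS.get(s, set()) & senses2 for s in senses1):
        return "more general"  # a sense of word1 has a sense of word2 as hyponym
    if any(HYPONYMS.get(s, set()) & senses1 for s in senses2):
        return "less general"  # a sense of word2 has a sense of word1 as hyponym
    return None  # none of the three rules applies
```

For example, `element_relation("discount", "rebate")` yields `"="`, mirroring the thesaurus example given earlier in the talk.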

21 Structure-level Semantic Matching
  • We translate the matching problem, namely the two graphs (in particular, the pair of nodes submitted to matching), into a propositional formula and then check its validity.
  • We check validity using SAT.
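
Checking validity with a satisfiability test rests on one fact: a formula is valid exactly when its negation is unsatisfiable. The sketch below shows this with a brute-force truth-table search standing in for a real SAT solver. The search is an assumption made to keep the example self-contained; it is not the solver used by the authors.

```python
from itertools import product
from typing import Callable, Dict, Sequence

Formula = Callable[[Dict[str, bool]], bool]


def is_satisfiable(formula: Formula, variables: Sequence[str]) -> bool:
    """Try every truth assignment; stop at the first one that satisfies the formula."""
    for values in product([False, True], repeat=len(variables)):
        if formula(dict(zip(variables, values))):
            return True
    return False


def is_valid(formula: Formula, variables: Sequence[str]) -> bool:
    """A formula is valid iff its negation has no satisfying assignment."""
    return not is_satisfiable(lambda env: not formula(env), variables)


def modus_ponens(env: Dict[str, bool]) -> bool:
    # (p and (p -> q)) -> q
    premise = env["p"] and ((not env["p"]) or env["q"])
    return (not premise) or env["q"]


def plain_implication(env: Dict[str, bool]) -> bool:
    # p -> q, which fails when p is true and q is false
    return (not env["p"]) or env["q"]
```

Exhaustive enumeration grows exponentially with the number of variables, so it is only suitable for small illustrative formulas.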

22 Semantic Matching Algorithm
  1. Extract the two graphs
  2. Compute element-level semantic matching
  3. Compute concepts at nodes
  4. Construct the propositional formula
  5. Run SAT
  6. Perform iterations

23 Semantic Matching Algorithm: Example – (1) Extract the two graphs
  • In the case of RDB, XML and OODB schemas, it is necessary to extract useful semantic information, for instance in the form of ontologies

24 Semantic Matching Algorithm: Example – (2) Element-level semantic matching
For each node, compute the semantic relations holding among all the concepts denoted by the labels at the nodes under consideration:
$C_A^1 = C_A^2$, $C_B^1 = C_B^2$, $C_C^1 = C_C^2$, $C_D^1 = C_D^2$, $C_E^1 = C_E^2$

25 Semantic Matching Algorithm: Example – (3) Compute concepts at nodes
Suppose we want to find a semantic relation between nodes $5^1$ and $1^2$ (node 5 of the first graph and node 1 of the second).
[Figure: the two graphs, with a "?" marking the relation sought and the concepts at nodes labeled $C_1^1 = C_A^1$, $C_5^1 = C_A^1 \cap C_C^1$, $C_1^2 = C_C^2$.]

26 Semantic Matching Algorithm: Example – (4) Construct the propositional formula
We translate all the semantic relations computed in step 2 into propositional formulas under the following rules:
  • $C_A^1 \supseteq C_A^2$ translates to $C_A^2 \rightarrow C_A^1$
  • $C_A^1 = C_A^2$ translates to $C_A^1 \leftrightarrow C_A^2$
  • $C_A^1 \bot C_A^2$ translates to $\neg(C_A^1 \wedge C_A^2)$
  • $C_A^1$ … ?
From step 2 we have $C_C^1 \leftrightarrow C_C^2$. We want to prove that $C_5^1 \subseteq C_1^2$ (we guess the relation between the nodes at this stage), that is, $(C_A^1 \wedge C_C^1) \rightarrow C_C^2$. The resulting formula is:
$(C_C^1 \leftrightarrow C_C^2) \rightarrow ((C_A^1 \wedge C_C^1) \rightarrow C_C^2)$ …

27 Semantic Matching Algorithm: Example – (5) Run SAT
In order to prove that $(C_C^1 \leftrightarrow C_C^2) \rightarrow ((C_A^1 \wedge C_C^1) \rightarrow C_C^2)$ is valid, we prove that its negation is unsatisfiable:
$(C_C^1 \leftrightarrow C_C^2) \wedge \neg((C_A^1 \wedge C_C^1) \rightarrow C_C^2)$
SAT returns FALSE. Thus, $C_5^1 \subseteq C_1^2$.
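
Steps 4 and 5 of the running example can be replayed in a few lines. The sketch below encodes the element-level result for label C as an axiom, guesses each candidate relation between the two nodes in turn, and keeps the guesses whose negation is unsatisfiable under that axiom. The variable names, the candidate table, and the exhaustive search used in place of a SAT solver are assumptions made for illustration.

```python
from itertools import product
from typing import Callable, Dict, List

Env = Dict[str, bool]
VARIABLES = ["CA1", "CC1", "CC2"]  # propositional variables for the label concepts


def implies(p: bool, q: bool) -> bool:
    return (not p) or q


def axioms(env: Env) -> bool:
    # Step 2 result for label C, translated as an equivalence.
    return env["CC1"] == env["CC2"]


def c5_1(env: Env) -> bool:
    return env["CA1"] and env["CC1"]  # concept at node 5 of the first graph


def c1_2(env: Env) -> bool:
    return env["CC2"]                 # concept at node 1 of the second graph


# Candidate relations between the two nodes, as propositional goals.
CANDIDATES: Dict[str, Callable[[Env], bool]] = {
    "equality": lambda e: c5_1(e) == c1_2(e),
    "more specific": lambda e: implies(c5_1(e), c1_2(e)),
    "more general": lambda e: implies(c1_2(e), c5_1(e)),
    "mismatch": lambda e: not (c5_1(e) and c1_2(e)),
}


def negation_is_unsatisfiable(goal: Callable[[Env], bool]) -> bool:
    """True iff no assignment satisfies the axioms while falsifying the goal."""
    for values in product([False, True], repeat=len(VARIABLES)):
        env = dict(zip(VARIABLES, values))
        if axioms(env) and not goal(env):
            return False  # found a model of the negated formula
    return True


holding: List[str] = [name for name, goal in CANDIDATES.items()
                      if negation_is_unsatisfiable(goal)]
```

Only the "more specific" guess survives, which agrees with the conclusion of the example that the concept at node 5 of the first graph is contained in the concept at node 1 of the second.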

28 Example: Cupid vs. Semantic Matching
[Figure: fragments of the www.google.com and www.yahoo.com directories with the relations found between their nodes. Node labels appearing in the figure: Arts, Arts&Humanities, Art History, Music, Organizations, Design Art, Architecture, History, Baroque.]

29 Conclusions
  • We have made a rational reconstruction of the major matching problems and articulated them in terms of the more generic problem of matching graphs
  • We have identified semantic matching as a new approach for performing generic matching
  • We have proposed an implementation of semantic matching using SAT

30 Future Work
  • Extend to a full graph matcher
  • How to extract semantics from schemas
  • Study how to take into account attributes and instances
  • Develop an efficient implementation of the system
  • Do a thorough testing of the system

31 References
  • Project website: http://www.dit.unitn.it/~p2p/
  • F. Giunchiglia, P. Shvaiko. "Semantic Matching". Technical Report #DIT-03-013. Also to appear in The Knowledge Engineering Review journal. Short version in Proceedings of the Semantic Integration workshop at ISWC'03.
  • F. Giunchiglia, I. Zaihrayeu. "Making peer databases interact – a vision for an architecture supporting data coordination". In Proc. of the Conference of Information Agents (CIA 2002), Madrid, 2002.