MOMA - A Mapping-based Object Matching System
Andreas Thor, Erhard Rahm
University of Leipzig, Germany
http://dbs.uni-leipzig.de

Motivation
• Object matching
  - Identifying equal objects in (different) data sources
  - Most research for relational data
• Matching for ad-hoc data integration
  - Dynamic information fusion
  - User-oriented Web 2.0 applications
  - Trade-off: match quality vs. time (run time & set-up time)

MOMA Framework
• MOMA = Mapping-based Object Matching
  - Framework for object matching
  - Extensible matcher library
• Matching for ad-hoc data integration
  - Generic object representation
  - Instance-based mappings
• Key features
  - Combination of matchers / mappings
  - Re-use of mappings
  - Easy and flexible definition of match workflows

Objects and instance-based mappings
• Object instance (Publication@ACM)
    Id      1066157.1066283
    Title   Schema and ontology matching with COMA++
    Source  International Conference ...
    ...
• Association mapping
    Publication@ACM    Author@ACM
    1066157.1066283    P729451
    1066157.1066283    P707877
    ...
• Same mapping
    Publication@ACM    Publication@DBLP              Sim
    1066157.1066283    conf/sigmod/AumuellerDMR05    0.9
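
The object and mapping tables on this slide can be pictured as two small data structures. The sketch below is only an illustration: the class names `ObjectInstance` and `Mapping`, their fields, and the default similarity of 1.0 for association correspondences are assumptions, not taken from the MOMA implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectInstance:
    """A generic object: source-qualified id plus free-form attributes (hypothetical class)."""
    source: str                      # e.g. "Publication@ACM"
    id: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Mapping:
    """An instance-based mapping: (domain id, range id) -> similarity (hypothetical class)."""
    domain_source: str
    range_source: str
    kind: str                        # "same" or "association"
    correspondences: dict = field(default_factory=dict)

    def add(self, domain_id, range_id, sim=1.0):
        # Assumption: association correspondences carry no score, so default to 1.0.
        self.correspondences[(domain_id, range_id)] = sim


pub = ObjectInstance(
    "Publication@ACM",
    "1066157.1066283",
    {"Title": "Schema and ontology matching with COMA++"},
)

# Association mapping: publication -> its authors within the same data source.
authors = Mapping("Publication@ACM", "Author@ACM", "association")
authors.add(pub.id, "P729451")
authors.add(pub.id, "P707877")

# Same mapping: publication at ACM -> the equal publication at DBLP, with a similarity.
same = Mapping("Publication@ACM", "Publication@DBLP", "same")
same.add(pub.id, "conf/sigmod/AumuellerDMR05", 0.9)
```

Keeping both mapping kinds in one structure is a design choice of this sketch; it lets the same operators work on association and same mappings alike.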

MOMA Architecture
• Matching = generation of a Same-Mapping
[Figure: MOMA architecture. Input: LDS A and LDS B. Match workflow: Matcher 1, Matcher 2, ..., Matcher n -> Mapping Combiner (mapping operators: Merge, Compose, ...) -> Selection (Threshold, Best-N, ...) -> Same Mapping. Supporting components: Matcher Library (matcher implementations, e.g., attribute-based; match workflows), Mapping Cache, Mapping Repository.]
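
The selection step named in the architecture (Threshold, Best-N) can be sketched in a few lines. This is a minimal illustration, assuming a mapping is a plain dict from `(domain id, range id)` to a similarity, that Threshold keeps correspondences with similarity greater than or equal to the bound, and that Best-N is applied per domain object. The slide does not specify any of these details.

```python
def threshold(mapping, t):
    """Keep only correspondences whose similarity is at least t (assumed semantics)."""
    return {pair: sim for pair, sim in mapping.items() if sim >= t}


def best_n(mapping, n):
    """Keep, for every domain object, its n most similar range objects (assumed semantics)."""
    by_domain = {}
    for (a, b), sim in mapping.items():
        by_domain.setdefault(a, []).append((sim, b))
    result = {}
    for a, candidates in by_domain.items():
        # Highest similarity first; ties are broken by range id to stay deterministic.
        for sim, b in sorted(candidates, key=lambda c: (-c[0], c[1]))[:n]:
            result[(a, b)] = sim
    return result


# Hypothetical matcher output between two sources.
candidate = {("a1", "b1"): 0.9, ("a1", "b2"): 0.6, ("a2", "b2"): 0.4}

confident = threshold(candidate, 0.5)   # drops ("a2", "b2")
top_one = best_n(candidate, 1)          # one range object per domain object
```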

Match Strategies: Merge & Compose
1. Merge
   [Figure: two mappings between A1 and A2, one of them produced by an attribute-based matcher, are merged]
   • Overcome shortcomings (e.g., recall)
2. Compose
   [Figure: A1 -map1-> A3 (dblp) -map2-> A2; example objects p1, p2, p4; p''1, p''2, p''4; p'1, p'2, p'3]
   • Efficient re-use of mappings
   • Compose result can be refined
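
Both strategies can be sketched over mappings stored as dicts from `(domain id, range id)` to a similarity. The function signatures, the choice of `max` to combine similarities in `merge`, and the defaults in `compose` are assumptions made for illustration; the slide only names the operators.

```python
def merge(map1, map2, combine=max):
    """Union of two mappings between the same sources.

    Pairs found by only one mapping are kept, which is how merging can raise recall.
    Pairs found by both get a combined similarity (assumed here: the maximum).
    """
    result = dict(map1)
    for pair, sim in map2.items():
        result[pair] = combine(result[pair], sim) if pair in result else sim
    return result


def compose(map1, map2, path_sim=min, aggregate=max):
    """Compose A->B with B->C into A->C by joining on the shared B objects."""
    by_mid = {}
    for (b, c), s2 in map2.items():
        by_mid.setdefault(b, []).append((c, s2))
    paths = {}
    for (a, b), s1 in map1.items():
        for c, s2 in by_mid.get(b, []):
            # One compose path a -> b -> c; its similarity combines both steps.
            paths.setdefault((a, c), []).append(path_sim(s1, s2))
    return {pair: aggregate(sims) for pair, sims in paths.items()}


# Merge: an attribute-based result plus a second mapping between the same sources.
attr_based = {("p1", "p'1"): 0.9, ("p2", "p'2"): 0.8}
other = {("p2", "p'2"): 0.6, ("p4", "p'3"): 0.7}
merged = merge(attr_based, other)

# Compose: re-use two existing mappings that share an intermediate source.
a1_to_a3 = {("p1", "d1"): 1.0, ("p2", "d2"): 0.9}
a3_to_a2 = {("d1", "q1"): 0.8, ("d2", "q2"): 1.0}
composed = compose(a1_to_a3, a3_to_a2)
```

The composed mapping is computed without comparing any attribute values, which is where the efficiency of re-use comes from; a later selection or merge step could refine it.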

Match Strategies: Neighborhood
[Figure: sources A1, A2, B1, B2 connected by map1, map2, map3; example: publications p1, p2, ..., pn of venue v1 and publications p'1, p'2, ..., p'n of venue v'1 (dblp)]
• Same-Mapping based on "similarity of the associated objects"
  ⇒ Compose and sim-value ≈ #compose paths
• Generic matcher:
    PROCEDURE nhMatch ($Asso1, $Same2, $Asso3)
      $Temp := compose ($Asso1, $Same2, Min, Average);
      $Result := compose ($Temp, $Asso3, Min, Relative);
      RETURN $Result;
    END
  - Source- & mapping-independent
  - Re-use of existing mappings
• Very good results for 1:N relationship (e.g., Venue-Publication)
• Restriction of matching space for N:1 (Publication-Venue) and N:M (Author-Publication)
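
The nhMatch procedure can be traced with a small runnable sketch. Mappings are dicts from `(domain id, range id)` to a similarity. The slide does not define the `Average` and `Relative` aggregations; in particular, the Dice-style normalisation used for `relative` below is an assumption chosen only to make the example concrete, and all object ids are hypothetical.

```python
from statistics import mean


def compose(map1, map2, path_sim, aggregate):
    """Compose A->B with B->C; collect every path's similarity, then aggregate."""
    by_mid = {}
    for (b, c), s2 in map2.items():
        by_mid.setdefault(b, []).append((c, s2))
    paths = {}
    for (a, b), s1 in map1.items():
        for c, s2 in by_mid.get(b, []):
            paths.setdefault((a, c), []).append(path_sim(s1, s2))
    return aggregate(paths, map1, map2)


def average(paths, map1, map2):
    """Average of the path similarities per result pair."""
    return {pair: mean(sims) for pair, sims in paths.items()}


def relative(paths, map1, map2):
    """ASSUMPTION: Dice-style score 2 * sum(paths) / (weight leaving a + weight entering c)."""
    out_a, in_c = {}, {}
    for (a, _), s in map1.items():
        out_a[a] = out_a.get(a, 0.0) + s
    for (_, c), s in map2.items():
        in_c[c] = in_c.get(c, 0.0) + s
    return {(a, c): 2 * sum(sims) / (out_a[a] + in_c[c])
            for (a, c), sims in paths.items()}


def nh_match(asso1, same2, asso3):
    """Neighborhood matcher following the pseudocode on the slide."""
    temp = compose(asso1, same2, min, average)
    return compose(temp, asso3, min, relative)


# Venue matching via publications (1:N case).
venue_pubs_dblp = {("v1", "p1"): 1.0, ("v1", "p2"): 1.0, ("v1", "p3"): 1.0}
pub_same = {("p1", "q1"): 1.0, ("p2", "q2"): 1.0}           # p3 has no match
pubs_venue_acm = {("q1", "w1"): 1.0, ("q2", "w1"): 1.0,
                  ("q3", "w2"): 1.0, ("q4", "w1"): 1.0}

venue_same = nh_match(venue_pubs_dblp, pub_same, pubs_venue_acm)
```

Here `v1` and `w1` are linked by two compose paths, so they receive a high similarity, while `w2` shares no matched publication with `v1` and is not proposed at all.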

Summary & Future Work
• MOMA framework
  - Combination of matchers / mappings
  - Re-use of mappings
  - Flexible definition of match workflows
• Prototype implementation based on iFuice
  - Evaluation for bibliographic domain
• Dynamic information fusion for Web 2.0
  - Re-use enables collaborative approach
  - Flexible workflows allow quick set-up of data integration services (mash-up service)