YAGO – A Core of Semantic Knowledge Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum (Max-Planck Institute for Computer Science Saarbrücken/Germany) Fabian M. Suchanek YAGO - A Core of Semantic Knowledge 1
Overview
- Motivation
- The Yago ontology
  - Content
  - Model
  - Extension
- Conclusion
The Truth about Elvis
Elvis is alive! He works as an astronaut in NASA's special security program.
Usual solution
Query: "Which NASA astronaut was born when Elvis was born?"
Yields only rubbish. Reasons:
1. Google participates in the conspiracy
2. Google does not search knowledge, but Web sites
Solution: An ontology
[Figure: an ontology graph in four layers, labeled Classes, Relations, Individuals, and Words. Classes: entity, person, astronaut, connected by "subclass" edges. Individuals: Elvis and an unknown astronaut "?", attached to classes by "is a" edges, each with a "born" edge to 1935. Words: "Elvis Presley" and "The King", attached by "means" edges.]
Where do we get the ontology from?
Previous approaches:
- Assemble the ontology manually (WordNet, SUMO, GeneOntology)
  Problem: usually low coverage (MPI is in none of these)
- Extract the ontology from corpora, e.g. the Web (KnowItAll, Espresso, Snowball, LEILA)
  Problem: usually low accuracy (50%-92%)
Where do we get the ontology from?
YAGO approach:
- Assemble the ontology from Wikipedia (=> good coverage)
- Use the category system of Wikipedia (=> good accuracy)
Exploiting the Wikipedia category system
[Figure: a mock Wikipedia article on Elvis Presley with placeholder body text and a category box.]
- Exploit relational categories: 1935_births gives (Elvis Presley, born, 1935)
- Exploit conceptual categories: American_singers gives (Elvis Presley, is a, American_singer)
- Avoid administrative categories: Disputed_articles
- Avoid thematic categories: Rock'n_Roll_Music
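The four category kinds named on these slides can be illustrated with a small classifier. This is a minimal sketch, not the authors' method: the regular expressions, the administrative list, the plural-ending test, and the relation names `bornInYear`/`diedInYear` as outputs of the patterns are all assumptions made for illustration.

```python
import re

# Hypothetical rules for this sketch; they are not taken from the slides.
RELATIONAL_PATTERNS = [
    (re.compile(r"^(\d{4})_births$"), "bornInYear"),
    (re.compile(r"^(\d{4})_deaths$"), "diedInYear"),
]
ADMINISTRATIVE = {"Disputed_articles"}


def classify_category(category):
    """Return (kind, payload) for one Wikipedia category name."""
    for pattern, relation in RELATIONAL_PATTERNS:
        match = pattern.match(category)
        if match:
            return "relational", (relation, match.group(1))
    if category in ADMINISTRATIVE:
        return "administrative", None
    # Assumed heuristic: a plural last word signals a class of things.
    head = category.split("_")[-1]
    if head.endswith("s") and not head.endswith("ss"):
        return "conceptual", category[:-1]  # drop the plural "s"
    return "thematic", None


def facts_for(entity, categories):
    """Keep only the facts from relational and conceptual categories."""
    facts = []
    for category in categories:
        kind, payload = classify_category(category)
        if kind == "relational":
            relation, value = payload
            facts.append((entity, relation, value))
        elif kind == "conceptual":
            facts.append((entity, "is_a", payload))
    return facts


elvis_categories = [
    "1935_births", "American_singers", "Disputed_articles", "Rock'n_Roll_Music",
]
print(facts_for("Elvis_Presley", elvis_categories))
```

The last-word test is deliberately crude: a name such as American_singers_of_Jewish_origin, where the plural noun is followed by a modifier, would need real linguistic analysis.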
The Upper Model
[Figure: Elvis "is a" American_singer and "born" 1935. The link from American_singer up to person and entity is marked "?".]
The Upper Model: From Wikipedia?
[Figure: candidate superclasses of American_singer in the Wikipedia category hierarchy: People_by_occupation, Social_group, Business. The link is still marked "?".]
The Upper Model: From WordNet?
[Figure: candidate WordNet senses for American_singer: Person#3 and Singer#1 ... Singer#17.]
The Upper Model: From WordNet?
[Figure: for American_singers_of_Jewish_origin the candidates are Person#3, Origin#7, and Singer#1 ... Singer#17.]
The YAGO ontology
[Figure: Elvis "is a" American_singer; American_singer "subclass" Singer#1; Singer#1 "subclass" Person#3. The word "singer" "means" Singer#1, the word "Elvis Presley" "means" Elvis, and Elvis "born" 1935.]
The YAGO ontology: Accuracy
Relation        Accuracy
subclass        97.70% +/- 1.59%
is a            94.54% +/- 2.36%
familyName      97.81% +/- 1.75%
givenName       97.62% +/- 2.08%
establishedIn   90.84% +/- 4.28%
bornInYear      93.14% +/- 3.71%
diedInYear      98.72% +/- 1.30%
locatedIn       98.41% +/- 1.52%
politicianOf    92.43% +/- 3.93%
writtenInYear   94.35% +/- 3.33%
hasWonPrize     98.47% +/- 1.57%
The YAGO ontology: Number of Facts
Ontologies should not be judged purely by the number of facts! This is just an informational overview.
[Figure: bar chart of the number of facts in KnowItAll, SUMO, WordNet, OpenCyc, Cyc, and Yago. Values shown: 6,000; 2,000; 30,000; 60,000; 200,000; 300,000.]
The Yago Model: Why binary is not enough
(Elvis, is_a, singer)
- But only from 1953 to 1977
- We know this from Wikipedia
The Yago Model: Why binary is not enough
#1 (Elvis, is_a, singer)
#2 (#1, time, 1953-1977)
#3 (#1, source, Wikipedia)
The Yago model formally
A YAGO ontology over
- a set of relations R
- a set of common entities C
- a set of fact identifiers I
is a function $I \to (R \cup C \cup I) \times R \times (R \cup I \cup C)$
Example:
#1 (Elvis, is_a, singer)
#2 (#1, time, 1953-1977)
#3 (#1, source, Wikipedia)
We can talk about
- facts: (#1, source, Wikipedia)
- additional arguments: (#1, time, 1953-1977)
- relations: (time, hasRange, time_interval)
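One way to picture this model is as a dictionary from fact identifiers to triples, where a triple may point at another fact's identifier. The sketch below assumes that representation; the identifier `#4` and the helper `statements_about` are hypothetical names introduced only for illustration.

```python
from typing import Dict, Tuple

Fact = Tuple[str, str, str]  # (first argument, relation, second argument)

# The ontology as a function from fact identifiers to triples.
# "#4" is a made-up identifier; the slide lists that fact without one.
ontology: Dict[str, Fact] = {
    "#1": ("Elvis", "is_a", "singer"),
    "#2": ("#1", "time", "1953-1977"),
    "#3": ("#1", "source", "Wikipedia"),
    "#4": ("time", "hasRange", "time_interval"),
}


def statements_about(facts: Dict[str, Fact], subject: str) -> Dict[str, str]:
    """Collect {relation: value} for every fact whose first argument is subject."""
    return {rel: obj for first, rel, obj in facts.values() if first == subject}


# Facts about a fact: when and according to whom Elvis was a singer.
print(statements_about(ontology, "#1"))
# Facts about a relation: the declared range of "time".
print(statements_about(ontology, "time"))
```

Because identifiers, entities, and relation names share the argument positions, the same lookup serves facts about facts and facts about relations.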
The Yago model: Logical aspects
Axioms:
(x, is_a, y), (y, subclass, z) => (x, is_a, z)
...
[Figure: singer "subclass" person; Elvis "is a" singer, so Elvis "is a" person.]
The Yago model: Logical aspects
Axioms:
(x, is_a, y), (y, subclass, z) => (x, is_a, z)
...
- Derive facts: from {f1, ..., f5} the axioms yield {f1, ..., f10} (finite, unique)
- Eliminate facts: dropping derivable facts from {f1, ..., f5} yields {f1, f2, f3} (finite, unique)
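The two directions on this slide, deriving facts and eliminating them, can be sketched for the single axiom shown. This is an illustrative sketch only: it implements just the is_a/subclass rule, the function names `derive` and `eliminate` are hypothetical, and the greedy elimination does not by itself prove the uniqueness that the slide states.

```python
def derive(facts):
    """Close a set of triples under (x is_a y), (y subclass z) => (x is_a z)."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for x, r1, y in list(closure):
            if r1 != "is_a":
                continue
            for y2, r2, z in list(closure):
                if r2 == "subclass" and y2 == y and (x, "is_a", z) not in closure:
                    closure.add((x, "is_a", z))
                    changed = True
    return closure


def eliminate(facts):
    """Drop every fact that the remaining facts still derive."""
    base = set(facts)
    for fact in sorted(facts):
        rest = base - {fact}
        if fact in derive(rest):
            base = rest
    return base


stated = {
    ("Elvis", "is_a", "singer"),
    ("singer", "subclass", "person"),
}
full = derive(stated)
print(sorted(full))             # includes ("Elvis", "is_a", "person")
print(sorted(eliminate(full)))  # back to the two stated facts
```

Storing only the reduced set and deriving the rest on demand keeps the stored ontology small without losing any answer.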
Extending the Ontology
Whom did Elvis marry?
Pattern: X married Y
Text: Elvis married Priscilla
Extending the Ontology with LEILA
Whom did Elvis marry?
Pattern: X married Y (X linked as subj, Y linked as obj)
Text: Elvis, the great rock star, married Priscilla (Elvis linked as subj, Priscilla linked as obj)
Extending the Ontology
[Figure: Ontology (YAGO) and Information Extraction (LEILA).]
The Truth about Elvis
Which astronaut was born in the same year as Elvis?
http://www.mpi-inf.mpg.de/~suchanek/downloads/yago/
Enter your Yago Query:
  "Elvis Presley" bornInYear $year
  $astro bornInYear $year
  $astro isa astronaut
20 results
The Truth about Elvis
Which astronaut codenamed "Roger" was born in the same year as Elvis?
http://www.mpi-inf.mpg.de/~suchanek/downloads/yago/
Enter your Yago Query:
  "Elvis Presley" bornInYear $year
  $astro bornInYear $year
  "Roger" givenNameOf $astro
  $astro isa astronaut
$astro = Roger_Chaffee
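A query of this shape is a list of triple patterns that share variables. The sketch below shows one simple way such patterns could be evaluated over a handful of triples. It is not the Yago query engine: the functions `match` and `query` are hypothetical, the fact list is toy data, and the entity name Elvis_Presley stands in for the quoted word "Elvis Presley" used on the slide.

```python
def match(pattern, fact, binding):
    """Extend binding so that pattern equals fact, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, fact):
        if term.startswith("$"):
            # A variable must keep the value it was first bound to.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def query(patterns, facts):
    """Join all patterns; return one binding dict per answer."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for fact in facts:
                extended = match(pattern, fact, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# Toy facts for illustration only.
FACTS = [
    ("Elvis_Presley", "bornInYear", "1935"),
    ("Roger_Chaffee", "bornInYear", "1935"),
    ("Roger_Chaffee", "isa", "astronaut"),
    ("Roger", "givenNameOf", "Roger_Chaffee"),
    ("Neil_Armstrong", "bornInYear", "1930"),
    ("Neil_Armstrong", "isa", "astronaut"),
]

PATTERNS = [
    ("Elvis_Presley", "bornInYear", "$year"),
    ("$astro", "bornInYear", "$year"),
    ("Roger", "givenNameOf", "$astro"),
    ("$astro", "isa", "astronaut"),
]

for answer in query(PATTERNS, FACTS):
    print(answer["$astro"])
```

Each pattern narrows the set of bindings produced by the previous ones, so the shared variables `$year` and `$astro` act as join conditions.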
Conclusions
- Yago is based on a logically clean model
- Yago has an accuracy of around 95%
- Yago is 3 times larger than the largest competitor
- Elvis is alive
Reference
For all details, please refer to our technical report "Yago – A Core of Semantic Knowledge" (Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum), available at http://www.mpii.mpg.de/~suchanek
BibTeX:
@TECHREPORT{yagotr,
  AUTHOR = {Suchanek, Fabian and Kasneci, Gjergji and Weikum, Gerhard},
  TITLE = {Yago: A Core of Semantic Knowledge},
  TYPE = {Research Report},
  INSTITUTION = {Max-Planck-Institut f{\"u}r Informatik},
  ADDRESS = {Stuhlsatzenhausweg 85, 66123 Saarbr{\"u}cken, Germany},
  NUMBER = {MPI-I-2006-5-006},
  YEAR = {2006}
}