A Semantic Web Ontology for Research Community
Oct. 25, 2006
In-Su Kang, Hanmin Jung, Seungwoo Lee, Pyung Kim, Heekwan Koo, Mikyoung Lee, Namang Kuh, Won-Kyung Sung
Korea Institute of Science and Technology Information
Information System Research Laboratory
Copyright © 2004-2005, KISTI

Contents
- Introduction
  - Semantic web & ontology
- Previous works
  - Ontology development methodology
  - Ontology for research community
- OFK ontology (for research community)
  - Ontology schema
  - Ontology instance
  - Instance management by URI-server
  - Instance representation
- Construction of OFK ontology
- Conclusion

Introduction - background
- Current web
  - Human-oriented, syntactic
    - Only understood by persons
    - Disallows automatic processing
- Semantic web [Berners-Lee et al., 2001]
  - Machine-oriented, semantic
    - Semantic tags assigned to information units
  - Prerequisites
    - Ontology
      - Concept hierarchy for semantic tagging
    - Inference engine
      - Ontology validation & implicit knowledge extraction

Introduction - ontology (1/2)
- Shared, formal, explicit conceptualization [Gruber '93, Borst '97]
  - For concepts and relationships between concepts
- Two dimensions
  - Schema level: [Paper] [hasAuthor] [Person]
  - Instance level: 'A Relational Model of …' [hasAuthor] 'Codd E. F.'
[Figure: ontology schema with classes Thing, Organization, Paper, Person and the literal type string, connected by subClassOf, isOwnedBy, hasTitle, isWith and hasAuthor; ontology instance with 'A Relational Model of …' hasAuthor 'Codd E. F.', each tied to its class by instanceOf]

Introduction - ontology (2/2)
- Relationship between concepts
  - Object property
    - Relationship between concepts
    - Similar to a relationship between entities in an RDB
  - Datatype property
    - Relationship between a concept and a literal
    - Similar to a relationship between an entity and an attribute in an RDB
[Figure: object properties (subClassOf, isOwnedBy, isWith, hasAuthor) connect Thing, Organization, Paper and Person; datatype properties connect Paper to literals: hasTotalPages (Integer), hasKorTitle and hasEngTitle (String), hasAcceptanceDate (Date)]

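The distinction between object and datatype properties can be illustrated with a small sketch. The property names come from the figure; the instance names and literal values are made up for illustration, and classifying a property by the kind of value it points to is a simplification of what an ontology language such as OWL actually declares.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Resource:
    """A node standing for a concept instance (hypothetical identifiers)."""
    name: str


# (subject, property, object) triples. All values are illustrative only.
triples = [
    (Resource("paper1"), "hasAuthor", Resource("person1")),
    (Resource("person1"), "isWith", Resource("org1")),
    (Resource("paper1"), "hasEngTitle", "An example title"),
    (Resource("paper1"), "hasTotalPages", 12),
    (Resource("paper1"), "hasAcceptanceDate", date(2006, 10, 25)),
]


def property_kind(obj) -> str:
    """Object property if the value is another resource, else datatype property."""
    return "object" if isinstance(obj, Resource) else "datatype"


kinds = {prop: property_kind(obj) for _, prop, obj in triples}
for prop, kind in kinds.items():
    print(f"{prop}: {kind} property")
```

In RDB terms, the first two triples behave like foreign-key relationships between entities, and the last three like attribute columns of one entity.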
Previous works - ontology development methodology
- Focused on schema modeling
  - Uschold and King's method (1995), Grüninger and Fox's method (1995), KACTUS-based method (1996), SENSUS-based method (1997), METHONTOLOGY (1999), On-To-Knowledge method (2001)
- Need for instance modeling
  - Identification system
    - (e.g.) SSN for persons, DOI for contents
  - Identity resolution
    - Synonymy problem
      - (e.g.) John R. Smith vs. John Richard Smith
    - Homonymy problem
      - (e.g.) John R. Smith at Harvard Univ. vs. John R. Smith at MIT

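The two identity-resolution problems can be made concrete with a rough sketch. It assumes a crude name key (surname plus initials) for synonymy and uses affiliation as the only disambiguating context for homonymy; both choices are illustrative assumptions, not the method used in the slides.

```python
from collections import defaultdict


def name_key(full_name: str) -> tuple:
    """Reduce a name to (surname, initials of given names), ignoring case and dots."""
    parts = full_name.replace(".", " ").lower().split()
    surname, given = parts[-1], parts[:-1]
    return (surname, *(p[0] for p in given))


def resolve(mentions):
    """Group author mentions that share a name key AND an affiliation."""
    groups = defaultdict(list)
    for name, affiliation in mentions:
        groups[(name_key(name), affiliation)].append(name)
    return dict(groups)


mentions = [
    ("John R. Smith", "Harvard Univ."),
    ("John Richard Smith", "Harvard Univ."),  # synonym of the first mention
    ("John R. Smith", "MIT"),                 # homonym: same name, other person
]

for (key, affiliation), names in resolve(mentions).items():
    print(key, affiliation, names)
```

A real system would need stronger evidence than affiliation (people change organizations), which is one reason a persistent identifier per person is attractive.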
Previous works – ontology for research community
- Research area ontologies
  - KA2 ontology [Benjamins, 1999]
    - Models the knowledge acquisition community (researchers, topics, etc.)
    - The first ontology for the research area
  - SWRC (Semantic Web for Research Communities) ontology [Sure et al., 2005]
    - KA2-based
    - Applied to creating social networks of researchers
  - AKT reference ontology [http://www.aktors.org/publications/ontology/]
    - AKT project, UK (Oct. 2000 ~)
    - Inferring top-level researchers / organizations / researcher clusters
- Summary
  - All include Person, Organization, Project, Publication
  - None addresses identity resolution for instance modeling (except the AKT case)
    - (e.g.) ambiguity of same-name authors

OFK ontology - overview
- Motivation
  - Support researchers over the full life-cycle of research activity
- OFK: OntoFrame-K® (ver. 2006)
  - Ontology framework for Knowledge/Korean/KISTI
- Design principles
  - Schema-level
    - Language-independent
    - Scenario-oriented
      - Do not include elements that are unnecessary from the viewpoint of the application
    - Ockham's razor
      - Do not represent properties derivable from rules
  - Instance-level
    - Separate management of instances
      - Through URI-server
        - Instance storing
        - Instance identity management
        - Integrity check

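The Ockham's razor principle (do not store what a rule can derive) can be sketched as follows. The co-creator rule and all identifiers are hypothetical, chosen only to show a derived object property; the slides do not list the actual OFK rules.

```python
from itertools import permutations

# Stored facts only (illustrative identifiers).
has_creators_information = {"paper1": ["ci1", "ci2"]}
has_creator = {"ci1": "personA", "ci2": "personB"}


def derive_co_creators() -> set:
    """Hypothetical rule: two persons are co-creators if they created the same outcome.

    The relation is computed on demand instead of being stored as a property.
    """
    derived = set()
    for infos in has_creators_information.values():
        creators = [has_creator[ci] for ci in infos]
        derived.update(permutations(creators, 2))
    return derived


print(sorted(derive_co_creators()))
```

Keeping such relations out of the stored schema means there is nothing to update or invalidate when the underlying facts change.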
OFK ontology – schema (1/2)
- Core classes
  - Person, Organization, Project, Outcomes (Paper, Patent, Report), Publication (Journal, Proceedings), Topic, CreatorsInformation, Location
- Object properties
  - Outcomes – hasCreatorsInformation – CreatorsInformation
  - Outcomes – hasOriginatedProject – Project
  - Outcomes – hasPublication – Publication
  - Outcomes – hasTopic – Topic
  - CreatorsInformation – hasCreator – Person
  - CreatorsInformation – hasOrganizationOfCreator – Organization
  - Project – hasOrganizationOfFundingProject – Organization
  - Project – hasOrganizationOfPerformingProject – Organization
  - Organization – hasLocation – Location
  - Person – hasOrganizationOfPerson – Organization

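The object properties above read naturally as (domain, property, range) entries. The sketch below encodes them as a lookup table and checks whether a statement fits the schema. Treating the parenthesized classes (Paper, Patent, Report; Journal, Proceedings) as subclasses is an assumption drawn from the slide's notation, and the helper names are hypothetical.

```python
# property -> (domain class, range class), as listed on the slide
OBJECT_PROPERTIES = {
    "hasCreatorsInformation": ("Outcomes", "CreatorsInformation"),
    "hasOriginatedProject": ("Outcomes", "Project"),
    "hasPublication": ("Outcomes", "Publication"),
    "hasTopic": ("Outcomes", "Topic"),
    "hasCreator": ("CreatorsInformation", "Person"),
    "hasOrganizationOfCreator": ("CreatorsInformation", "Organization"),
    "hasOrganizationOfFundingProject": ("Project", "Organization"),
    "hasOrganizationOfPerformingProject": ("Project", "Organization"),
    "hasLocation": ("Organization", "Location"),
    "hasOrganizationOfPerson": ("Person", "Organization"),
}

# Assumed subclass links, read off the parentheses in the class list.
SUBCLASS_OF = {
    "Paper": "Outcomes", "Patent": "Outcomes", "Report": "Outcomes",
    "Journal": "Publication", "Proceedings": "Publication",
}


def is_a(cls, ancestor) -> bool:
    """True if cls equals ancestor or reaches it by following subclass links."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def conforms(subject_class, prop, object_class) -> bool:
    """Check a statement's classes against the property's domain and range."""
    if prop not in OBJECT_PROPERTIES:
        return False
    domain, range_ = OBJECT_PROPERTIES[prop]
    return is_a(subject_class, domain) and is_a(object_class, range_)


print(conforms("Paper", "hasPublication", "Proceedings"))
print(conforms("Person", "hasTopic", "Topic"))
```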
OFK ontology – schema (2/2)
- CreatorsInformation
  - Info. of a creator at the time when his/her outcome was written
    - Order of creator
    - Person corresponding to a creator
    - Organization of a creator
      - Different from organization of person
[Figure: Paper hasCreatorsInformation CreatorsInformation; CreatorsInformation has orderOfCreator (Integer), hasCreator (Person) and hasOrganizationOfCreator (Organization); Person hasOrganizationOfPerson Organization]

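One way to picture CreatorsInformation is as an intermediate record between an outcome and a person, so that the affiliation at writing time is kept apart from the person's own affiliation. The sketch below uses Python dataclasses; the field names mirror the slide's properties, and every value is a made-up placeholder.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Person:
    uri: str
    organization: str  # hasOrganizationOfPerson


@dataclass(frozen=True)
class CreatorsInformation:
    order_of_creator: int         # orderOfCreator
    creator: Person               # hasCreator
    organization_of_creator: str  # hasOrganizationOfCreator (at writing time)


@dataclass(frozen=True)
class Paper:
    title: str
    creators_information: Tuple[CreatorsInformation, ...]  # hasCreatorsInformation


# Hypothetical data: the author has changed organization since the paper was written.
author = Person(uri="PER_example", organization="ORG_current")
paper = Paper(
    title="Example paper",
    creators_information=(CreatorsInformation(1, author, "ORG_at_writing_time"),),
)


def affiliations_at_writing(p: Paper):
    """Return (person URI, organization at writing time) pairs in creator order."""
    ordered = sorted(p.creators_information, key=lambda ci: ci.order_of_creator)
    return [(ci.creator.uri, ci.organization_of_creator) for ci in ordered]


print(affiliations_at_writing(paper))
print(author.organization)
```

Without the intermediate record, updating a person's organization would silently rewrite the affiliation shown on every earlier paper.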
OFK ontology – instance
- Identification system
  - Outcomes
    - KOI (Knowledge Object Identifier)
      - Proceedings paper: 'KISTI1.PCD.0001234'
      - Journal paper: 'KISTI1.JNL.0000123'
      - Patent: 'KISTI1.PTN.0000012'
      - Report: 'KISTI1.RPT.0012345'
  - Person
    - National Science & Tech. Personnel Identification system
      - 10-digit unique number
      - (e.g.) Clinton: '7010862430'
  - Organization
    - Organization code compiled by Korea Research Foundation
      - 6-character alphanumeric code
      - (e.g.) Seoul National Univ.: '114800'
      - (e.g.) Korea Institute of Science and Tech. Info.: '9R9048'

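A format check for these identifiers might look like the sketch below. The regular expressions are inferred from the examples on the slide alone (a fixed 'KISTI1' prefix, four type codes, seven digits; ten digits; six upper-case alphanumerics), so they are assumptions rather than the official specifications.

```python
import re

# Patterns guessed from the slide's examples; the real rules may be broader.
PATTERNS = {
    "Outcomes": re.compile(r"KISTI1\.(PCD|JNL|PTN|RPT)\.\d{7}"),
    "Person": re.compile(r"\d{10}"),
    "Organization": re.compile(r"[0-9A-Z]{6}"),
}


def is_well_formed(kind: str, identifier: str) -> bool:
    """True if the whole identifier matches the assumed pattern for its kind."""
    return PATTERNS[kind].fullmatch(identifier) is not None


for kind, identifier in [
    ("Outcomes", "KISTI1.PCD.0001234"),
    ("Person", "7010862430"),
    ("Organization", "9R9048"),
    ("Person", "12345"),  # too short
]:
    print(kind, identifier, is_well_formed(kind, identifier))
```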
OFK ontology – instance management
Steps shown in the figure
- Documents
- Extraction of bibliographic information
- Format check, field constraint check
- URI assignment
- Register to URI-server (referential integrity check, duplicate check)
- Generation of ontology instance

Example record
  Title: A Storage Structure for Nested Relations Using Signatures
  Author: Hwan-Seung Yong, Sukho Lee
  Publication: Journal of Systems Architecture
  Volume(Issue)/Page: 43(5) / 245-250
  Year: 1997
URI assignment (examples): Sukho Lee → PER_4410022529; Jnl. of Sys. Arch. → PUB_SOJ000574; …

URI Server Data
  URI                      Type          Metadata
  PER_4410022529           Person        Name: Sukho Lee, Organization: ORG_114800
  PER_0000012345           Person        Name: Hwan-Seung Yong, Organization: ORG_133600
  OBJ_KISTI1.JNL.0000001   Outcomes      Title: A Storage Structure for Nested Relations Using Signatures, Publication: PUB_SOJ000574, Year: 1997, Volume(Issue): 43(5), Page: 245-250
  ORG_114800               Organization  Name: Seoul National University
  ORG_9R9048               Organization  Name: Ewha Womans University
  PUB_SOJ000574            Publication   Name: Journal of Systems Architecture
  TOP_030213               Topic         Name: database

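The registration step can be sketched as a tiny in-memory store. This is a simplified assumption about how a URI server might behave: the duplicate check here only compares URIs (a real one would presumably also compare metadata), and the class and method names are hypothetical.

```python
class UriServer:
    """Minimal in-memory stand-in for a URI server (illustrative only)."""

    def __init__(self):
        self.records = {}  # uri -> (type, metadata)

    def register(self, uri, type_, metadata, refs=()):
        # Duplicate check: refuse a URI that is already registered.
        if uri in self.records:
            raise ValueError(f"duplicate URI: {uri}")
        # Referential integrity check: every referenced URI must already exist.
        missing = [r for r in refs if r not in self.records]
        if missing:
            raise ValueError(f"unresolved references: {missing}")
        self.records[uri] = (type_, metadata)


server = UriServer()
server.register("ORG_114800", "Organization", {"Name": "Seoul National University"})
server.register(
    "PER_4410022529", "Person",
    {"Name": "Sukho Lee", "Organization": "ORG_114800"},
    refs=["ORG_114800"],
)

try:
    # ORG_133600 has not been registered yet, so this is rejected.
    server.register(
        "PER_0000012345", "Person",
        {"Name": "Hwan-Seung Yong", "Organization": "ORG_133600"},
        refs=["ORG_133600"],
    )
except ValueError as err:
    print("rejected:", err)
```

Registering referenced instances first (organizations before persons, persons and publications before outcomes) keeps the store free of dangling links.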
OFK ontology – instance representation
URI Server Data: the same records as listed on the instance management slide.

Example record
  Title: A Storage Structure for Nested Relations Using Signatures
  Author: Hwan-Seung Yong, Sukho Lee
  Publication: Journal of Systems Architecture
  Volume(Issue)/Page: 43(5) / 245-250
  Year: 1997

[Figure: instance graph. OUT_KISTI1.JNL.0000001 (Title: A Storage Structure for Nested Relations Using Signatures) is linked by hasPublication to PUB_SOJ000574, by hasTopic to TOP_030213, and by hasCreatorsInformation to CreatorsInformation_3477 and CreatorsInformation_3478, which carry orderOfCreator 1st and 2nd. The CreatorsInformation nodes link by hasCreator to PER_4410022529 and PER_0000012345 and by hasOrganizationOfCreator to organizations; the persons link by hasOrganizationOfPerson to ORG_114800 and ORG_133600.]

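The graph in the figure amounts to a set of triples generated from one record. The sketch below shows one possible generation routine. It assumes that creator order follows the Author field, that each creator's organization at writing time equals the organization registered for that person, and it invents its own naming for CreatorsInformation nodes; none of these choices is confirmed by the slides.

```python
def to_triples(outcome, publication, topic, creators):
    """Turn one bibliographic record into (subject, property, object) triples.

    creators: list of (person URI, organization URI) in author order.
    """
    triples = [
        (outcome, "hasPublication", publication),
        (outcome, "hasTopic", topic),
    ]
    for order, (person, organization) in enumerate(creators, start=1):
        node = f"{outcome}/CreatorsInformation_{order}"  # hypothetical node naming
        triples += [
            (outcome, "hasCreatorsInformation", node),
            (node, "orderOfCreator", order),
            (node, "hasCreator", person),
            (node, "hasOrganizationOfCreator", organization),
        ]
    return triples


triples = to_triples(
    outcome="OUT_KISTI1.JNL.0000001",
    publication="PUB_SOJ000574",
    topic="TOP_030213",
    creators=[
        ("PER_0000012345", "ORG_133600"),  # Hwan-Seung Yong, listed first
        ("PER_4410022529", "ORG_114800"),  # Sukho Lee, listed second
    ],
)
for t in triples:
    print(t)
```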
Construction of OFK ontology - schema
- Statistics
  - # of classes: 21
  - # of properties: 64
    - # of datatype properties: 46
    - # of object properties: 18
  - # of rules: 22
    - # of object properties derived by rules: 14
- Ontology creation
  - Ontology editing tool
    - Protégé 3.1.1
  - Ontology description language
    - W3C OWL DL (http://www.w3c.org/)

Construction of OFK ontology - instance
- Target bibliographic data
  - Proceedings of major conferences/workshops/symposiums
    - Held in Korea from 2002 through 2006
    - # of papers: 12,016
- # of RDF (Resource Description Framework) triples

  Type                                      # of RDF triples
  Class instance (105,479)
    Person                                  11,390
    Outcomes (Paper)                        12,016
    Organization                            12,586
    Publication (Proceedings)               449
    Topics (thesaurus-based)                31,719
    Location                                28,741
    Others (Project, Department, etc.)      8,578
  Instance relationship                     1,553,575
  Total                                     1,659,054

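The subtotal and total in the table can be re-derived from its rows with a few lines of arithmetic. The variable names are arbitrary; the figures are copied from the table.

```python
class_instance_triples = {
    "Person": 11_390,
    "Outcomes (Paper)": 12_016,
    "Organization": 12_586,
    "Publication (Proceedings)": 449,
    "Topics (thesaurus-based)": 31_719,
    "Location": 28_741,
    "Others (Project, Department, etc.)": 8_578,
}
instance_relationship_triples = 1_553_575

class_total = sum(class_instance_triples.values())
grand_total = class_total + instance_relationship_triples

print(f"class instance triples: {class_total:,}")  # 105,479
print(f"all triples: {grand_total:,}")              # 1,659,054
```

Instance-relationship triples account for the large majority of the total, roughly 94%.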
Conclusion
- Sharing of ontology construction experience
  - OFK ontology for research area
- Need for ontology instance management
  - Proposal of URI-server as a separate instance store
    - Instance storing
    - Instance identity management
    - Integrity check
      - Data integrity
      - Referential integrity
      - Duplicate detection

Thank you!
dbaisk@kisti.re.kr
swlee@kisti.re.kr

Two Sides of Semantic Web

                                 Open domain              Closed domain
  Assumption                     Open-world assumption    Closed-world assumption
  Type of data                   General Web contents     Legacy or security data
  Control of instance integrity  Don't care               Highly needed

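The difference between the two assumptions shows up when a query asks about a statement that is not stored. In the sketch below, which uses made-up facts and hypothetical function names, the closed-world query treats a missing statement as false, while the open-world query treats it as unknown.

```python
from typing import Optional

# Known statements (illustrative identifiers).
facts = {("paper1", "hasAuthor", "personA")}


def ask_closed_world(triple) -> bool:
    """Closed world: anything not stated is taken to be false."""
    return triple in facts


def ask_open_world(triple) -> Optional[bool]:
    """Open world: anything not stated is unknown (None), not false."""
    return True if triple in facts else None


query = ("paper1", "hasAuthor", "personB")
print(ask_closed_world(query))
print(ask_open_world(query))
```

Integrity checks such as "every paper must have a registered author" only make sense under the closed-world reading, which matches the table's "Highly needed" entry for closed domains.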