A Framework for Ontology-Based Knowledge Management System
Jiangning Wu
Dalian University of Technology, China

Introduction
Research Center of Knowledge Science & Technology, DUT
• Introduction
  – Background
  – Problems
  – Solution
  – Focus
  – Contributions

Background
• The goal of a general knowledge management system (KMS) is to provide the right knowledge to the right people at the right time and in the right format.
• Through KMSs, users can access and use rich sources of data, information and knowledge stored in different forms.

Problems
• Traditional KMSs are built on existing data repositories and users’ needs.
• To discover knowledge, users submit queries to the system and receive results by keyword matching.
• But keyword-based systems cannot understand the meaning of data. They are inflexible and stifle knowledge creation.

Solution
• The emerging ontology-based KMSs can find the content-oriented knowledge that people really want.
• A domain ontology is a powerful means of knowledge representation and the associated inference.

Focus
• We focus on one activity: matching projects with domain experts.
• In project management, it is not easy to choose an appropriate domain expert for a given project if the experts’ research areas and the contents of the projects are not well understood.

Contributions
• We describe experts’ research areas and the contents of projects with separate ontologies, both based on the same standard subject category of China.
• The matching problem is thus transformed into calculating the semantic similarity between ontologies.

Contributions
• To calculate the similarity between documents, we propose an integrated method that combines the node-based method and the edge-based method.

Ontology in KR
• Ontology in Knowledge Representation
  – Ontology in General
  – T. R. Gruber
  – Why Ontology
  – Our Ontology

Ontology
• Research on knowledge representation has been a focus of the AI and IS disciplines for a number of years.
• Much contemporary research extends seminal work within the AI discipline, and ontology research has been one of the beneficiaries.

Ontology
• Research in computational ontology has traditionally sought to develop structure for the purpose of knowledge subsumption.
• Such research aims to develop generic, reusable representations of domain ontologies.

T. R. Gruber
• T. R. Gruber's definition: an ontology is an explicit specification of a conceptualization. The term is borrowed from philosophy, where an ontology is a systematic account of existence.
• For knowledge-based systems, what “exists” is exactly that which can be represented.

Ontology
• An ontology, in short, is an explicit description of a domain:
  – concepts
  – properties and attributes of concepts
  – constraints on properties and attributes
  – individuals (often, but not always)
• An ontology defines
  – a common vocabulary
  – a shared understanding

Why Ontology
• To share common understanding of the structure of information
  – among people
  – among software agents
• To enable reuse of domain knowledge
  – to avoid “re-inventing the wheel”
  – to introduce standards to allow interoperability

Why Ontology
• To make domain assumptions explicit
  – easier to change domain assumptions (consider a genetics knowledge base)
  – easier to understand and update legacy data
• To separate domain knowledge from operational knowledge
  – re-use domain and operational knowledge separately (e.g., configuration based on constraints)

Our Ontology
• The ontology is a collection of concepts and their relationships, and serves as a conceptualized vocabulary to describe an application domain.
• In our study, it is created with Protégé, which is developed at Stanford University.

Our Ontology
• The initial concepts in our ontology are broadly extracted from the standard subject category of China.
• To make the selected concepts better suited to the projects and domain experts concerned, we developed a tool called Concept Filler, a simple interface that helps domain experts assign proper concepts and weights manually.

Interface
[Figure: interface screenshot]

Our Ontology
• When a concept is specified, it is also assigned a weight between 0 and 1 that indicates its importance.
• The relationships in an ontology are explicitly named, so they can reflect the context of the domain knowledge.

Relationships
• Many types of relationship occur in ontology construction, such as the IS-A, Kind-of, Part-of and Substance-of relations.
• Since the IS-A (hyponym/hypernym) relation is the most common one in ontology presentation, we use only this relation in our research, for simplicity.

Our Ontology
[Figure: Procedures in the Development of the Chinese Ontology]

Matching Method
• Matching Method
  – Node-based Method
  – Edge-based Method
  – Shortcomings
  – Integrated Method

Considerations
• Calculating the similarity between concepts on the basis of complex relationships is challenging.
• So far, no method handles this problem effectively.
• Because several similarity measures have been developed for the simplest relation, IS-A, only this relation is retained in our study.

Node-based Method
• Resnik used information content to measure similarity.
• His point is that the more information content two concepts share, the more similar they are.

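Node-based Method
• The similarity of two concepts c1 and c2 is sim(c1, c2) = max_{c ∈ S(c1, c2)} [−log p(c)], where S(c1, c2) is the set of concepts that subsume both c1 and c2 and p(c) is the probability of encountering an instance of concept c.
• Because a word may have more than one sense, the formula is modified as sim(w1, w2) = max_{c1 ∈ sen(w1), c2 ∈ sen(w2)} sim(c1, c2), where sen(w) is the set of possible senses of word w.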
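
A minimal sketch of the node-based idea, assuming Resnik's measure in its usual form, sim(c1, c2) = max over common subsumers c of −log p(c). The taxonomy, the concept names and the counts below are hypothetical and do not come from the slides; p(c) is estimated here as the share of occurrences that fall on c or any of its descendants.

```python
import math

# Hypothetical IS-A taxonomy: child -> parent (the root has parent None).
PARENT = {
    "science": None,
    "engineering": "science",
    "computer science": "engineering",
    "artificial intelligence": "computer science",
    "databases": "computer science",
    "mechanics": "engineering",
}

# Hypothetical occurrence counts of each concept in some corpus.
COUNT = {
    "science": 2,
    "engineering": 3,
    "computer science": 5,
    "artificial intelligence": 6,
    "databases": 4,
    "mechanics": 4,
}


def ancestors(concept):
    """Return the concept followed by all of its IS-A ancestors, nearest first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def probabilities():
    """Estimate p(c) as the share of occurrences on c or any descendant of c."""
    total = sum(COUNT.values())
    cumulative = {c: 0 for c in PARENT}
    for concept, n in COUNT.items():
        for a in ancestors(concept):
            cumulative[a] += n
    return {c: cumulative[c] / total for c in PARENT}


def resnik(c1, c2):
    """Information content of the most informative common subsumer."""
    p = probabilities()
    shared = set(ancestors(c1)) & set(ancestors(c2))
    return max(-math.log(p[c]) for c in shared)


print(resnik("artificial intelligence", "databases"))
print(resnik("artificial intelligence", "mechanics"))
```

In this toy tree, a parent and its child ("computer science", "artificial intelligence") receive exactly the same score as two siblings under that parent, because only the shared subsumer matters and path length plays no role.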

Edge-based Method
• Leacock and Chodorow took the shortest path length between two concepts and converted this distance into a similarity measure.
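
A minimal sketch of the edge-based idea, assuming the Leacock-Chodorow measure in its usual form, sim(c1, c2) = −log(len(c1, c2) / (2·D)), where len is the shortest path length counted in nodes and D is the maximum depth of the taxonomy. The taxonomy and concept names are hypothetical; counting the path in nodes rather than edges is an assumption made so that identical concepts do not produce log(0).

```python
import math

# Hypothetical IS-A taxonomy: child -> parent (the root has parent None).
PARENT = {
    "science": None,
    "engineering": "science",
    "computer science": "engineering",
    "artificial intelligence": "computer science",
    "databases": "computer science",
    "mechanics": "engineering",
}


def ancestors(concept):
    """Return the concept followed by all of its IS-A ancestors, nearest first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def path_length(c1, c2):
    """Number of nodes on the shortest IS-A path between c1 and c2."""
    up1, up2 = ancestors(c1), ancestors(c2)
    for steps_up, a in enumerate(up1):
        if a in up2:
            # Edges up from c1 plus edges up from c2, plus one to count nodes.
            return steps_up + up2.index(a) + 1
    raise ValueError("concepts are not in the same tree")


def leacock_chodorow(c1, c2):
    """Shortest path length scaled by taxonomy depth, on a negative log scale."""
    max_depth = max(len(ancestors(c)) for c in PARENT)
    return -math.log(path_length(c1, c2) / (2 * max_depth))


print(leacock_chodorow("artificial intelligence", "databases"))
print(leacock_chodorow("artificial intelligence", "mechanics"))
```

Two children of the same parent get the same similarity to that parent here, whatever weights they might carry, since the measure looks only at path length.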

Shortcomings
• Both the node-based and the edge-based method consider only two concepts in the same concept tree; neither extends to two lists of concepts in different concept trees.
• In practice, however, describing different documents in the same domain with ontology structures often yields concept trees that are homogeneous but heteromorphic.

Shortcomings
• The matching problem to be solved here is calculating the similarity between two different concept trees, not between two concepts in the same tree.
• So we have to develop a new method that can calculate the similarity between two lists of concepts in different trees, so that the resulting similarity value shows how similar the documents are.

Shortcomings
• The node-based method does not take the distance between concepts into account.
• In the four-level concept tree, if concepts C21, C31 and C36 have the same sense and equal frequency, the node-based method gives sim(C21, C31) = sim(C21, C36).

Shortcomings
• However, C21 and C31 are clearly more similar, since C31 inherits directly from C21.

Shortcomings
• In contrast to the node-based method, the edge-based method considers only the relationships between concepts and ignores the weights of concepts.
• Concepts C31 and C32 are each one edge away from C21, so the edge-based method gives them the same similarity to C21.

Shortcomings
• But if C31 has a greater weight than C32, C31 is considered more important, and its similarity to C21 should be greater.

Integrated Method
• Before the proposed method is applied, the documents related to projects and domain experts must first be formalized, which yields two vectors of concepts with their frequencies.
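
A minimal sketch of what such a formalization step could look like, assuming plain-text documents and simple phrase matching. The concept list, the sample texts and the `formalize` helper are hypothetical; the slides do not say how concepts are detected in a document.

```python
import re
from collections import Counter

# Hypothetical concepts taken from the ontology.
CONCEPTS = ["knowledge management", "ontology", "data mining"]


def formalize(text, concepts=CONCEPTS):
    """Map a document to a {concept: frequency} vector over the given concepts."""
    lowered = text.lower()
    vector = Counter()
    for concept in concepts:
        # Whole-phrase, case-insensitive match of the concept label.
        pattern = r"\b" + re.escape(concept) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            vector[concept] = hits
    return vector


project = formalize(
    "An ontology for knowledge management; the ontology is built in Protege."
)
expert = formalize("Research on data mining and ontology engineering.")
print(project)
print(expert)
```

The two resulting vectors, one for the project document and one for the expert document, are the kind of input the similarity calculation would then compare.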

Integrated Method
• The similarity between c_is and c_jt
• The modified similarity

Integrated Method
• The similarity between two documents

Framework
• Ontology building
• Document formalization
• Similarity calculation
• User interface

Evaluation
• Two measures are used to evaluate our ontology-based KMS

Evaluation
• Precision

Evaluation
• Recall
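
A minimal sketch of the two measures, assuming their standard information-retrieval definitions: precision is the share of returned experts that are relevant, and recall is the share of relevant experts that are returned. The expert identifiers and the `precision_recall` helper are hypothetical; the slides do not show the exact formulas used in the experiments.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of returned items against a ground truth."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical outcome for one project: experts the system returned
# versus experts judged suitable by a human reviewer.
returned_experts = ["e1", "e2", "e3", "e4"]
suitable_experts = ["e2", "e4", "e5"]
p, r = precision_recall(returned_experts, suitable_experts)
print(p, r)
```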

Conclusions
• An ontology-based method to match projects and domain experts is presented.
• The prototype system we developed contains four modules: ontology building, document formalization, similarity calculation and user interface.

Conclusions
• We discuss node-based and edge-based approaches to computing semantic similarity, and propose an integrated approach to calculating the semantic similarity between two documents.
• The experimental results show that our ontology-based KMS, applied to matching projects with domain experts, achieves better recall and precision.

Future Work
• As mentioned previously, only the simplest relation, IS-A, is considered in our study.
• For a more complex ontology whose concepts are constrained by logic or axioms, our method is not powerful enough to capture the real semantic meaning, because it considers only the hierarchical structure.

Future Work
• Future work will therefore focus on the other kinds of relations used in ontology construction.
• Computing semantic similarity over various relations will be exciting and challenging work.

THANKS