
CPE/CSC 580: Knowledge Management
Dr. Franz J. Kurfess
Computer Science Department, Cal Poly
© 2001-2005 Franz J. Kurfess
Knowledge Organization

Course Overview
  • Introduction
  • Knowledge Processing
      - Knowledge Acquisition, Representation and Manipulation
  • Knowledge Organization
      - Classification, Categorization
      - Ontologies, Taxonomies, Thesauri
  • Knowledge Retrieval
      - Information Retrieval
      - Knowledge Navigation
  • Knowledge Presentation
      - Knowledge Visualization
  • Knowledge Exchange
      - Knowledge Capture, Transfer, and Distribution
  • Usage of Knowledge
      - Access Patterns, User Feedback
  • Knowledge Management Techniques
      - Topic Maps, Agents
  • Knowledge Management Tools
  • Knowledge Management in Organizations

Overview: Knowledge Organization
  • Motivation
  • Objectives
  • Chapter Introduction
      - Review of relevant concepts
      - Overview of new topics
      - Terminology
  • Identification of Knowledge
      - Object Selection
      - Naming and Description
  • Categorization
      - Feature-based Categorization
      - Hierarchical Categorization
  • Knowledge Organization Frameworks
      - Dublin Core
      - Resource Description Framework
      - Topic Maps
  • Case Studies
      - Northern Light
      - EPA TRS
      - Getty Vocabularies
  • Important Concepts and Terms
  • Chapter Summary

Logistics
  • Introductions
  • Course Materials
      - handouts
      - Web page
      - readings
      - lecture notes
  • Term Project
      - description
      - deliverables
      - e-group account
      - roles in teams
  • Homework Assignments
      - description assignment 1

Bridge-In
  • How do you organize your knowledge?
      - brain
      - paper
      - computer

Motivation
  • effective utilization of knowledge depends critically on its organization
      - quick access
      - identification of relevant knowledge
      - assessment of available knowledge
          - source, reliability, applicability
  • knowledge organization is a difficult task and requires complementary skills
      - expertise in the domain
      - knowledge organization skills
          - librarians

Objectives
  • be able to identify the main aspects dealing with the organization of knowledge
  • understand knowledge organization methods
  • apply the capabilities of computers to support knowledge organization
  • practice knowledge organization on small bodies of knowledge
  • evaluate frameworks and systems for knowledge organization

Identification of Knowledge
  • Object Selection
  • Naming and Description

Object Selection
  • what constitutes a “knowledge object” that is relevant for a particular task or topic
      - physical object, document, concept
  • how can this object be made available in the system
  • example: library
      - is it worthwhile to add an object to the library’s collection
      - if so, how can it be integrated
          - physical document: book, magazine, report, etc.
          - digital document: file, data base, Web page, etc.

Naming and Description
  • names serve two important roles
      - identification
          - ideally, a unique descriptor that allows the unambiguous selection of the object
          - often an ambiguous descriptor that requires context information
      - location
          - especially in digital systems, names are used as the “address” of an object
  • names, descriptions and relationships to related objects are specified in listings
      - dictionary, glossary, thesaurus, ontology, index

Naming and Description Devices
  • type
      - dictionary
      - glossary
      - thesaurus
      - ontology
      - index
  • issues
      - arrangement of terms
          - alphabetical, hierarchical
      - purpose
          - explanation, unique identifier, clarification of relationships to other terms, access to further information

Dictionary
  • list of words together with a short explanation of their meanings, or their translations into another language
  • helpful for the identification of knowledge objects, and their distinction from related ones
  • each entry in a dictionary may be considered an atomic knowledge object, with the word as name and “entry point”
  • may provide cross-references to related knowledge objects
  • straightforward implementation in digital systems, and easy to integrate into knowledge management systems

Glossary
  • list of words, expressions, or technical terms with an explanation of their meanings
  • usually restricted to a particular book, document, activity, or topic
  • provides a clarification of the intended meaning for knowledge objects
  • otherwise similar to a dictionary

Thesaurus
  • collection of synonyms (word sets with identical or similar meanings)
  • frequently includes words that are related in some other way, e.g. antonyms (opposite meanings), homonyms (same pronunciation or spelling)
  • identifies and clarifies relationships between words
      - not so much an explanation of their meanings
  • may be used to expand search queries in order to find relevant documents that may not contain a particular word

Thesaurus Types
  • knowledge-based
  • linguistic
  • statistical
[Liddy 2000]

Knowledge-based Thesaurus
  • manually constructed for a specific domain
  • intended for human indexers and searchers
  • contains
      - synonyms (“use for”, UF)
      - more general terms (“broader term”, BT)
      - more specific terms (“narrower term”, NT)
      - otherwise associated words (“related term”, RT)
  • example: “data base management systems”
      - UF data bases
      - BT file organization, management information systems
      - NT relational databases
      - RT data base theory, decision support systems
[Liddy 2000]
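
The UF/BT/NT/RT record above maps naturally onto a small lookup structure. The following is a minimal sketch, not a standard thesaurus format: the dictionary layout and the function name `related_terms` are assumptions made for illustration, and only the example entry itself comes from the slide.

```python
# Hypothetical in-memory layout for one thesaurus record; the keys follow
# the slide's abbreviations (UF, BT, NT, RT).
THESAURUS = {
    "data base management systems": {
        "UF": ["data bases"],
        "BT": ["file organization", "management information systems"],
        "NT": ["relational databases"],
        "RT": ["data base theory", "decision support systems"],
    }
}


def related_terms(term, relations=("UF", "NT")):
    """Return the terms linked to `term` through the given relation types."""
    entry = THESAURUS.get(term, {})
    found = []
    for relation in relations:
        found.extend(entry.get(relation, []))
    return found


print(related_terms("data base management systems"))
print(related_terms("data base management systems", relations=("BT",)))
```

Restricting the default lookup to UF and NT is a design choice of this sketch: a searcher who wants to narrow a query follows those links, while BT and RT links widen it.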

Linguistic Thesaurus
  • contains explicit concept hierarchies of several increasingly specified levels
  • words in a group are assumed to be (near-)synonymous
  • selection of the right sense for terms can be difficult
  • examples: Roget’s, WordNet
  • often used for query expansion
      - synonyms (similar terms)
      - hyponyms (more specific terms; subclass)
      - hypernyms (more general terms; super-class)
[Liddy 2000]

Example 1: Linguistic Thesaurus
[Figure: excerpt of a thesaurus hierarchy, top-down]
  • The World: Abstract Relations, Space, Physics, Matter, Sensation, Intellect, Volition, Affections
  • Sensation: Sensation in General, Touch, Taste, Smell, Sight, Hearing
  • Smell: Odor, Fragrance, Stench, Odorless
  • Fragrance: numbered sub-entries .1 to .9; one of them lists: Incense; joss stick; pastille; frankincense or olibanum; agallock or aloeswood; calambac
[Liddy 2000]

Example 2: Linguistic Thesaurus
[Liddy 2000]

Query Expansion in Search Engines
  • look up each word in WordNet
  • if the word is found, the synonyms from all of its Synsets are added to the query representation
  • weigh each added word as 0.8 rather than 1.0
  • results better than plain SMART
      - variable performance over queries
      - major cause of error: the use of ambiguous words’ Synsets
  • general thesauri such as Roget’s or WordNet have not been shown conclusively to improve results
      - may sacrifice precision to recall
      - not domain specific
      - not sense disambiguated
[Liddy 2000, Voorhees 1993]
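
The expansion procedure above can be sketched in a few lines. This is an illustration under stated assumptions, not the SMART or WordNet implementation: `TOY_SYNSETS` is an invented stand-in for WordNet, and `expand_query` is a hypothetical helper. Only the 0.8 versus 1.0 weighting is taken from the slide.

```python
# Toy stand-in for WordNet: word -> list of synsets (each a set of synonyms).
# The entries are invented for illustration only.
TOY_SYNSETS = {
    "car": [{"car", "auto", "automobile"}, {"car", "railcar"}],
    "fast": [{"fast", "quick", "rapid"}],
}

ORIGINAL_WEIGHT = 1.0
EXPANDED_WEIGHT = 0.8  # weight for added synonyms, as stated on the slide


def expand_query(words, synsets=TOY_SYNSETS):
    """Map each query term to a weight, adding synonyms from all synsets."""
    weights = {word: ORIGINAL_WEIGHT for word in words}
    for word in words:
        for synset in synsets.get(word, []):
            for synonym in synset:
                # Never overwrite the weight of an original query term.
                weights.setdefault(synonym, EXPANDED_WEIGHT)
    return weights


expanded = expand_query(["fast", "car", "rental"])
for term in sorted(expanded):
    print(term, expanded[term])
```

Because synonyms from all synsets are added without sense disambiguation, the query about a rental car also picks up "railcar", which is the kind of error the slide attributes to ambiguous words.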

Statistical Thesaurus
  • automatic thesaurus construction
      - classes of terms produced are not necessarily synonymous, nor broader, nor narrower
      - rather, words that tend to co-occur with the head term
      - effectiveness varies considerably depending on the technique used
[Liddy 2000]

Automatic Thesaurus Construction (Salton)
  • document collection based
      - based on index term similarities
      - compute vector similarities for each pair of documents
      - if sufficiently similar, create a thesaurus entry for each term which includes terms from the similar document
[Liddy 2000]
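
A minimal sketch of the pairwise procedure described above, under several assumptions that are not in the slides: raw term counts as document vectors, cosine as the similarity measure, whitespace tokenization, and an arbitrary threshold of 0.5. The function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_thesaurus(documents, threshold=0.5):
    """For each sufficiently similar document pair, link every term of one
    document to the terms of the other document."""
    vectors = [Counter(doc.split()) for doc in documents]
    thesaurus = {}
    for a, b in combinations(vectors, 2):
        if cosine(a, b) >= threshold:
            for source, target in ((a, b), (b, a)):
                for term in source:
                    thesaurus.setdefault(term, set()).update(set(target) - {term})
    return thesaurus


docs = [
    "heat transfer heat flow",
    "heat flow cooling",
    "magnetic flux hysteresis",
]
entries = build_thesaurus(docs)
print(sorted(entries["transfer"]))
```

The third document shares no terms with the other two, so its words receive no entries. The resulting classes group co-occurring words rather than true synonyms.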

Sample Automatic Thesaurus Entries
  408  dislocation junction minority-carrier point contact recombine transition
  409  blast-cooled heat-flow heat-transfer
  410  anneal strain
  411  coercive demagnetize flux-leakage hysteresis induct insensitive magnetoresistance square-loop threshold
  412  longitudinal transverse
[Liddy 2000]

Dynamic Automatic Thesaurus Construction
  • thesaurus short-cut
      - run at query time
      - take all terms in the query into consideration at once
      - look at frequent words and phrases in the top retrieved documents and add these to the query
  • = automatic relevance feedback
[Liddy 2000]
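
The query-time short-cut above can be sketched as follows. This is a simplified illustration: it assumes the documents have already been ranked by some retrieval step, counts single words only (no phrases, no stop-word removal), and the function and parameter names are hypothetical.

```python
from collections import Counter


def expand_with_feedback(query_terms, ranked_docs, top_k=2, n_new_terms=2):
    """Add the most frequent non-query words from the top-ranked documents."""
    counts = Counter()
    for doc in ranked_docs[:top_k]:
        counts.update(w for w in doc.lower().split() if w not in query_terms)
    new_terms = [term for term, _ in counts.most_common(n_new_terms)]
    return list(query_terms) + new_terms


# Hypothetical ranking returned for the query; the texts are invented.
ranked = [
    "immigration law amnesty program amnesty",
    "immigration amnesty sanctions employer sanctions",
    "football results",
]
print(expand_with_feedback(["immigration", "law"], ranked))
```

Only the top `top_k` documents contribute terms, so the low-ranked third document does not influence the expanded query.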

Expansion by Association Thesaurus
Query: Impact of the 1986 Immigration Law
Phrases retrieved by association in corpus:
  - illegal immigration
  - statutes
  - amnesty program
  - immigration reform law
  - editorial page article
  - naturalization service
  - civil fines
  - new immigration law
  - legal immigration
  - employer sanctions
  - applicability
  - seeking amnesty
  - legal status
  - immigration act
  - undocumented workers
  - guest worker
  - sweeping immigration law
  - undocumented aliens
[Liddy 2000]

Index
  • listing of words that appear in a (set of) documents, together with pointers to the locations where they appear
  • provides a reference to further information concerning a particular word or concept
  • constitutes the basis for computer-based search engines
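
A word-to-location listing of this kind is commonly implemented as an inverted index. The sketch below is a minimal version that assumes whitespace tokenization and lower-casing; the document ids and the helper name `build_index` are made up for the example.

```python
from collections import defaultdict


def build_index(documents):
    """Map each word to (document id, word position) pairs where it appears."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        for position, word in enumerate(text.lower().split()):
            index[word].append((doc_id, position))
    return index


docs = {
    "d1": "knowledge organization needs an index",
    "d2": "an index supports search",
}
index = build_index(docs)
print(index["index"])
```

Looking up a word returns every location where it occurs, which is the pointer structure a search engine consults instead of scanning the documents themselves.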

Indexing
  • the process of creating an index from a set of documents
  • one of the core issues in Information Retrieval
  • manual indexing
      - controlled vocabularies, humans go through the documents
  • semi-automatic
      - humans are in control, machines are used for some tasks
  • automatic
      - statistical indexing
      - natural-language based indexing

NLP-based Indexing
  • the computational process of identifying, selecting, and extracting useful information from massive volumes of textual data
      - for potential review by indexers
      - stand-alone representation of content
  • using Natural Language Processing
[Liddy 2000]

Natural Language Processing
  • a range of computational techniques for analyzing and representing naturally occurring texts
  • at one or more levels of linguistic analysis
  • for the purpose of achieving human-like language processing
  • for a range of tasks or applications
[Liddy 2000]

Levels of Language Understanding (top to bottom)
  • Pragmatic
  • Discourse
  • Semantic
  • Syntactic
  • Lexical
  • Morphological
[Liddy 2000]

What can NLP Indexing do?
  • phrase recognition
  • disambiguation
  • concept expansion
[Liddy 2000]

Ontology
  • examines the relationships between words, and the corresponding concepts and objects
  • in practice, it often combines aspects of thesaurus and dictionary
  • frequently uses a graph-based visual representation to indicate relationships between words
  • used to identify and specify a vocabulary for a particular subject or task

The Notion of Ontology
  • ontology: explicit specification of a shared conceptualization that holds in a particular context
  • captures a viewpoint on a domain:
      - taxonomies of species
      - physical, functional, & behavioral system descriptions
      - task perspective: instruction, planning
[Schreiber 2000]

Ontology Should Allow for “Representational Promiscuity”
[Figure: an ontology (parameter, constraint-expression) linked through mapping rules to two viewpoints, knowledge base A and knowledge base B]
  • one representation:
      cab.weight + safety.weight = car.weight
      cab.weight < 500
  • rewritten as:
      parameter(cab.weight)
      parameter(safety.weight)
      parameter(car.weight)
      constraint-expression(cab.weight + safety.weight = car.weight)
      constraint-expression(cab.weight < 500)
[Schreiber 2000]

Ontology Types
  • domain-oriented
      - domain-specific
          - medicine => cardiology => rhythm disorders
          - traffic light control system
      - domain generalizations
          - components, organs, documents
  • task-oriented
      - task-specific
          - configuration design, instruction, planning
      - task generalizations
          - problem solving, e.g. UPML
  • generic ontologies
      - “top-level categories”
      - units and dimensions
[Schreiber 2000]

Using Ontologies
  • ontologies needed for an application are typically a mix of several ontology types
      - technical manuals
          - device terminology: traffic light system
          - document structure and syntax
          - instructional categories
      - e-commerce
  • raises need for
      - modularization
      - integration
          - import/export
          - mapping
[Schreiber 2000]

Domain Standards and Vocabularies As Ontologies
  • example: Art and Architecture Thesaurus (AAT)
  • contains ontological information
      - AAT: structure of the hierarchy
  • structure needs to be “extracted”
      - not explicit
  • can be made available as an ontology
      - with help of some mapping formalism
  • lists of domain terms are sometimes also called “ontologies”
      - implies a weaker notion of ontology
      - scope typically much broader than a specific application domain
      - example: domain glossaries, WordNet
      - contain some meta information: hyponyms, synonyms, text
[Schreiber 2000]

Ontology Specification
  • many different languages
      - KIF
      - Ontolingua
      - Express
      - LOOM
      - UML
  • XML to the rescue: Web Ontology Language (OWL)
  • common basis
      - class (concept)
      - subclass with inheritance
      - relation (slot)
[Schreiber 2000]
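
The common basis named above (class, subclass with inheritance, relation or slot) can be illustrated without committing to any of the listed languages. The sketch below is plain Python, not KIF, LOOM, or OWL syntax. The `Concept` class and the example concepts and slots are hypothetical.

```python
class Concept:
    """A class (concept) with an optional parent and its own slots (relations)."""

    def __init__(self, name, parent=None, slots=None):
        self.name = name
        self.parent = parent
        self.own_slots = dict(slots or {})

    def slots(self):
        """Own slots plus everything inherited from ancestor concepts."""
        inherited = self.parent.slots() if self.parent else {}
        return {**inherited, **self.own_slots}

    def is_a(self, other):
        """True if this concept is `other` or one of its subclasses."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


# Invented example concepts: each slot maps a relation name to a value type.
art_object = Concept("art object", slots={"created_by": "agent"})
painting = Concept("painting", parent=art_object, slots={"painted_on": "support"})
print(painting.slots())
print(painting.is_a(art_object))
```

The subclass sees the parent's slot as well as its own, which is the inheritance behaviour that the listed specification languages share.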

Art & Architecture Thesaurus
  • used for indexing stolen art objects in European police databases
[Schreiber 2000]

AAT Ontology
[Figure: diagram relating the concepts description universe, object, object type, object class, description dimension, descriptor, value set, and value; edge labels include “instance of”, “class of”, “has feature”, “has descriptor”, “in dimension”, and “class constraint”, with 1+ cardinalities]
[Schreiber 2000]

Document Fragment Ontologies: Instructional
[Schreiber 2000]

Domain Ontology of a Traffic Light Control System
[Schreiber 2000]

Two Ontologies of Document Fragments
[Schreiber 2000]

Ontology for E-commerce
[Schreiber 2000]

Top-level Categories: Many Different Proposals
Chandrasekaran et al. (1999)
[Schreiber 2000]

A Few Observations about Ontologies
  • Simple ontologies can be built by non-experts
      - Consider Verity’s Topic Editor, Collaborative Topic Builder, GFP interface, Chimaera, etc.
  • Ontologies can be semi-automatically generated
      - from crawls of sites such as Yahoo!, Amazon, Excite, etc.
      - Semi-structured sites can provide starting points
  • Ontologies are exploding (business pull instead of technology push)
      - most e-commerce sites are using them: Google, mySimon, Affinia, Amazon, Yahoo! Shopping, etc.
      - Controlled vocabularies (for the web) abound: SIC codes, UMLS, UN/SPSC, Open Directory, RosettaNet, …
      - DTDs, schemata are making more ontology information available
      - Business ontologies are including roles
      - Businesses have ontology directors
      - “Real” ontologies are becoming more central to applications
[McGuinness 2000]

OntoSeek Example 1
[Guarino et al. 2000]

OntoSeek Screen Shot
[Guarino et al. 2000]

OntoSeek Disambiguation
[Guarino et al. 2000]

OntoBroker Architecture
[Studer 2000]

OntoPad
[Studer 2000]

Query Interface
[Studer 2000]

Hyperbolic View Interface
[Studer 2000]

Categorization
  • Feature-based Categorization
  • Hierarchical Categorization

Hierarchical Categorization
  • a set of objects is divided into smaller and smaller subsets, forming a hierarchical structure (tree) with the elementary objects as leaf nodes
  • typically one feature is used to distinguish one category from another
  • often constitutes a relatively stable “backbone” of a knowledge organization scheme
  • re-organization requires a major effort
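
The tree structure described above can be sketched with nested dictionaries. The category names reuse the library example from the Object Selection slide. The object names and the helper `find_path` are invented for illustration.

```python
# Hypothetical hierarchy: inner nodes are categories, leaf lists hold objects.
LIBRARY = {
    "physical document": {"book": ["Moby-Dick"], "report": ["TR-42"]},
    "digital document": {"file": ["notes.txt"], "Web page": ["index.html"]},
}


def find_path(tree, target, path=()):
    """Return the chain of categories leading to `target`, or None."""
    for category, subtree in tree.items():
        here = path + (category,)
        if isinstance(subtree, dict):
            found = find_path(subtree, target, here)
            if found:
                return found
        elif target in subtree:
            return here
    return None


print(find_path(LIBRARY, "notes.txt"))
```

Each object sits in exactly one place, so moving it to a different branch means editing the tree itself, which reflects why re-organization of a hierarchy is costly.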

Feature-based Categorization
  • objects or documents are assigned to categories according to commonalities in specific features
  • can be used to dynamically group objects into categories that are of interest for a particular task or purpose
  • re-organization is easy with computer support
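
Dynamic grouping by a chosen feature can be sketched as follows. The objects, their feature names, and the helper `group_by` are assumptions made up for this example.

```python
from collections import defaultdict


def group_by(objects, feature):
    """Group object names by the value of one chosen feature."""
    groups = defaultdict(list)
    for obj in objects:
        groups[obj[feature]].append(obj["name"])
    return dict(groups)


# Invented objects, each described by a set of features.
items = [
    {"name": "report A", "medium": "paper", "topic": "ontology"},
    {"name": "page B", "medium": "digital", "topic": "ontology"},
    {"name": "file C", "medium": "digital", "topic": "indexing"},
]
print(group_by(items, "medium"))
print(group_by(items, "topic"))
```

The same objects are regrouped simply by naming a different feature, with no change to the stored data. This is the sense in which re-organization is easy with computer support.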

Knowledge Organization Frameworks
  • Dublin Core
  • Resource Description Framework
  • Topic Maps

Case Studies
  • Northern Light
  • EPA TRS
  • Getty Vocabularies
  • RDF
  • Semantic Web

Post-Test

Important Concepts and Terms
  • agent
  • automated reasoning
  • belief network
  • cognitive science
  • computer science
  • hidden Markov model
  • intelligence
  • knowledge representation
  • linguistics
  • Lisp
  • logic
  • machine learning
  • microworlds
  • natural language processing
  • neural network
  • predicate logic
  • propositional logic
  • rational agent
  • rationality
  • Turing test

Summary: Knowledge Organization
