Knowledge Base Diagnostics
Richard Fikes (Stanford KSL)
Adam Pease (Teknowledge)
Mala Mehrotra (Pragati Synergetic Research Inc.)
Yolanda Gil (USC ISI)
Deborah McGuinness (Stanford KSL)
10/18/01
Knowledge Systems Laboratory, Stanford University
Knowledge Evolution Tools
- KB development requires knowledge evolution
  Debugging, refining, structuring, modularizing, …
- Power tools are needed to support KB evolution
  - KB diagnosis
    - Bugs, omissions, heuristic warnings, architectural advice
  - KB partitioning
    - To enable effective reasoning
    - To produce reusable KB building blocks
  - KB merging
    - To enable interoperation of KBs with overlapping content
- KSL is developing knowledge evolution tools
Chimaera
- A Knowledge Evolution Tool Environment
  - Tools for KB diagnosis and merging
- Available as a Web service or an OKBC client
  - www.ksl.stanford.edu/software/chimaera
  - Usable from a Web browser
  - Online user manual, tutorial, and demonstration movie
- Performs KB diagnostics in batch mode
  - Uploads and analyzes user’s KB
  - Accepts KBs in OKBC, KIF, MELD, RDF, DAML, …
  - Provides results as HTML pages linked to frames and axioms
  - Provides user-selectable set of diagnostic tests
- Analyzes both the structure and content of a KB
  - Uses reasoners to analyze content
Classification of Diagnostic Results
- Errors
  - Logical inconsistencies (e.g., contradictory type constraints)
  - Content structure errors (e.g., terms used but not defined)
- Anomalies
  - Missing information (e.g., type constraints)
  - Redundancies (e.g., redundant superclass and type links)
  - Extraneous structure or content (e.g., terms defined but not used)
- Summaries (e.g., counts of term references)
- Suggestions (e.g., use consistent naming conventions)
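The two term-level checks named on this slide (terms used but not defined, terms defined but not used) can be sketched as set differences. This is a minimal illustration, not Chimaera's implementation: the function name `check_terms`, the input shape, and the toy term names are all assumptions made for the example.

```python
def check_terms(defined, references):
    """Classify term-level findings for a toy KB (hypothetical data model).

    defined:    iterable of term names that have a definition
    references: mapping from a defined term to the terms its definition mentions
    """
    defined = set(defined)
    used = set()
    for mentioned in references.values():
        used |= set(mentioned)
    return {
        # Error: content structure problem, a term is used but never defined.
        "used_but_not_defined": sorted(used - defined),
        # Anomaly: extraneous content, a term is defined but never used.
        "defined_but_not_used": sorted(defined - used),
    }


# Toy KB with made-up terms.
report = check_terms(
    defined={"Pump", "Valve", "Gasket"},
    references={"Pump": {"Valve", "Motor"}, "Valve": set()},
)
print(report)
```

In this toy model a top-level term such as `Pump` is also reported as unused, which matches the slide's framing of such findings as anomalies to review rather than errors.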
“Background” Reasoning Analysis
- Reasoning diagnostics that may take substantial time
  - Performed in background
  - Results incrementally posted on Web page
  - Completion notification sent to user via e-mail
- Example reasoning diagnostics
  - Redundant axioms that are inferred by the KB (anomaly)
  - Inconsistent axioms whose negations are inferred by the KB (error)
  - Determine which relations in KB are primitive and non-primitive (summary)
    - Show relations on which each non-primitive relation depends
  - Determine classes that are disjoint (suggest adding results to KB)
  - Derive subclass and instance links (suggest adding links to KB), i.e., classification and recognition
  - Suggest reordering of an implication’s antecedents based on number of inferable instances of each antecedent (suggestion)
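The antecedent-reordering suggestion can be illustrated with a small sketch. The slide does not say which direction the ordering should go; the sketch assumes the antecedent with the fewest inferable instances should come first, so that the most selective test runs earliest. The function name, the literal strings, and the instance counts are hypothetical.

```python
def suggest_antecedent_order(antecedents, instance_counts):
    """Return antecedents sorted by ascending number of inferable instances.

    Assumption (not stated in the slides): fewest instances first.
    sorted() is stable, so ties keep their original relative order.
    """
    return sorted(antecedents, key=lambda literal: instance_counts[literal])


# Hypothetical rule body and made-up instance counts.
antecedents = ["person(x)", "owns(x, y)", "pump(y)"]
counts = {"person(x)": 10000, "owns(x, y)": 500, "pump(y)": 12}
print(suggest_antecedent_order(antecedents, counts))
```

A real reasoner would also need to respect variable-binding dependencies between antecedents, which this sketch ignores.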
Integration Into SHAKEN
- Chimaera is a KB diagnostics tool in the SHAKEN system
  - Used to diagnose both pump priming and SME KBs
- OKBC was used to do the integration
  - Chimaera is an OKBC client
    - Interacts with any OKBC server using the OKBC API
    - The Chimaera Web service uses Ontolingua as its OKBC server
  - SRI added an OKBC wrapper to the KM system
    - Enabled KM to be an OKBC server usable by OKBC clients
    - Enabled Chimaera’s diagnostics to run directly on KM KBs
Chimaera Useful To SRI Team
“Overall, we found that Chimaera was quite useful. It found 2 concepts (Indole and Imidazole) that were corrupted, several occurrences of redundant superclasses, and several incorrect domain and range constraints (due to our poor representation of "Information"). … We're currently fixing the bugs it revealed. It would be helpful if we could run Chimera on the component library frequently.”
(Bruce Porter)
Next Steps: SME-Oriented Support
- Provide interactive, repair-oriented follow-up to diagnostics
  - Identify KB content on which diagnosis result is based
  - Suggest repairs or repair strategies
  - Guide user through repair procedure
- Examples
  - Class is a direct subclass of “THING”
    - Provide direct subclasses of THING as candidate superclasses
    - Step down through the class hierarchy
  - Class has redundant superclass links
    - Suggest removal of link(s) to most general classes
  - Type, cardinality, or bounds conflict
    - Suggest changing local conflicting constraint(s)
  - Missing information
    - Initiate acquisition dialogues for missing information
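The redundant-superclass repair suggestion can be sketched as a reachability check: a direct link from a class to a superclass is redundant when that superclass is already an ancestor of another direct superclass. The data model (a dict of direct superclass lists), the function names, and the class names below are assumptions for illustration, not Chimaera's representation.

```python
def ancestors(cls, direct_supers):
    """All classes reachable from cls by following direct superclass links."""
    seen = set()
    stack = list(direct_supers.get(cls, ()))
    while stack:
        current = stack.pop()
        if current not in seen:  # guards against cycles in the hierarchy
            seen.add(current)
            stack.extend(direct_supers.get(current, ()))
    return seen


def redundant_superclass_links(direct_supers):
    """Return (class, superclass) links implied by another direct superclass."""
    findings = []
    for cls, supers in direct_supers.items():
        for sup in supers:
            others = [o for o in supers if o != sup]
            if any(sup in ancestors(o, direct_supers) for o in others):
                # The more general link is the one suggested for removal.
                findings.append((cls, sup))
    return sorted(findings)


# Hypothetical hierarchy: Pump links to both Device and its ancestor Thing.
hierarchy = {"Pump": ["Device", "Thing"], "Device": ["Thing"], "Thing": []}
print(redundant_superclass_links(hierarchy))
```

Each reported pair names the link to the more general class, in line with the slide's suggestion to remove links to the most general classes.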
Next Steps: Architectural Analysis
- Summarize architectural features of a KB
  - Percentage of
    - Relations that are functions
    - Axioms that are propositional, first order, higher order
    - Axioms that are not Horn clauses
  - Distribution of
    - Axioms by type (using the HPKB, RKF types)
    - Axiom lengths by number of literals
    - Functions by number of arguments
    - Relations by number of arguments
    - Direct subclasses per class
    - Direct subproperties per property
    - Restrictions per object
    - Property values per object
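Two of the distributions listed above (axiom lengths by number of literals, relations by number of arguments) can be sketched with simple counters. The axiom encoding, a list of literals where each literal is a tuple of relation name followed by arguments, is an assumption chosen for the example, as are the function and relation names.

```python
from collections import Counter


def architecture_summary(axioms):
    """Summarize a toy KB whose axioms are lists of (relation, *args) literals.

    Returns (distribution of axiom lengths, distribution of relation arities).
    """
    # Axiom length = number of literals in the axiom.
    length_dist = Counter(len(axiom) for axiom in axioms)

    # Record one arity per relation name, then count relations per arity.
    arity_of = {}
    for axiom in axioms:
        for relation, *args in axiom:
            arity_of[relation] = len(args)
    arity_dist = Counter(arity_of.values())
    return length_dist, arity_dist


# Made-up axioms for illustration.
axioms = [
    [("parent", "x", "y"), ("ancestor", "x", "y")],
    [("person", "x")],
    [("parent", "x", "y"), ("ancestor", "y", "z"), ("ancestor", "x", "z")],
]
lengths, arities = architecture_summary(axioms)
print(dict(lengths), dict(arities))
```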
Next Steps: Partitioning and Beyond
- Integration of KB partitioning tools into Chimaera
  - Provide automatic KB partitioning to enhance usability
- Automatic running of test cases (e.g., queries and expected answers)
  - Support regression testing of evolving KB
  - Provide result summaries from failed tests
- Help with typographical errors
  - Spelling correction for undefined names (e.g., classes, slots, relations, functions, constants)
  - Spelling correction for anomalously occurring variables
    - Suggest that the variable is the same as another variable in the sentence
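Spelling correction for undefined names can be sketched with approximate string matching against the set of defined names. The sketch below uses Python's standard `difflib.get_close_matches`; the slides do not say which matching technique is planned, so this choice, the function name, and the sample names are assumptions.

```python
import difflib


def suggest_spellings(undefined_names, defined_names, max_suggestions=3):
    """Map each undefined name to defined names that look like near misses."""
    candidates = sorted(defined_names)
    return {
        name: difflib.get_close_matches(name, candidates, n=max_suggestions)
        for name in undefined_names
    }


# Hypothetical KB vocabulary and two undefined references.
defined = {"Pump", "Valve", "Gasket"}
print(suggest_spellings(["Valv", "Rotor"], defined))
```

An empty suggestion list signals that the name is probably a genuinely missing definition rather than a typo.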
Summary
- KSL is developing Chimaera to support KB evolution
- Chimaera was integrated into the SHAKEN Y1 system
  Using OKBC(!)
- Incrementally adding diagnostics
  E.g., “background” diagnostics that use sophisticated reasoning
- Next steps
  - KB partitioning tools
  - Repair dialogues for SMEs
  - KB architectural analysis
  - Regression testing
Role of Diagnostics in Systems
- KE support
- SME support
- Increase productivity (“lightly trained”)
  - Step in managing KB development
- Focus attention (e.g., redundant links)
- Evaluation support
  - Diagnose KBs produced during evaluation
- Batch mode
  - Foreground
  - Background
- Changes in “patterns” in the KB between versions
Sharing Diagnostics Information
- Diagnostic specifications
  - Logical specifications
  - English specifications
  - Test cases
- Diagnostic classifications
- Learnings
- Tricks of the trade
- Sharing facilitators:
  - Working group
  - Mailing list
- Findings data
  - Author, group, or team specific
- Repair strategies
- Alignments during collaborative development
Developer Needs and Desires
- Reasoner-specific diagnostics
- Highly informative diagnostic results
- Reporting architectural bias in a KB
  - Binary versus higher order relations
  - First order versus higher order axioms
    - Weakly versus strongly higher order
  - Disjunctions or conjunctions
  - Existential versus universal quantifiers
  - Frames to axioms ratios
  - Horn clauses
  - Axiom lengths
  - Functions
- Confusion of existential and universal quantifiers
- Type restrictions too general
- Misspelling of variables
Developer Needs and Desires (continued)
- Domain-specific tests
- Semantic tests
- Maintainability measures
- Recognizing typographical errors
- Spell check undefined or unused terms
- Redefining (e.g., breaking up) a predicate
  - Large scale modification techniques
- Prioritizing diagnostics
Integration Issues
- Architecture
  - Use hosted services (like KSL)
  - Integrate special code
  - Take specifications from library
- API
- Interaction Mode: Batch versus Interactive/Repair
- Translation issues
  - One major use of diagnostics is also in testing translators
  - Certain translations need to be done to do better analysis
- Output integration
Evaluation
- Record types and numbers of errors
  - Comparing KBs produced by SMEs versus KEs
- Record use of repair strategies
- Evaluate during testing
- Feedback from SMEs about diagnostics
Classification of Diagnostic Results
- Errors
  - Logical inconsistencies
  - Content structure errors
  - (See Randy Davis thesis)
- Anomalies
  - Missing information
    - Missing portions of descriptions
  - Redundancies
  - Extraneous structure or content
- Summaries
  - Architectural biases
- Suggestions
  - Stylistic suggestions
- Static versus operational tests
- Use of expertise about KR paradigms
Diagnostic Issues/Goals
- Role of Diagnostics in Systems
  - KE support, SME support
  - Evaluators of KBs
- How to Share Diagnostics
  - Working Group?
  - Logical specification, English descriptions, tests, …
- Know the Main Contributors
- Possible Diagnostics
  - What do users want?
  - What can tool builders provide?
- Integration Issues
- Developer Needs/Desires
- Evaluation
The Role of KB Diagnostics
- KE support
- SME support
- Increase productivity (“lightly trained”)
- Management of KB
- Inference-dependent quality improvement
- Focus attention (e.g., redundant links)
- Evaluation support
- Abstract patterns: average fanout of specialization, statistics of number of uses of a predicate (big picture view)
- Version comparison
- Regression testing
Diagnostic Sharing
- Diagnostic specifications
  - Logical specifications
  - English specifications
  - Test cases
- Diagnostic classifications
- Taxonomy of errors: bottlenecks
  - Quantification
- Alignments across systems: inconsistencies among SMEs
- Repair strategies
- How informative a system is (core dump vs. useful explanation)
- Learnings
Sharing facilities
- Working group
- Mailing list
- Posting of papers
- Utilize Teknowledge
Biases
- Binary vs. higher arity
- First order vs. higher order
  - Weakly vs. strongly higher order
- Universal over existential
- Disjunction vs. conjunction
- Frame-ism
- Horn clauses
- Lisp style
- Relations -> functions
- Depth vs. breadth in hierarchy
- …
- Maybe report in summarizations
- At least document biases
Organizations/People
- Cycorp: many special purpose (Kahlert)
- ISI: Why Not? (Chalupsky), KANAL (Gil), EXPECT (Gil)
- Pragati: Clustering (Mehrotra)
- Stanford FRG/KSL: Partitioning (McCarthy, Amir, McIlraith)
- Stanford KSL: Chimaera (Fikes, McGuinness)
Diagnostics
- Errors: provable logical inconsistencies
- Anomalies: redundancies, cycles, …
- Summaries: word counts, …
- Suggestions: naming conventions
- Incompletenesses: explicit salient assertions or statistics
- Stylistics: length of rule, …, bad factoring
- Randy Davis: errors (incompleteness, inconsistency)
- Get this: top ten list of things people do wrong in Cyc (Goolsbey)
Perspectives/units: frame-like content vs. axioms vs. problem solving technology vs. learning to correct components
Style
- Static
- Reasoner
- Simulation / execution
- Using examples
- Summarization/improvements/critiquer
Integration Issues
- Architecture
  - Use hosted services (like KSL)
  - Integrate special code
  - Take specifications from library
- API
- Interaction Mode: Batch vs. Interactive/Repair
- Translation issues
  - One major use of diagnostics is also in testing translators
  - Certain translations need to be done to do better analysis
  - Background ontologies: MELD starter ontology
- Output integration
Developer Needs/Desires
- Missing existentials
- Too high a type specification
- Variable name mismatch
- Semantic requests: wrong semantic paradigm?
- Typos
- Spell check
- Large scale modification tools and their integration
  - Example: removal/fixing top level
- Prioritizing diagnostics to minimize cost, ease maintenance
Evaluation
- Record types of errors
  - Fine granularity
- KB differences across SME- vs. KE-developed ontologies across team
- Record use of repair strategies…
- Evaluate during testing…
- Feedback from SMEs on features, usefulness, etc.
- Attempt to keep extremely complete audit trails for future analysis
- Important to be careful with diagnostic reporting
Action Items
- Working Group
- Diagnostics repository
- Web site
- Follow up briefing
- Mailing list
Chimaera
- A Knowledge Evolution Environment
  - Tools for KB diagnosis and merging
- Available as a Web service
  - www-ksl-svc.stanford.edu
  - www.ksl.stanford.edu/software/chimaera
  - Usable from a Web browser
  - Online user manual, tutorial, and demonstration movie
  - Provides user-selectable set of diagnostic tests
- Performs KB diagnostics in batch mode
  - Uploads and analyzes user’s KB
  - Accepts KBs in MELD, KIF, OKBC, DAML, RDF, XML, …
  - Provides results as HTML pages linked to frames and axioms
- Analyzes both the structure and content of a KB
  - Uses hybrid reasoners to analyze content
  - Currently runs 28 diagnostic tests
Collection/Specification
- Logical specification of diagnostic
- English specification
- Example KB that triggers diagnostic output
Classification of Diagnostic Results II
- Axiom Analysis
  - Axiom Syntax Problems (e.g., no consequent in an implication)
  - Axiom Redundancy (e.g., given 1. A => B, 2. A => C, 3. C => B, axiom 1 is redundant)
  - Axiom Variable Usage (e.g., variable used in antecedent but not in consequent)
  - Axiom Consistency (e.g., A => not A)
  - Axiom Tautology (e.g., consequent repeats (a portion of) the antecedent)
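The axiom redundancy example (A => B is redundant given A => C and C => B) can be sketched as a reachability test over the remaining implications. This is a minimal sketch restricted to implications between single propositional atoms; the function name and pair encoding are assumptions for the example, and a real diagnostic over first-order axioms would need a reasoner.

```python
def redundant_implications(implications):
    """Find implications (a, b) that follow from the other implications.

    implications: list of (antecedent, consequent) pairs over atomic symbols.
    Each implication is tested independently against all the others.
    """
    redundant = []
    for index, (antecedent, consequent) in enumerate(implications):
        others = implications[:index] + implications[index + 1:]
        # Collect everything derivable from the antecedent via the others.
        reached = {antecedent}
        frontier = [antecedent]
        while frontier:
            current = frontier.pop()
            for left, right in others:
                if left == current and right not in reached:
                    reached.add(right)
                    frontier.append(right)
        if consequent in reached:
            redundant.append((antecedent, consequent))
    return redundant


# The slide's example: axiom 1 follows from axioms 2 and 3.
axioms = [("A", "B"), ("A", "C"), ("C", "B")]
print(redundant_implications(axioms))
```

Because each axiom is checked on its own, two copies of the same implication are both reported; a repair step would remove only one of them.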