Expert Systems

What are Expert Systems?
“An expert system is a computer system that operates by applying an inference mechanism to a body of specialist expertise represented in the form of knowledge that manipulates this knowledge to perform efficient and effective problem solving in a narrow problem domain.”

Emphasis is on Knowledge not Methods
1. Most difficult and interesting problems do not have tractable algorithmic solutions
2. Human experts achieve outstanding performance because they are knowledgeable
3. Knowledge is a scarce (and therefore valuable) resource
It is better to call these systems: Knowledge-Based Systems

Fundamental Concepts
• Knowledge consists of descriptions, relationships, and procedures in some domain
• Knowledge takes many forms and is often hard to categorize

Changing Focus in AI
[Figure: program power (LOW to HIGH) plotted against time frame (1960, 1970, 1980)]
1. Find general methods for problem-solving and use them to create general-purpose programs
2. Find general methods to improve representation and search and use them to create specialized programs
3. Use extensive, high-quality, specific knowledge about some narrow problem area to create very specialized programs

Heuristics vs. Algorithms
It is great if we have algorithms, but often a heuristic will work almost as well at much less cost.
Example: Prevent Skyjacking (algorithm vs. heuristic)

Why should we not use human expertise?
Human Expertise          Artificial Expertise
Perishable               Permanent
Difficult to transfer    Easy to transfer
Difficult to document    Easy to document
Unpredictable            Consistent
Expensive                Affordable

Why should we keep using humans?
Human Expertise          Artificial Expertise
Creative                 Uninspired
Adaptive                 Needs to be told
Sensory experience       Symbolic input
Broad focus              Narrow focus
Commonsense knowledge    Technical knowledge
• knows certain things are true, while others are not
• knows limits of knowledge

Knowledge Engineering
The process of building an expert system
[Diagram: Domain Expert, Knowledge Engineer, Expert System]

Views of an Expert System: End-user
[Diagram: Intelligent Program, User Interface, Data Base]

Views of an Expert System: Knowledge Engineer
Intelligent Program
• Knowledge Base: Rules, Semantic Networks, Frames, and Facts
• Inference Engine: General Problem Solving Knowledge

Forms of Inference
• Process of drawing conclusions based on facts known or thought to be true
• We commonly use three different types:
  – Deduction
  – Abduction
  – Induction

Deduction
Reasoning from a known principle to an unknown, from the general to the specific, or from a premise to a logical conclusion
• Modus Ponens
• Modus Tollens
• rules, theorems, models

Deduction - Example
Suppose that we know:
  ∀X, swimming(X) -> wet(X)
If we are now told:
  swimming(andy)
Then we can derive (using Modus Ponens):
  wet(andy)
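The derivation above can be sketched in a few lines of Python. This is only an illustration, not part of the original deck: the tuple encoding of facts and the function names `modus_ponens` and `modus_tollens` are assumptions made for the example.

```python
# Rule from the slide: for all X, swimming(X) -> wet(X).
# A rule is encoded as (premise predicate, conclusion predicate);
# a fact is encoded as (predicate, individual). Both encodings are assumptions.
RULE = ("swimming", "wet")

def modus_ponens(rule, facts):
    """From P(x) and P -> Q, derive Q(x) for every matching x."""
    premise, conclusion = rule
    return {(conclusion, who) for pred, who in facts if pred == premise}

def modus_tollens(rule, negated_facts):
    """From not Q(x) and P -> Q, derive not P(x).
    Input and output are sets of facts known to be false."""
    premise, conclusion = rule
    return {(premise, who) for pred, who in negated_facts if pred == conclusion}

derived = modus_ponens(RULE, {("swimming", "andy")})   # wet(andy)
```

Both functions are sound: whatever they return is guaranteed by the rule and the input facts.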

Abduction
• Used when generating explanations
• Is an unsound form of reasoning
If we know:
  ∀X, swimming(X) -> wet(X)
  wet(alex)
Can we state:
  swimming(alex) ?
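A small Python sketch may help show why abduction is unsound. It runs rules backwards to collect candidate explanations. The second rule (`rained_on`) is hypothetical and is added only to show that more than one explanation can fit the same observation.

```python
# Rules as (premise predicate, conclusion predicate).
# The first rule is from the slide; the second is a hypothetical addition.
RULES = [("swimming", "wet"), ("rained_on", "wet")]

def abduce(rules, observation):
    """Return every premise that would explain the observation.
    The results are candidate explanations, not proven facts."""
    pred, who = observation
    return [(premise, who) for premise, conclusion in rules if conclusion == pred]

candidates = abduce(RULES, ("wet", "alex"))
```

Here `candidates` holds both swimming(alex) and rained_on(alex). Nothing in the rules says which one is true, so swimming(alex) cannot be stated with certainty.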

Induction
Reasoning from particular facts or individual cases to a general conclusion
• This is the basis of scientific discovery
• Key technique in machine learning and knowledge acquisition
[Diagram: IF ... THEN rule; generalization, observation]

Example - Family Relationships
Basic domain facts:
child_of(alex, nicole).      child_of(alina, nicole).
child_of(nicholas, leah).    child_of(phillip, leah).
child_of(melanie, cathy).    child_of(leslie, cathy).
child_of(sarah, cathy).      child_of(angela, cathy).
male(alex).     male(phillip).   male(nicholas).
female(alina).  female(leah).    female(nicole).   female(angela).
female(sarah).  female(leslie).  female(melanie).  female(cathy).

Example - Family Relationships
Rules:
sisters(nicole, leah).
sisters(X, Z) :- child_of(X, Y), child_of(Z, Y), female(X), female(Z).
brothers(X, Z) :- child_of(X, Y), child_of(Z, Y), male(X), male(Z).
(cont.)
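To see what the two rules derive from the facts, here is a rough Python rendering of the same knowledge base. The dictionary and set encoding and the helper name `siblings_of_sex` are assumptions. The sketch also excludes X = Z, which the rules as written do not, so that nobody is listed as their own sibling.

```python
# Facts from the slide: child -> parent, plus the male and female sets.
child_of = {"alex": "nicole", "alina": "nicole", "nicholas": "leah", "phillip": "leah",
            "melanie": "cathy", "leslie": "cathy", "sarah": "cathy", "angela": "cathy"}
male = {"alex", "phillip", "nicholas"}
female = {"alina", "leah", "nicole", "angela", "sarah", "leslie", "melanie", "cathy"}

def siblings_of_sex(sex):
    """Ordered pairs (X, Z) sharing a parent Y, both in `sex`, with X != Z (an added assumption)."""
    return {(x, z) for x in sex for z in sex
            if x != z and x in child_of and child_of.get(z) == child_of[x]}

sisters = siblings_of_sex(female) | {("nicole", "leah")}   # rule plus the stated fact
brothers = siblings_of_sex(male)
```

With these facts, `brothers` contains only the two orderings of phillip and nicholas, and `sisters` contains the pairs among cathy's four daughters plus the stated fact sisters(nicole, leah).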

Machine Learning (Induction from Examples)

What is learning?
• “changes in a system that enable a system to do the same task more efficiently the next time” -- Herbert Simon
• “constructing or modifying representations of what is being experienced” -- Ryszard Michalski
• “making useful changes in our minds” -- Marvin Minsky

What is learning?
• Shorter Oxford Dictionary defines learning as:
  – … to get knowledge of (a subject) or skill (in art, etc.) by study, experience or teaching. Also to commit to memory …
• so learning involves
  – acquiring NEW knowledge
  – improving the use of EXISTING knowledge (i.e., performance)

Why learn?
• understand and improve human learning
  – learn to teach
• discover new things
  – data mining
• fill in skeletal information about a domain
  – incorporate new information in real time
  – make systems less “finicky” or “brittle” by making them better able to generalize

Why learn?
• learning is considered to be a KEY element of AI
• any autonomous system MUST be able to learn and adapt
• sometimes it is easier to ‘teach’ or ‘explain’ than to ‘program’
  – e.g., consider the difference in explaining tic-tac-toe and writing a program to play the game
  – e.g., consider the difference in using a few example pictures to explain the difference between a lion and a tiger, and getting a computer to do likewise
• any system that makes the same mistake twice is pretty STUPID
  – all systems (e.g., o/s, database) should have some integral learning component

State of the Art
• modest achievements
• mostly isolated solutions to date
• but can
  – assist automatic knowledge acquisition
  – extract relevant knowledge from very large knowledge bases
  – abstract higher-level concepts out of data sets
  – … etc.
• recent trend to integrated systems
  – combine various learning methods
    • induction, deduction, analogy, abduction
    • symbolic ML, neural networks, genetic algorithms, ...

Components of a learning system

Evaluating Performance
• several possible criteria
  – predictive accuracy of classifier
  – speed of learner
  – speed of classifier
  – space requirements
• Most common criterion is Predictive Accuracy
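Predictive accuracy is usually measured as the fraction of held-out examples that the learned classifier labels correctly. A minimal sketch follows; the classifier and the four test examples are hypothetical and serve only to exercise the function.

```python
def predictive_accuracy(classifier, test_set):
    """Fraction of (example, true_label) pairs that the classifier labels correctly."""
    correct = sum(1 for example, label in test_set if classifier(example) == label)
    return correct / len(test_set)

# Hypothetical classifier and held-out examples, for illustration only.
classify = lambda x: "+" if x["eyes"] == "brown" else "-"
held_out = [({"eyes": "brown"}, "+"), ({"eyes": "blue"}, "-"),
            ({"eyes": "blue"}, "+"), ({"eyes": "brown"}, "+")]

accuracy = predictive_accuracy(classify, held_out)   # 3 of the 4 labels match
```

The examples used for this measurement should not be the ones used for training, otherwise the figure says little about future cases.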

Symbolic vs. Numeric
• ML has traditionally concerned itself with symbolic representations
  – e.g., [color = orange] rather than [wavelength = 600 nm]
• concepts are inherently symbolic
• required for human understanding and recognition
  – we think in linguistic terms (i.e., symbols) and not in numbers
  – e.g., bird := has-wings & flies & has-beak & lays-eggs & ...
• the relationship between symbolic & numerical representations is still an open debate

Learning as Search
• to learn a concept description, need to search through a ‘hypothesis’ space
  – the space of possible concept descriptions
• need a language to describe the concepts
  – the choice of language defines a large (possibly infinite) set of potential concept descriptions (i.e., rules)
• the task of the learning algorithm is to search this space in an efficient manner
• the difficulty is how to ignore the vast majority of invalid descriptions without missing the useful one(s)
• usually requires heuristic methods to prune the search
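The idea of a hypothesis space defined by a description language can be made concrete with a toy example. In this sketch the language is assumed to be conjunctions that either fix an attribute to one value or leave it free. The attribute values are borrowed from the height/hair/eyes example later in the deck, and the training examples are hypothetical. The search is exhaustive, which is only feasible because the space is tiny; real learners need heuristics to prune it.

```python
from itertools import product

ATTRIBUTES = {"hair": ["dark", "blond", "red"], "eyes": ["blue", "brown"]}

def hypothesis_space(attributes):
    """Every conjunction that fixes each attribute to a value or leaves it free (None)."""
    names = list(attributes)
    choices = [[None] + attributes[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*choices)]

def covers(hypothesis, example):
    """True if the example satisfies every fixed attribute of the hypothesis."""
    return all(v is None or example[a] == v for a, v in hypothesis.items())

def search(attributes, positives, negatives):
    """Keep only descriptions that cover all positives and no negatives."""
    return [h for h in hypothesis_space(attributes)
            if all(covers(h, x) for x in positives)
            and not any(covers(h, x) for x in negatives)]

space = hypothesis_space(ATTRIBUTES)          # (3 + 1) * (2 + 1) descriptions
found = search(ATTRIBUTES,
               positives=[{"hair": "blond", "eyes": "blue"}],
               negatives=[{"hair": "blond", "eyes": "brown"},
                          {"hair": "dark", "eyes": "blue"}])
```

Even with two attributes the space has 12 descriptions, and it grows multiplicatively with every attribute added to the language.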

Summary
• Decision Trees are widely used
  – easy to understand rationale
  – can out-perform humans
  – fast, simple to implement
  – handle noisy data well
• Weaknesses
  – univariate (uses only 1 variable at a time)
  – batch (non-incremental)

Induction systems
The power behind an intelligent system is knowledge. We can trace the system's success or failure to the quality of its knowledge.
Difficult tasks:
1. Extracting the knowledge.
2. Encoding the knowledge.
3. Inability to express the knowledge formally.

Induction
Induction: inducing general rules from knowledge contained in a finite set of examples.
Induction is the process of reasoning from a given set of facts to conclude general principles or rules.
Induction looks for patterns in available information to infer reasonable conclusions.

Induction as search
Induction can be viewed as a search through a problem space for a solution to a problem. The problem space is composed of the problem’s major concepts linked together by an inductive process that uses examples of the problem.

Induction
The choice of representation for the desired function is probably the most important issue. As well as affecting the nature of the algorithm, it can affect whether the problem is feasible at all. Is the desired function representable in the representation language?
An example is described by the values of the attributes and the value of the goal predicate. We call the value of the goal predicate the classification of the example. The complete set of examples is called the training set.

Induction - first example
Determine an appropriate gift on the basis of available money and the person’s age. Money and age will represent our decision factors (problem attributes).
Money    Age    Gift
Much     Adult  Car
Much     Child  Computer
Little   Adult  Toaster
Little   Child  Calculator
[Decision tree diagram: nodes Age and Money]
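One way to picture the tree induced from this table is as nested dictionaries, with a small function that walks from the root to a leaf. The sketch below assumes Money is tested first; the table alone does not fix the order, and testing Age first would give an equally valid tree. The names `GIFT_TREE` and `decide` are illustrative.

```python
# Internal nodes name the attribute they test; leaves are the gift.
# Assumption: Money is tested at the root, then Age.
GIFT_TREE = {
    "attribute": "money",
    "branches": {
        "much":   {"attribute": "age", "branches": {"adult": "car", "child": "computer"}},
        "little": {"attribute": "age", "branches": {"adult": "toaster", "child": "calculator"}},
    },
}

def decide(tree, case):
    """Follow the branch matching the case's value at each node until a leaf is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"][case[tree["attribute"]]]
    return tree

gift = decide(GIFT_TREE, {"money": "much", "age": "child"})   # "computer"
```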

Induction - decision trees
A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no “decision.” Decision trees therefore represent Boolean functions.
Each internal node in the tree corresponds to a test of the value of one of the properties, and the branches from the node are labeled with the possible values of the test. Each leaf node in the tree specifies the Boolean value to be returned if that leaf is reached.

Induction - decision trees
Decision trees are implicitly limited to talking about a single object. That is, the decision tree language is essentially propositional, with each attribute test being a proposition. We cannot use decision trees to represent tests that refer to two or more different objects.
Decision trees are fully expressive within the class of propositional languages, that is, any Boolean function can be written as a decision tree. To see this, have each row in the truth table for the function correspond to a path in the tree. The truth table is exponentially large in the number of attributes.
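The truth-table argument can be checked mechanically. The sketch below builds a full decision tree for an arbitrary Boolean function by testing the attributes in a fixed order, so that every root-to-leaf path is one row of the truth table. The function names and the three-input XOR used as the test function are assumptions made for the example.

```python
from itertools import product

def tree_from_function(f, n, assigned=()):
    """Full decision tree for a Boolean function f of n attributes.
    The node at depth d tests attribute d; each path is one truth-table row."""
    if len(assigned) == n:
        return f(*assigned)                      # leaf: the value for this row
    return {value: tree_from_function(f, n, assigned + (value,))
            for value in (False, True)}

def evaluate(tree, values):
    """Walk the tree along the given attribute values and return the leaf."""
    for v in values:
        tree = tree[v]
    return tree

def count_leaves(tree):
    return 1 if not isinstance(tree, dict) else sum(count_leaves(t) for t in tree.values())

xor3 = lambda a, b, c: a ^ b ^ c                 # illustrative Boolean function
tree = tree_from_function(xor3, 3)
rows = list(product((False, True), repeat=3))    # the 2**3 truth-table rows
```

The tree for three attributes has 2**3 = 8 leaves, which is the exponential growth the slide mentions; practical learners look for much smaller trees that agree with the examples.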

Supervised Concept Learning
• given a training set of positive and negative examples of a concept
  – construct a description that will accurately classify future examples
  – learn some good estimate of function f given a training set {(x1, y1), (x2, y2), ..., (xn, yn)}, where each yi is either + (positive) or - (negative)
• inductive learning generalizes from specific facts
  – cannot be proven true, but can be proven false
    • falsity preserving
  – is like searching a Hypothesis Space H of possible f functions
  – bias allows us to pick which h is preferable
  – need to define a metric for comparing f functions to find the best
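The point that an induced hypothesis can be refuted but never proven can be shown with a short sketch. The numeric training set and the threshold hypothesis `h` below are hypothetical; only the + / - labelling follows the slide.

```python
def consistent(h, examples):
    """True if hypothesis h reproduces every label seen so far."""
    return all(h(x) == y for x, y in examples)

# Hypothetical training set: x is a number, y is '+' or '-'.
training_set = [(1, "-"), (2, "-"), (6, "+"), (8, "+")]
h = lambda x: "+" if x > 4 else "-"      # one candidate from the hypothesis space H

ok_so_far = consistent(h, training_set)                   # agrees with all four examples
refuted = not consistent(h, training_set + [(9, "-")])    # one counterexample falsifies h
```

Agreement with the training set only means `h` has not been falsified yet. Many other thresholds between 2 and 6 fit the same data equally well, which is where a bias is needed to prefer one hypothesis over another.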

Inductive learning framework
• raw input is a feature vector, x, that describes the relevant attributes of an example
• each x is a list of n (attribute, value) pairs
  – x = (person=Sue, major=CS, age=Young, Gender=F)
• attributes have discrete values
  – all examples have all attributes
• each example is a point in n-dimensional feature space
• maintain a library of previous cases
• when a new problem arises
  – find the most similar case(s) in the library
  – adapt the similar cases to solving the current problem
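The retrieval step ("find the most similar case") can be sketched with a simple similarity measure: the number of attributes on which two feature vectors agree. That measure is an assumption, since the slide does not say how similarity is computed. The first stored vector is the slide's example; the second case, the query, and both solution labels are hypothetical. The adaptation step is not shown.

```python
def similarity(a, b):
    """Count the attributes on which two feature vectors agree."""
    return sum(1 for attr in a if a[attr] == b.get(attr))

def most_similar(library, query):
    """Return the stored (features, solution) case closest to the query."""
    return max(library, key=lambda case: similarity(case[0], query))

library = [
    ({"person": "Sue", "major": "CS", "age": "Young", "gender": "F"}, "solution-A"),
    ({"person": "Bob", "major": "Math", "age": "Old", "gender": "M"}, "solution-B"),
]
query = {"person": "Ann", "major": "CS", "age": "Young", "gender": "F"}

features, solution = most_similar(library, query)   # matches the first case on 3 attributes
```

Because all examples have all attributes and the values are discrete, this count is well defined for every pair of cases.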

Learning Decision Trees
• Goal: Build a decision tree for classifying examples as positive or negative instances of a concept
• Supervised
  – batch processing of training examples
  – using a preference bias

Induction - decision trees - second example

Induction - decision trees - second example
• If there are some positive and some negative examples, then choose the best attribute to split them.
• If all the remaining examples are positive (or all negative), then we are done: we can answer Yes or No.
• If there are no examples left, it means that no such example has been observed, and we return a default value calculated from the majority classification at the node’s parent.
• If there are no attributes left, but both positive and negative examples, we have a problem. It means that these examples have exactly the same description, but different classifications. This happens when some of the data are incorrect; we say there is noise in the data. It also happens when the attributes do not give enough information to fully describe the situation, or when the domain is truly nondeterministic.
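The four cases above translate almost directly into a recursive procedure. The sketch below is one possible reading, not the deck's own code. Two choices are assumptions: the "best" attribute is taken to be the one leaving the fewest misclassified examples (a simple stand-in for an information-theoretic measure), and when attributes run out with mixed labels the sketch falls back to a majority vote. The data are the eight height/hair/eyes examples that appear later in the deck, with classes 1 and 2 in place of positive and negative.

```python
from collections import Counter

def majority(examples):
    return Counter(label for _, label in examples).most_common(1)[0][0]

def split_error(examples, attr):
    """Examples left misclassified if each branch of attr answered with its majority label."""
    groups = {}
    for x, label in examples:
        groups.setdefault(x[attr], []).append(label)
    return sum(len(g) - Counter(g).most_common(1)[0][1] for g in groups.values())

def learn_tree(examples, attributes, values, default=None):
    if not examples:                      # no examples: use the parent's majority
        return default
    labels = {label for _, label in examples}
    if len(labels) == 1:                  # all one class: done
        return labels.pop()
    if not attributes:                    # noise or missing information: majority vote (assumption)
        return majority(examples)
    best = min(attributes, key=lambda a: split_error(examples, a))
    rest = [a for a in attributes if a != best]
    return {best: {v: learn_tree([e for e in examples if e[0][best] == v],
                                 rest, values, majority(examples))
                   for v in values[best]}}

ROWS = [("tall", "dark", "blue", 1), ("short", "dark", "blue", 1), ("tall", "blond", "blue", 2),
        ("tall", "red", "blue", 2), ("tall", "blond", "brown", 1), ("short", "blond", "blue", 2),
        ("short", "blond", "brown", 1), ("tall", "dark", "brown", 1)]
EXAMPLES = [({"height": h, "hair": r, "eyes": e}, c) for h, r, e, c in ROWS]
VALUES = {"height": ["tall", "short"], "hair": ["dark", "blond", "red"], "eyes": ["blue", "brown"]}

tree = learn_tree(EXAMPLES, ["height", "hair", "eyes"], VALUES)
```

On these data the error count ties between hair and eyes at the root and the sketch takes the first of the two in the attribute list, so the result is a tree that tests hair first and then eyes for the blond branch.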

Induction - decision trees - choice of attributes
Information theory provides a mathematical model for choosing the best attribute, and methods for dealing with noise in the data.
The scheme used in decision tree learning for selecting attributes is designed to minimize the depth of the final tree. The idea is to pick the attribute that goes as far as possible toward providing an exact classification of the examples. A perfect attribute divides the examples into sets that are all positive or all negative.
The measure should have its maximum value when the attribute is perfect and its minimum value when the attribute is of no use at all.
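A measure with these properties is information gain, as used in ID3-style learners. The slide does not name the measure, so treating it as information gain is an assumption. With p positive and n negative examples at a node, and attribute A splitting them into v subsets holding p_i positive and n_i negative examples each (taking 0 log 0 = 0):

```latex
\[
I\!\left(\tfrac{p}{p+n}, \tfrac{n}{p+n}\right)
  = -\tfrac{p}{p+n}\log_2\tfrac{p}{p+n} - \tfrac{n}{p+n}\log_2\tfrac{n}{p+n}
\]
\[
\mathrm{Remainder}(A)
  = \sum_{i=1}^{v} \tfrac{p_i+n_i}{p+n}\,
    I\!\left(\tfrac{p_i}{p_i+n_i}, \tfrac{n_i}{p_i+n_i}\right)
\]
\[
\mathrm{Gain}(A) = I\!\left(\tfrac{p}{p+n}, \tfrac{n}{p+n}\right) - \mathrm{Remainder}(A)
\]
```

For a perfect attribute every subset is all positive or all negative, so Remainder(A) = 0 and the gain reaches its maximum. For an attribute whose subsets all keep the parent's proportion of positives and negatives, Remainder(A) equals the parent's information and the gain is 0.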

Induction - third example

Induction - example
Example  Height  Hair   Eyes   Class
E1       tall    dark   blue   1
E2       short   dark   blue   1
E3       tall    blond  blue   2
E4       tall    red    blue   2
E5       tall    blond  brown  1
E6       short   blond  blue   2
E7       short   blond  brown  1
E8       tall    dark   brown  1

Induction - example

Induction - example
Example  Height  Hair   Eyes   Class
E3       tall    blond  blue   2
E5       tall    blond  brown  1
E6       short   blond  blue   2
E7       short   blond  brown  1

hair
  dark:  E1 – class 1, E2 – class 1, E8 – class 1
  red:   E4 – class 2
  blond: E3 – class 2, E5 – class 1, E6 – class 2, E7 – class 1
    Eyes
      blue:  E3 – class 2, E6 – class 2
      brown: E5 – class 1, E7 – class 1
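The choice of hair at the root and eyes under the blond branch can be checked numerically. The sketch below computes information gain for each attribute over the eight examples. The deck does not state which measure was used to build this tree, so information gain is an assumption here, and the function names are illustrative.

```python
from collections import Counter
from math import log2

# (height, hair, eyes, class) for E1..E8, copied from the example table.
EXAMPLES = [("tall", "dark", "blue", 1), ("short", "dark", "blue", 1),
            ("tall", "blond", "blue", 2), ("tall", "red", "blue", 2),
            ("tall", "blond", "brown", 1), ("short", "blond", "blue", 2),
            ("short", "blond", "brown", 1), ("tall", "dark", "brown", 1)]
ATTRS = {"height": 0, "hair": 1, "eyes": 2}     # column index of each attribute

def entropy(rows):
    """Information content (in bits) of the class labels in rows."""
    counts = Counter(r[-1] for r in rows)
    return -sum(c / len(rows) * log2(c / len(rows)) for c in counts.values())

def gain(rows, attr):
    """Reduction in entropy obtained by splitting rows on attr."""
    col = ATTRS[attr]
    remainder = 0.0
    for value in {r[col] for r in rows}:
        subset = [r for r in rows if r[col] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder

gains = {a: gain(EXAMPLES, a) for a in ATTRS}
best = max(gains, key=gains.get)                # attribute with the largest gain

blond = [r for r in EXAMPLES if r[1] == "blond"]
blond_gains = {a: gain(blond, a) for a in ("height", "eyes")}
```

Over all eight examples hair has the largest gain (about 0.45 bits, against about 0.35 for eyes and almost nothing for height). Within the blond subset, eyes separates the two classes completely while height gives no information, which matches the tree above.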

Induction systems
• Determine objective - a search through a decision tree will reach one of a finite set of decisions on the basis of the path taken through the tree.
• Determine decision factors - represent the attribute nodes of the decision tree.
• Determine decision factor values - represent the attribute values of the decision tree.
• Determine solutions - list of final decisions that the system can make; these are the leaf nodes in the tree.
• Form example set.
• Create decision tree.
• Test the system.
• Revise the system.

Induction systems - example
Football game prediction system
Predict the outcome of a football game (will our team win or lose).
Decision factors - location, weather, team record, opponent record.
Decision factor values:
  Location: Home, Away
  Weather: Rain, Cold, Moderate, Hot
  Own Record, Opponent Record: Poor, Average, Good
Solutions - win or lose

Induction systems - example (cont’d) Examples -

Induction systems - example (cont’d)
[Decision tree diagram: nodes Weather (rain, hot, cold, moderate), Location (home, away), Own rec (poor, average, good); leaves Win, Loss, No-data]
Test the system - predict the future games. Get the values for the decision factors for the upcoming game and see which team to bet on.

Induction systems - example (test)

Sensitivity study - Location

Induction systems - pros and cons
Pros:
• Discovers rules from examples - potentially unknown rules could be induced.
• Avoids knowledge elicitation problems - system knowledge can be acquired through past examples.
• Can produce new knowledge.
• Can uncover critical decision factors.
• Can eliminate irrelevant decision factors.
• Can uncover contradictions.
Cons:
• Difficult to choose good decision factors.
• Difficult to understand rules.
• Applicable only for classification problems.

Induction systems - implemented
• AQ11 - diagnosing soybean diseases. Identifies 15 different diseases. The knowledge was derived from 630 examples and used 35 decision rules.
• Willard - forecasting thunderstorms. 140 examples, hierarchy of 30 modules, each with a decision tree.
• Rulemaster - detecting signs of transformer faults.
• Stock market predictions.