Inductive Logic Programming
t.k.prasad@wright.edu
http://www.knoesis.org/tkprasad/
Includes slides by Luis Tari
CS 774 L16 ILP

Logic Programming
• Consider the following example of a logic program:
    parent_of(charles, george).
    parent_of(george, diana).
    parent_of(bob, harry).
    parent_of(harry, elizabeth).
    grandparent_of(X, Y) :- parent_of(X, Z), parent_of(Z, Y).
• From the program, we can ask queries about grandparents.
• Query: grandparent_of(X, Y)?
• Answers:
    • grandparent_of(charles, diana).
    • grandparent_of(bob, elizabeth).
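As a rough illustration of how the grandparent query is answered, the following Python sketch joins the parent_of facts on the shared variable Z. The names `parent_of` and `grandparents` are made up for this sketch; a Prolog system reaches the same answers through resolution rather than an explicit join.

```python
# Facts from the slide, stored as (parent, child) pairs.
parent_of = {
    ("charles", "george"),
    ("george", "diana"),
    ("bob", "harry"),
    ("harry", "elizabeth"),
}


def grandparents(parent_facts):
    """Return all (X, Y) with parent_of(X, Z) and parent_of(Z, Y) for some Z."""
    return {
        (x, y)
        for (x, z1) in parent_facts
        for (z2, y) in parent_facts
        if z1 == z2  # the two body literals must agree on Z
    }


answers = grandparents(parent_of)
print(sorted(answers))
```

The result contains exactly the two answers listed on the slide.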

(Machine) Learning
• The process by which relatively permanent changes occur in behavioral potential as a result of experience. (Anderson)
• Learning is constructing or modifying representations of what is being experienced. (Michalski)
• A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. (Mitchell)

Forms of Reasoning
• Deduction: From causes to effect (Prediction)
    • fact a, rule a => b
      INFER b (*First-order logic*)
• Abduction: From effects to possible causes (Explanation)
    • rule a => b, observe b
      AN EXPLANATION a
• Induction: From correlated observations to rules (Learning)
    • observe correlation between a1, b1, ..., an, bn
      LEARN a -> b

What is ILP?
• Inductive Logic Programming (ILP)
    • Automated learning of logic rules from examples and background knowledge
    • E.g., learn the rule for grandparents, given background knowledge of parents and examples of grandparents
    • ILP can be used for classification and prediction

Why ILP? – multiple relations
Genealogy example:
• Given known relations…
    • father(Old, Young) and mother(Old, Young)
    • male(Somebody) and female(Somebody)
• …learn new relations
    • parent(X, Y) :- father(X, Y).
    • parent(X, Y) :- mother(X, Y).
    • brother(X, Y) :- male(X), father(Z, X), father(Z, Y).
Most ML techniques cannot use more than one relation, e.g., decision trees, neural networks, …

ILP – formal definitions
• Given
    • a logic program B representing background knowledge
    • a set of positive examples E+
    • a set of negative examples E-
• Find hypothesis H such that:
    1. B ∪ H ⊨ e for every e ∈ E+.
    2. B ∪ H ⊭ f for every f ∈ E-.
    3. B ∪ H is consistent.
• Assume that B ⊭ e for some e ∈ E+.
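The first two conditions can be checked mechanically once a candidate H is fixed. The sketch below does this for the grandparent rule, assuming B holds only the four parent_of facts used elsewhere in these slides. The negative examples and all function names are invented for illustration. Condition 3 is not checked, since a set of definite clauses always has a model.

```python
# Background knowledge B: parent_of facts as (parent, child) pairs.
B = {
    ("charles", "george"),
    ("george", "diana"),
    ("bob", "harry"),
    ("harry", "elizabeth"),
}

# Positive examples come from the slides.
E_pos = {("charles", "diana"), ("bob", "elizabeth")}
# ASSUMPTION: these negative examples are invented for this sketch.
E_neg = {("charles", "george"), ("diana", "charles")}


def derive_with_H(parent_facts):
    """Atoms derived by H: grandparent_of(X, Y) :- parent_of(X, Z), parent_of(Z, Y)."""
    return {
        (x, y)
        for (x, z) in parent_facts
        for (w, y) in parent_facts
        if z == w
    }


def covers_all_positives(derived, positives):
    """Condition 1: every positive example is entailed by B and H."""
    return positives <= derived


def covers_no_negatives(derived, negatives):
    """Condition 2: no negative example is entailed by B and H."""
    return not (negatives & derived)


derived = derive_with_H(B)
print(covers_all_positives(derived, E_pos), covers_no_negatives(derived, E_neg))
```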

ILP – logical foundation
• Prolog = Programming with Logic is used to represent:
    • Background knowledge (of the domain): facts
    • Examples (of the relation to be learned): facts
    • Theories (as a result of learning): rules
• Supports two forms of logical reasoning
    • Deduction
    • Induction

Logical reasoning: deduction
From rules to facts…

B:
    mother(penelope, victoria).
    mother(penelope, arthur).
    father(christopher, victoria).
    father(christopher, arthur).
T:
    parent(X, Y) :- father(X, Y).
    parent(X, Y) :- mother(X, Y).

B ∪ T |- E

E:
    parent(penelope, victoria).
    parent(penelope, arthur).
    parent(christopher, victoria).
    parent(christopher, arthur).

Logical reasoning: induction
From facts to rules…

B:
    mother(penelope, victoria).
    mother(penelope, arthur).
    father(christopher, victoria).
    father(christopher, arthur).
E:
    parent(penelope, victoria).
    parent(penelope, arthur).
    parent(christopher, victoria).
    parent(christopher, arthur).

B ∪ E |- T

T:
    parent(X, Y) :- father(X, Y).
    parent(X, Y) :- mother(X, Y).

Example
• Background knowledge B:
    • parent_of(charles, george).
    • parent_of(george, diana).
    • parent_of(bob, harry).
    • parent_of(harry, elizabeth).
• Positive examples E+:
    • grandparent_of(charles, diana).
    • grandparent_of(bob, elizabeth).
• Generate hypothesis H:
    • grandparent_of(X, Y) :- parent_of(X, Z), parent_of(Z, Y).

Example: Same Generation

Why ILP? - Structured data
Seed example of East-West trains (Michalski)
What makes a train go eastward?

Why ILP? – multiple relations
This is related to structured data

has_car
    Train   Car
    t1      c11
    t1      c12
    t1      c13
    t1      c14
    t2      c21
    …       …

car_properties
    Car     Length   Shape       Axes   Roof     …
    c11     short    rectangle   2      none     …
    c12     long     rectangle   3      none     …
    c13     short    rectangle   2      peaked   …
    c14     long     rectangle   2      none     …
    c21     short    rectangle   2      flat     …
    …       …        …           …      …        …

Induction of a classifier: example
Example of East-West trains
• B: relations has_car and car_properties (length, roof, shape, etc.)
    • e.g., has_car(t1, c11)
• E: the trains t1 to t10
• C: east, west
• Possible T:
    east(T) :- has_car(T, C), length(C, short), roof(C, _).
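To see what the candidate rule does on concrete data, the sketch below applies it to the five cars listed in the has_car and car_properties tables. It rests on one interpretation that the slides do not state: roof(C, _) is taken to succeed only when the car has some roof, so a table value of "none" makes the literal fail. The function name `covered_by_rule` is made up for this sketch.

```python
# Rows transcribed from the has_car and car_properties tables.
has_car = [
    ("t1", "c11"), ("t1", "c12"), ("t1", "c13"), ("t1", "c14"),
    ("t2", "c21"),
]
car_properties = {
    # car: (length, shape, axes, roof)
    "c11": ("short", "rectangle", 2, "none"),
    "c12": ("long", "rectangle", 3, "none"),
    "c13": ("short", "rectangle", 2, "peaked"),
    "c14": ("long", "rectangle", 2, "none"),
    "c21": ("short", "rectangle", 2, "flat"),
}


def covered_by_rule(train):
    """east(T) :- has_car(T, C), length(C, short), roof(C, _)."""
    for t, car in has_car:
        if t != train:
            continue
        length, _shape, _axes, roof = car_properties[car]
        # ASSUMPTION: roof(C, _) holds only if the car has a roof at all,
        # i.e. the table value is not "none".
        if length == "short" and roof != "none":
            return True
    return False


trains = sorted({t for t, _ in has_car})
print({t: covered_by_rule(t) for t in trains})
```

Under that reading, t1 is covered through car c13 and t2 through car c21. Whether the rule is a good classifier depends on the east/west labels of all ten trains, which the table excerpt does not include.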

ILP systems
• Two of the most popular ILP systems:
    • Progol
    • FOIL
• Progol [Muggleton 95]
    • Developed by S. Muggleton et al.
    • Learns first-order Horn clauses (no negation in head and body literals of hypotheses)
• FOIL [Quinlan 93]
    • Developed by J. Quinlan et al.
    • Learns first-order rules (no negation in head literals of the hypotheses)

Rule Learning (Intuition)
• How to come up with a rule for grandparent_of(X, Y)?
1. Take the example grandparent_of(bob, elizabeth).
2. Find the subset of background knowledge relevant to this example: parent_of(bob, harry), parent_of(harry, elizabeth).
3. Form a rule from these facts:
    grandparent_of(bob, elizabeth) :- parent_of(bob, harry), parent_of(harry, elizabeth).
4. Generalize the rule:
    grandparent_of(X, Y) :- parent_of(X, Z), parent_of(Z, Y).
5. Check if this rule is valid w.r.t. the positive and the negative examples.
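Steps 3 and 4 can be sketched in Python by replacing every distinct constant in the ground rule with its own variable. This naive "variabilize everything" strategy is only an illustration of the intuition; actual ILP systems generalize more selectively. The literal encoding as (predicate, arguments) pairs and the names `generalize` and `show` are assumptions of this sketch.

```python
def generalize(head, body, var_names=("X", "Y", "Z", "U", "V", "W")):
    """Replace each distinct constant by a variable, consistently across literals."""
    mapping = {}  # constant -> variable name, in order of first appearance

    def lift(literal):
        pred, args = literal
        new_args = []
        for arg in args:
            if arg not in mapping:
                mapping[arg] = var_names[len(mapping)]
            new_args.append(mapping[arg])
        return (pred, tuple(new_args))

    return lift(head), [lift(lit) for lit in body]


def show(head, body):
    """Render a clause in Prolog-like syntax."""
    fmt = lambda lit: f"{lit[0]}({', '.join(lit[1])})"
    return f"{fmt(head)} :- {', '.join(fmt(lit) for lit in body)}."


# Step 3: the ground rule built from the example and the relevant facts.
ground_head = ("grandparent_of", ("bob", "elizabeth"))
ground_body = [
    ("parent_of", ("bob", "harry")),
    ("parent_of", ("harry", "elizabeth")),
]

# Step 4: generalize it.
print(show(*generalize(ground_head, ground_body)))
```

The printed clause matches the generalized rule in step 4, because bob, elizabeth and harry are met in that order and mapped to X, Y and Z.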

Top-down induction of logic programs
• Employs refinement operators
• Typical refinement operators on a clause:
    • Apply a substitution to the clause
    • Add a literal to the body of the clause
• Refinement graph:
    • Nodes correspond to clauses
    • Arcs correspond to refinements

Part of refinement graph
    has_a_daughter(X) :- male(Y).
    has_a_daughter(X) :- female(Y).
    has_a_daughter(X) :- parent(Y, Z).
    has_a_daughter(X) :- male(X).
    has_a_daughter(X) :- female(Y), parent(S, T).
    has_a_daughter(X) :- parent(X, Z), female(U).
    has_a_daughter(X) :- parent(X, Z), female(Z).
    ...
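The two refinement operators (apply a substitution, add a body literal) are simple enough to sketch directly. The Python below walks one possible path through the has_a_daughter graph, from the empty-bodied clause down to the clause with parent(X, Z) and female(Z). The clause encoding and the function names are assumptions of this sketch, and the path is only one of many through the graph.

```python
def add_literal(clause, literal):
    """Refinement operator: add a literal to the body of the clause."""
    head, body = clause
    return (head, body + [literal])


def substitute(clause, theta):
    """Refinement operator: apply substitution theta (variable -> term)."""
    def apply(lit):
        pred, args = lit
        return (pred, tuple(theta.get(a, a) for a in args))

    head, body = clause
    return (apply(head), [apply(lit) for lit in body])


def show(clause):
    """Render a clause in Prolog-like syntax."""
    head, body = clause
    fmt = lambda lit: f"{lit[0]}({', '.join(lit[1])})"
    if not body:
        return f"{fmt(head)}."
    return f"{fmt(head)} :- {', '.join(fmt(lit) for lit in body)}."


top = (("has_a_daughter", ("X",)), [])          # most general clause
c1 = add_literal(top, ("parent", ("Y", "Z")))   # add a literal
c2 = substitute(c1, {"Y": "X"})                 # unify Y with X
c3 = add_literal(c2, ("female", ("U",)))        # add a literal
c4 = substitute(c3, {"U": "Z"})                 # unify U with Z

for clause in (top, c1, c2, c3, c4):
    print(show(clause))
```

Each call produces a new clause and leaves its input untouched, so every node of the graph stays available while a search explores its successors.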

Progol Algorithm Outline
1. From a subset of positive examples, construct the most specific rule rs.
2. Based on rs, find a generalized form rg of rs so that score(rg) has the highest value among all candidates.
3. Remove all positive examples that are covered by rg.
4. Go to step 1 if there are still positive examples that are not yet covered.
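The outer loop of this outline is a covering loop, sketched below in Python. It is not Progol itself: steps 1 and 2 (building rs and searching its generalizations) are replaced by picking the best rule from a fixed, hand-written candidate pool whose coverage is given as data. The score used is positives covered minus (negatives covered plus body literals). The negative examples and all names are invented for illustration.

```python
def score(rule, positives, negatives):
    """Positives covered minus (negatives covered + number of body literals)."""
    p = len(rule["covers"] & positives)
    n = len(rule["covers"] & negatives)
    return p - (n + rule["body_literals"])


def covering(candidates, positives, negatives):
    """Repeatedly pick the best-scoring rule and drop the positives it covers."""
    remaining = set(positives)
    theory = []
    while remaining:
        # Stand-in for steps 1-2: choose from a fixed candidate pool.
        best = max(candidates, key=lambda r: score(r, remaining, negatives))
        newly_covered = best["covers"] & remaining
        if not newly_covered:
            break  # no candidate explains what is left
        theory.append(best["name"])
        remaining -= newly_covered  # step 3
    return theory, remaining        # the loop condition is step 4


positives = {
    ("penelope", "victoria"), ("penelope", "arthur"),
    ("christopher", "victoria"), ("christopher", "arthur"),
}
# ASSUMPTION: negative examples invented for this sketch.
negatives = {("victoria", "penelope"), ("arthur", "christopher")}

candidates = [
    {"name": "parent(X, Y) :- father(X, Y).",
     "covers": {("christopher", "victoria"), ("christopher", "arthur")},
     "body_literals": 1},
    {"name": "parent(X, Y) :- mother(X, Y).",
     "covers": {("penelope", "victoria"), ("penelope", "arthur")},
     "body_literals": 1},
]

theory, uncovered = covering(candidates, positives, negatives)
print(theory, uncovered)
```

Two passes are needed here: the father rule covers two positives, and the mother rule then covers the remaining two.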

Scoring hypotheses
• score(r) is a measure of how well a rule r explains all the examples, with preference given to shorter rules.
    • pr = number of positive examples correctly deducible from r
    • nr = number of negative examples incorrectly deducible from r
    • cr = number of body literals in rule r
    • score(r) = pr – (nr + cr)
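A small worked example may make the trade-off concrete. The Python sketch below scores two candidate rules for grandparent_of against the parent_of facts from the earlier slides. The negative examples are invented for illustration, and each rule is hand-coded as a Python predicate rather than derived by an ILP system.

```python
parent_of = {
    ("charles", "george"),
    ("george", "diana"),
    ("bob", "harry"),
    ("harry", "elizabeth"),
}
E_pos = {("charles", "diana"), ("bob", "elizabeth")}
# ASSUMPTION: negative examples invented for this sketch.
E_neg = {("charles", "george"), ("george", "harry"), ("bob", "harry")}


def full_rule(x, y):
    """grandparent_of(X, Y) :- parent_of(X, Z), parent_of(Z, Y)."""
    return any((x, z) in parent_of and (z, y) in parent_of
               for (_, z) in parent_of)


def short_rule(x, y):
    """grandparent_of(X, Y) :- parent_of(X, Z)."""
    return any(p == x for (p, _) in parent_of)


def score(rule, body_literals):
    """score(r) = pr - (nr + cr)."""
    p_r = sum(rule(x, y) for (x, y) in E_pos)
    n_r = sum(rule(x, y) for (x, y) in E_neg)
    return p_r - (n_r + body_literals)


print(score(full_rule, 2), score(short_rule, 1))
```

The two-literal rule scores 2 - (0 + 2) = 0, while the shorter rule scores 2 - (3 + 1) = -2: it saves one literal but covers all three negatives, so the length bonus does not compensate.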

Applications of ILP
• Constructing Biological Knowledge Bases by Extracting Information from Text Sources (M. Craven & J. Kumlien) [Craven 99]
• The automatic discovery of structural principles describing protein fold space (A. Cootes, S. H. Muggleton, and M. J. E. Sternberg) [Cootes 03]
• More from UT-ML group (Ray Mooney)
    • http://www.cs.utexas.edu/~ml/publication/ilp.html

Example of relation from text
• Sample sentence from biomedical articles:
• We want to extract the following relation:

References
• [Quinlan 93] J. R. Quinlan, R. M. Cameron-Jones. FOIL: A Midterm Report. Proceedings of Machine Learning: ECML-93.
• [Muggleton 95] S. Muggleton. Inverse Entailment and Progol. New Generation Computing Journal, 13:245-286, 1995.
• [Craven 99] M. Craven & J. Kumlien (1999). Constructing Biological Knowledge Bases by Extracting Information from Text Sources. ISMB 99.
• [Cootes 03] A. Cootes, S. H. Muggleton, and M. J. E. Sternberg. The automatic discovery of structural principles describing protein fold space. Journal of Molecular Biology, 330(4):839-850, 2003.