Uncertainty in Artificial Intelligence Research at USC: Research Presentation for Graduate Students
September 10, 2004
Marco Valtorta
SWRG 3A55
mgv@cse.sc.edu
UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering

Uncertainty in Artificial Intelligence
• Artificial Intelligence (AI)
  – [Robotics]
  – Automated Reasoning
    • [Theorem Proving, Search, etc.]
    • Reasoning Under Uncertainty
      – [Fuzzy Logic, Possibility Theory, etc.]
      – Normative Systems
        » Bayesian Networks
        » Influence Diagrams (Decision Networks)

Research Interests
• Algorithms for probability update in BNs
  – factor tree method, with Mark Bloemeke
• Modeling of uncertain evidence
  – observation variables, with Young-Gyun Kim and Jirka Vomlel
• Soft evidential update in BNs
  – and the big clique algorithm, with Young-Gyun Kim and Jirka Vomlel
• Causal Bayesian networks
• Learning
  – CB algorithm, with Moninder Singh and Bing Xia
  – the effect of data quality on learning, with Valerie Sessions

Algorithms and Modeling
• Algorithms for probability update in BNs
  – factor tree method, with Mark Bloemeke
• Modeling of uncertain evidence with observation variables, with Young-Gyun Kim and Jirka Vomlel
• Soft evidential update in BNs and the big clique algorithm, with Young-Gyun Kim and Jirka Vomlel
• Causal Bayesian networks, with Yimin Huang

Correlation vs. Causation
• The genotype theory (Fisher, 1958) of smoking and lung cancer: smoking and lung cancer are both effects of a genetic predisposition
• Three-node network [Figure: nodes U, X, and Y, with the genotype U as a common cause of X and Y]
• X (smoking) and Y (lung cancer) are in lockstep
• X precedes Y in time (smoking before cancer)
• But X does not cause Y: if we set X, Y does not change; Y changes only according to the value of U (the genotype)
Source: Causality: Models, Reasoning and Inference, Chapter 3
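The difference between observing X and setting X in the three-node network can be checked numerically. The sketch below is only an illustration: all probabilities are hypothetical numbers chosen for the example (none come from the slides), and the function names are invented here. It assumes the structure described above, where U is the only parent of both X and Y.

```python
from itertools import product

# Hypothetical parameters, chosen only for illustration (not from the slides).
P_U = {0: 0.7, 1: 0.3}            # P(U=u): genotype absent / present
P_X1_GIVEN_U = {0: 0.2, 1: 0.9}   # P(X=1 | U=u): smoking
P_Y1_GIVEN_U = {0: 0.05, 1: 0.6}  # P(Y=1 | U=u): lung cancer; X is not a parent of Y


def joint(u, x, y):
    """Joint probability P(U=u, X=x, Y=y) under the genotype model."""
    px = P_X1_GIVEN_U[u] if x == 1 else 1 - P_X1_GIVEN_U[u]
    py = P_Y1_GIVEN_U[u] if y == 1 else 1 - P_Y1_GIVEN_U[u]
    return P_U[u] * px * py


def p_y1_given_x(x):
    """Observational P(Y=1 | X=x): condition on X and sum U out."""
    num = sum(joint(u, x, 1) for u in (0, 1))
    den = sum(joint(u, x, y) for u, y in product((0, 1), repeat=2))
    return num / den


def p_y1_do_x(x):
    """Interventional P(Y=1 | do(X=x)): setting X cuts the U -> X edge,
    so U keeps its prior and Y still depends only on U."""
    return sum(P_U[u] * P_Y1_GIVEN_U[u] for u in (0, 1))


if __name__ == "__main__":
    for x in (0, 1):
        print(f"x={x}: P(Y=1|X=x)={p_y1_given_x(x):.3f}  "
              f"P(Y=1|do(X=x))={p_y1_do_x(x):.3f}")
```

With these made-up numbers, the observational probability of Y rises sharply with X, while the interventional probability is the same for both values of X, which is the sense in which X and Y are correlated without X causing Y.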

An Example [Cochran through Pearl, 2000]
Soil fumigants (X) are used to increase oat crop yields (Y) by controlling the eelworm population (Z). Last year's eelworm population (Z0) is an unknown quantity that is strongly correlated with this year's population. Through laboratory analysis of soil samples, we can determine the eelworm populations before and after the treatment (Z1 and Z2). Furthermore, we assume that the fumigants do not affect the growth of eelworms surviving the treatment. Instead, the eelworms' growth depends on the population of birds (B), which is correlated with last year's eelworm population and hence with the treatment itself. Z3 represents the eelworm population at the end of the season.
We wish to assess the total effect of the fumigants on yields, but controlled randomized experiments are infeasible and Z0 is unknown. Given a correct model, can we obtain a consistent estimate of the target quantity (the total effect of the fumigants on yields) from observations alone?

Nonidentifiability
• The identifiability of the effect of X on Y ensures that it is possible to infer the effect of the action do(X=x) on Y from passive observations and the causal graph G, which specifies which variables participate in the determination of each variable in the domain
• To prove nonidentifiability, it is sufficient to present two sets of structural equations that induce identical distributions over the observed variables but have different causal effects
• X and Y are observable, U is not. All of them are binary variables
• Let P(X=0|U) = (0.5, 0.5)
• P(Y=0|X, U) is given by the table:

          X=0    X=1
  U=0     0.1    0.2
  U=1     0.8    0.7

• We cannot observe U, so we do not know P(U)
• When P(U=0) = 0.5, P(Y|X=0) = (0.45, 0.55)
• When P(U=0) = 0.1, P(Y|X=0) = (0.73, 0.27)
• So, P(Y|do(X)) is non-identifiable
[Figure: three-node network over U, X, and Y]
Source: Causality: Models, Reasoning and Inference, Chapter 3
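The two values of P(Y|X=0) quoted on this slide can be reproduced with a few lines of arithmetic. The sketch below only checks those numbers; it takes P(Y=0|X=0, U) = 0.1 for U=0 and 0.8 for U=1, which is the reading that matches the quoted results, and it assumes the remaining two entries (0.2 and 0.7) belong to X=1 in that order. The function name is invented for the example.

```python
# P(Y=0 | X=x, U=u) keyed by (x, u). The X=0 entries are confirmed by the
# slide's quoted results; the X=1 ordering is an assumption.
P_Y0_GIVEN_XU = {(0, 0): 0.1, (0, 1): 0.8, (1, 0): 0.2, (1, 1): 0.7}


def p_y0_given_x(x, p_u0):
    """P(Y=0 | X=x) for a given prior P(U=0) = p_u0.

    Because P(X=0 | U) = 0.5 for both values of U, X carries no information
    about U, so P(U | X) = P(U) and U is summed out with its prior weights.
    """
    return p_u0 * P_Y0_GIVEN_XU[(x, 0)] + (1.0 - p_u0) * P_Y0_GIVEN_XU[(x, 1)]


if __name__ == "__main__":
    for p_u0 in (0.5, 0.1):
        p0 = p_y0_given_x(0, p_u0)
        print(f"P(U=0)={p_u0}: P(Y|X=0) = ({p0:.2f}, {1 - p0:.2f})")
```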

Smoking and the genotype theory
• Consider the relation between smoking (X) and lung cancer (Y)
• The tobacco industry has managed to forestall antismoking legislation by arguing that the observed correlation between smoking and lung cancer could be explained by some sort of carcinogenic genotype (U) that involves an inborn craving for nicotine
• Suppose that Z is the amount of tar deposited in a person's lungs and we believe in the causal model shown on the right [figure not reproduced]
• Can we now recover the effect of X on Y, P(y|do(x)), from observational data only?
Source: Causality: Models, Reasoning and Inference, Chapter 3
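One way to explore this question numerically is Pearl's front-door adjustment, P(y|do(x)) = Σ_z P(z|x) Σ_x' P(y|x',z) P(x'). The sketch below rests on assumptions: the slide's figure is not reproduced here, so the graph is taken to be U → X, U → Y, X → Z → Y with U unobserved, and every probability is a hypothetical number chosen for illustration. The code builds the observed distribution over (X, Z, Y), applies the front-door formula to it, and compares the result with the ground truth computed from the full model including U.

```python
from itertools import product

# Hypothetical parameters for the assumed graph U -> X, U -> Y, X -> Z -> Y.
P_U1 = 0.3                                  # P(U=1): genotype present
P_X1_GIVEN_U = {0: 0.2, 1: 0.8}             # P(X=1 | U): smoking
P_Z1_GIVEN_X = {0: 0.1, 1: 0.9}             # P(Z=1 | X): tar deposits
P_Y1_GIVEN_ZU = {(0, 0): 0.1, (0, 1): 0.4,  # P(Y=1 | Z, U): lung cancer
                 (1, 0): 0.5, (1, 1): 0.9}


def bern(p1, v):
    """Probability that a binary variable with P(1) = p1 takes value v."""
    return p1 if v == 1 else 1.0 - p1


def joint(u, x, z, y):
    """Full joint P(u, x, z, y), including the unobserved U."""
    return (bern(P_U1, u) * bern(P_X1_GIVEN_U[u], x)
            * bern(P_Z1_GIVEN_X[x], z) * bern(P_Y1_GIVEN_ZU[(z, u)], y))


# What an observer sees: P(x, z, y) with U summed out.
OBS = {(x, z, y): sum(joint(u, x, z, y) for u in (0, 1))
       for x, z, y in product((0, 1), repeat=3)}
NAMES = ("x", "z", "y")


def p(**fixed):
    """Marginal probability of a partial assignment, e.g. p(x=1, z=0)."""
    return sum(pr for key, pr in OBS.items()
               if all(key[NAMES.index(n)] == v for n, v in fixed.items()))


def front_door(y, x):
    """Front-door estimate of P(Y=y | do(X=x)) from observed quantities only."""
    total = 0.0
    for z in (0, 1):
        p_z_given_x = p(x=x, z=z) / p(x=x)
        inner = sum(p(x=xp, z=z, y=y) / p(x=xp, z=z) * p(x=xp) for xp in (0, 1))
        total += p_z_given_x * inner
    return total


def truth(y, x):
    """Ground-truth P(Y=y | do(X=x)) using the hidden U (unavailable in practice)."""
    return sum(bern(P_U1, u) * bern(P_Z1_GIVEN_X[x], z) * bern(P_Y1_GIVEN_ZU[(z, u)], y)
               for u in (0, 1) for z in (0, 1))


if __name__ == "__main__":
    naive = p(x=1, y=1) / p(x=1)
    print(f"naive P(Y=1|X=1)       = {naive:.4f}")
    print(f"front-door P(Y=1|do(1)) = {front_door(1, 1):.4f}")
    print(f"true P(Y=1|do(1))       = {truth(1, 1):.4f}")
```

Under these assumptions the front-door estimate matches the ground truth, while the plain conditional P(Y=1|X=1) overstates the effect because it absorbs the confounding through U.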

Learning
• Parallel learning with background knowledge, with Bhaskara Moole
• CB algorithm, with Moninder Singh and Bing Xia
• Effect of data quality on learning, with Valerie Sessions

An Example of Learning: Chernobyl

A Bayesian Network Model

Simulation

Simulation File Conversion

Sample(s) Key
Yes: 1, Read: 1, Received: 1, Heard: 1, Received: 1, No: 2

Visual CB
• CB [Singh and Valtorta, 1993; 1995]
  – in Visual C++
Bing Xia, MS, 2002

Learning

Result on Chernobyl Example

Results II

Results III

Results IV

Applications
• Assessment of the risk of mental retardation in infants, with Subramani Mani and Suzanne McDermott
• Agent-based intrusion detection with soft evidence, with Vaibhav Gowadia and Csilla Farkas
• Support for intelligence analysis, with Michael Huhns, Hrishi Goradia, Jiangbo Dang, and Jingshan Huang
• Modeling damage in critical resources, with Yimin Huang and Bill Full

MENTOR

The OmniSeer Project
• Represent prior knowledge to support intelligence analysis
• Explicate formerly tacit knowledge for use and collaboration
• Support relevance analysis, evidence gathering, and novelty detection
• …with Bayesian networks!

OmniSeer Functional Architecture
[Figure: architecture diagram. Labels: Analyst, Massive Data, Events, Messages, Tasks, Documents, Bayesian networks, Tagged messages, Modified Text, Evidence, Tacit Knowledge, BN Fragments, Instantiated Fragments, Outdated fragments, Matcher, Forgetter, Composer, Situation Specific Scenarios, Bayesian Reasoning Service, Value of Information, Sensitivity Analyzer, Surprise Detector, Alerts, Explanation, Visualization, Analysis]
Example tagged messages: <Date>2002-09-20</Date> <Person>John Doe</Person> <Place>London</Place> … <Date>2002-09-27</Date> <Person>John Doe</Person>
Callouts:
• The massive data might be filtered by preferences and interests specified in the UConn User Model
• The noun-phrase analyzer from UConn processes messages; a 3rd-party tagger processes news feeds
• BN fragments represent an analyst's prior knowledge about terrorist activities or other domains of interest specified in the UConn user model
• Relevant facts extracted from the documents and messages fill in the details of the BN fragments of interest
• Outdated fragments are removed periodically from the set of partially instantiated fragments
• Instantiated BN fragments are composed into scenarios specific to the situation at hand
• The analyst explores which information should be acquired to reduce uncertainty and assesses the robustness of conclusions
• The analyst is notified of surprises and interesting situations, as specified in the UConn User Model
• Differences between an analyst's conclusion and the situation-specific scenario lead to explication of formerly tacit knowledge, represented as new BN fragments

Competence and Resources
• Several faculty members in the CSE department have worked in normative probabilistic reasoning for many years
• Some colleagues and students in the Statistics department are also interested
• Tools for editing BNs and IDs, propagation, interface with relational databases, soft evidential update, learning, etc., have been acquired or developed and used in projects and courses (CSCE 582 and CSCE 822)

Some Local UAI Researchers (Notably Missing: Juan Vargas)
• Billy Turkett, Ph.D. (Wake Forest)
• Young-Gyun Kim, Ph.D. (S.C. State)
• Wayne Smith, Ph.D. (Presbyterian College)
• Miguel Barrientos, Ph.D.
• Clif Presser, Ph.D. (Gettysburg College)

Judea Pearl and Finn V. Jensen

Additional Information
• Bayesian networks journal club
  – meets every two weeks on Wednesdays: next meeting on September 15 at 1 pm in 3A75
  – http://www.cse.sc.edu/~mgv/BNSeminar/index.html
• 3A55, 777-4641
• mgv@cse.sc.edu
• www.cse.sc.edu/~mgv