Graphical Causal Models: Determining Causes from Observations
William Marsh
Risk Assessment and Decision Analysis (RADAR), Computer Science
RADAR Group, Computer Science
- Risk Assessment and Decision Analysis
- Research areas
  - Software engineering, safety, finance, legal
  - A new initiative in medical data analysis: DIADEM
- Norman Fenton (group leader), Martin Neil
- http://www.dcs.qmul.ac.uk/researchgp/radar/
Outline
- Graphical Causal Models
  - Bayesian networks: prediction or diagnosis
  - Causal induction: learning causes from data
  - Causal effect estimation: strength of causal relationships from data
- DIADEM project
Bayesian Nets
Detecting Asthma Exacerbations
- Aim to assist early detection of asthma episodes in Paediatric A&E
  - Using only data already available electronically
- Network created by
  - Experts
  - Data
Bayes’ Theorem
Labels on the terms of the equation:
- Joint probability
- Revised belief about A, given evidence B
- Prior probability of A
- Factor to update belief about A, given evidence B
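The equation itself did not survive in the slide text. A standard statement of Bayes' theorem that matches the four labels is sketched below; the mapping of labels to terms is an assumption based on the usual presentation, not taken from the original slide.

```latex
% Joint probability, written two ways
P(A, B) = P(A \mid B)\, P(B) = P(B \mid A)\, P(A)

% Revised belief = update factor x prior
\underbrace{P(A \mid B)}_{\text{revised belief about } A}
  = \underbrace{\frac{P(B \mid A)}{P(B)}}_{\text{factor to update belief}}
    \times \underbrace{P(A)}_{\text{prior probability of } A}
```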
Bayes’ Theorem (Made Easy)
[Diagram: Infection (yes, no) -> Test (pos, neg)]
- Infection rate: P(I) = 1%
- False positive rate: P(T=pos | I=no) = 5%
- Negligible false negative rate
- A person has a positive test result
- How likely is it they are infected? 17%
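The 17% figure can be checked directly from the numbers on the slide. The sketch below is a minimal calculation, assuming "negligible false negative" means a false negative rate of exactly 0; the function name `posterior_infected` is made up for this example.

```python
def posterior_infected(prior, false_positive_rate, false_negative_rate=0.0):
    """Return P(I=yes | T=pos) using Bayes' theorem."""
    # P(T=pos | I=yes): assumed to be 1 when false negatives are negligible
    sensitivity = 1.0 - false_negative_rate
    # P(T=pos), summed over infected and not-infected people
    p_positive = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / p_positive


# Numbers from the slide: P(I) = 1%, P(T=pos | I=no) = 5%
posterior = posterior_infected(prior=0.01, false_positive_rate=0.05)
print(f"P(infected | positive test) = {posterior:.1%}")
```

The result is 0.01 / 0.0595, roughly 16.8%, which rounds to the 17% quoted: most positive results come from the large uninfected group.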
Medical Uses of BNs
- Diagnosis
  - Differential diagnosis from symptoms
- Prediction
  - Likely outcome
- Building a BN
  - From expert knowledge: expert system
  - From data: data mining
Beyond Bayesian Networks
Cause versus Association
[Diagram: Infection -> Fever, or Fever -> Infection?]
- Joint probability same
- Both represent the fever-infection association
- ‘Causal model’ has arrow from cause to effect
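The claim that both arrow directions give the same joint probability can be illustrated numerically. The sketch below uses hypothetical probabilities (the slide gives none): it builds the joint from Infection -> Fever, reads off the parameters of the reversed model Fever -> Infection, and confirms that the two joints agree.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration (not from the slides).
P_INFECTION = 0.01                            # P(I=yes)
P_FEVER_GIVEN_I = {True: 0.90, False: 0.10}   # P(F=yes | I)


def bern(p_yes, value):
    """Probability of a yes/no value when P(yes) = p_yes."""
    return p_yes if value else 1.0 - p_yes


states = list(product((True, False), repeat=2))  # (infection, fever)

# Model 1: Infection -> Fever, joint = P(I) * P(F | I)
joint_i_to_f = {
    (i, f): bern(P_INFECTION, i) * bern(P_FEVER_GIVEN_I[i], f)
    for i, f in states
}

# Model 2: Fever -> Infection, with parameters derived from the same joint
p_fever = sum(p for (i, f), p in joint_i_to_f.items() if f)      # P(F=yes)
p_infection_given_f = {
    f: joint_i_to_f[(True, f)] / bern(p_fever, f)                 # P(I=yes | F)
    for f in (True, False)
}
joint_f_to_i = {
    (i, f): bern(p_fever, f) * bern(p_infection_given_f[f], i)
    for i, f in states
}

# Largest disagreement between the two joints (zero up to rounding)
max_gap = max(abs(joint_i_to_f[s] - joint_f_to_i[s]) for s in states)
print(f"largest difference between the two joints: {max_gap:.2e}")
```

Observational data on two variables alone therefore cannot say which way the arrow points; the direction is extra, causal information.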
Causal Induction
- Discover causal relationships from data
- Sometimes distinguishable: different conditional independence
[Diagrams: alternative graphs over A, B, C …]
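To illustrate how different graphs can leave different conditional-independence footprints, the sketch below compares two hypothetical three-variable structures: a chain A -> B -> C and a collider A -> B <- C. The graphs and all probabilities are assumptions for illustration; the original diagrams are not recoverable from the slide text.

```python
from itertools import product


def bern(p_one, x):
    """Probability of binary value x when P(x=1) = p_one."""
    return p_one if x == 1 else 1.0 - p_one


def chain_joint():
    """Joint for A -> B -> C (hypothetical parameters)."""
    p_a, p_b_given_a, p_c_given_b = 0.3, {0: 0.2, 1: 0.8}, {0: 0.1, 1: 0.7}
    return {(a, b, c): bern(p_a, a) * bern(p_b_given_a[a], b) * bern(p_c_given_b[b], c)
            for a, b, c in product((0, 1), repeat=3)}


def collider_joint():
    """Joint for A -> B <- C (hypothetical parameters)."""
    p_a, p_c = 0.3, 0.6
    p_b_given_ac = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.9}
    return {(a, b, c): bern(p_a, a) * bern(p_c, c) * bern(p_b_given_ac[(a, c)], b)
            for a, b, c in product((0, 1), repeat=3)}


def prob(joint, a=None, b=None, c=None):
    """Marginal probability of the fixed values; None means 'sum over'."""
    return sum(p for (a_, b_, c_), p in joint.items()
               if (a is None or a == a_)
               and (b is None or b == b_)
               and (c is None or c == c_))


def a_indep_c(joint, tol=1e-9):
    """Check P(a, c) = P(a) P(c) for all values."""
    return all(abs(prob(joint, a=a, c=c) - prob(joint, a=a) * prob(joint, c=c)) < tol
               for a, c in product((0, 1), repeat=2))


def a_indep_c_given_b(joint, tol=1e-9):
    """Check P(a, b, c) P(b) = P(a, b) P(b, c) for all values."""
    return all(abs(prob(joint, a=a, b=b, c=c) * prob(joint, b=b)
                   - prob(joint, a=a, b=b) * prob(joint, b=b, c=c)) < tol
               for a, b, c in product((0, 1), repeat=3))


for name, joint in (("chain", chain_joint()), ("collider", collider_joint())):
    print(name, "| A indep C:", a_indep_c(joint),
          "| A indep C given B:", a_indep_c_given_b(joint))
```

In the chain, A and C are dependent but become independent given B; in the collider the pattern is reversed. Tests of this kind are what allow some causal structures to be told apart from observational data.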
Causal Induction: Application
- Discover causal relationships from data
  - Need lots of data
- Applied to gene regulatory networks
  - Data from micro-array experiments
  - Recent explanation of limitations
Estimating Causal Effects
- Suppose A is a cause of B
[Diagram: A -> B]
- What is the causal effect?
- Is it p(B | A)?
Benefits of Sports?
[Diagram: nodes intelligence, sport, exam result]
- Is there a relationship between sport and exam success?
  - Data available
  - ‘Intelligence’ correlate
  - Is this the correct test? P(exam=pass | sport) > P(exam=pass | no-sport)
Benefits of Sports?
[Diagram: nodes intelligence, sport (observed), exam result]
p(pass | sport) > p(pass | no-sport): 73% versus 67%
- When we condition on ‘sport’
  - Probability for ‘exam result’
  - Probability for ‘intelligence’ changes
- What if I decide to start sport?
Intervention v Observation
[Diagram: nodes intelligence, sport (changed by intervention), exam result]
- Causal effect differs from conditional probability
  P(pass | do(sport)) < P(pass | do(no sport))
- Mostly interested in consequence of change
  - Causal effects can be measured by a Randomised Controlled Trial
  - Causal effect of sport on exam results not identifiable
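The gap between observing and intervening can be shown with a small simulation-free calculation. The sketch below assumes a graph in which intelligence influences both sport and exam result, and sport has a small direct negative effect on the exam; every number is hypothetical. Note that it uses the full model, including intelligence, which is exactly what an analyst with only sport and exam data does not have.

```python
# Hypothetical model: intelligence -> sport, intelligence -> exam, sport -> exam.
P_I = {"high": 0.5, "low": 0.5}                      # P(intelligence)
P_SPORT_GIVEN_I = {"high": 0.8, "low": 0.2}          # P(sport=yes | intelligence)
P_PASS_GIVEN = {                                     # P(pass | intelligence, sport)
    ("high", True): 0.85, ("high", False): 0.90,
    ("low", True): 0.45, ("low", False): 0.50,
}


def p_sport(intelligence, sport):
    """P(sport | intelligence) for sport = True or False."""
    p_yes = P_SPORT_GIVEN_I[intelligence]
    return p_yes if sport else 1.0 - p_yes


def observe(sport):
    """P(pass | sport): conditioning also shifts belief about intelligence."""
    numerator = sum(P_I[i] * p_sport(i, sport) * P_PASS_GIVEN[(i, sport)] for i in P_I)
    denominator = sum(P_I[i] * p_sport(i, sport) for i in P_I)
    return numerator / denominator


def intervene(sport):
    """P(pass | do(sport)): sport is set by fiat, intelligence keeps its prior."""
    return sum(P_I[i] * P_PASS_GIVEN[(i, sport)] for i in P_I)


print(f"observed:   {observe(True):.2f} (sport) vs {observe(False):.2f} (no sport)")
print(f"intervened: {intervene(True):.2f} (sport) vs {intervene(False):.2f} (no sport)")
```

With these made-up numbers the observed pass rate is higher among those who do sport, yet making someone take up sport lowers their chance of passing, because the observed association is driven by intelligence.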
Benefit of Sport
[Diagram: nodes intelligence, sport (S), attendance (A), exam result (E)]
- New observable variable ‘attendance at lectures’
- Causal effect of sport on exam results now identifiable
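One way such an effect becomes identifiable is Pearl's front-door adjustment, P(e | do(s)) = sum over a of P(a | s) times the sum over s' of P(e | a, s') P(s'). The sketch below assumes the slide's graph has that shape: unobserved intelligence affects sport and exam result, sport affects the exam only through attendance, and intelligence does not affect attendance directly. The slide text does not name the method, and all numbers are hypothetical.

```python
from itertools import product

# Hypothetical parameters. U = intelligence (unobserved), S = sport,
# A = attendance, E = exam pass. Assumed graph: U -> S, U -> E, S -> A -> E.
P_U = 0.5                                   # P(U=1)
P_S_GIVEN_U = {1: 0.8, 0: 0.2}              # P(S=1 | U)
P_A_GIVEN_S = {1: 0.6, 0: 0.9}              # P(A=1 | S)
P_E_GIVEN_AU = {(1, 1): 0.95, (1, 0): 0.60, (0, 1): 0.70, (0, 0): 0.30}  # P(E=1 | A, U)


def bern(p_one, x):
    """Probability of binary value x when P(x=1) = p_one."""
    return p_one if x == 1 else 1.0 - p_one


# Observable joint P(S, A, E): intelligence is summed out, as in real data.
observed = {
    (s, a, e): sum(bern(P_U, u) * bern(P_S_GIVEN_U[u], s)
                   * bern(P_A_GIVEN_S[s], a) * bern(P_E_GIVEN_AU[(a, u)], e)
                   for u in (0, 1))
    for s, a, e in product((0, 1), repeat=3)
}


def p_obs(s=None, a=None, e=None):
    """Marginal of the observable joint; None means 'sum over'."""
    return sum(p for (s_, a_, e_), p in observed.items()
               if (s is None or s == s_)
               and (a is None or a == a_)
               and (e is None or e == e_))


def front_door(s):
    """Estimate P(E=1 | do(S=s)) from observable quantities only."""
    total = 0.0
    for a in (0, 1):
        p_a_given_s = p_obs(s=s, a=a) / p_obs(s=s)
        inner = sum(p_obs(s=s2) * p_obs(s=s2, a=a, e=1) / p_obs(s=s2, a=a)
                    for s2 in (0, 1))
        total += p_a_given_s * inner
    return total


def true_effect(s):
    """Ground truth P(E=1 | do(S=s)), computed with the hidden variable."""
    return sum(bern(P_A_GIVEN_S[s], a)
               * sum(bern(P_U, u) * P_E_GIVEN_AU[(a, u)] for u in (0, 1))
               for a in (0, 1))


for s in (1, 0):
    print(f"do(S={s}): front-door {front_door(s):.4f}, truth {true_effect(s):.4f}")
```

The front-door estimate uses only the joint distribution of sport, attendance and exam result, yet matches the value computed with access to intelligence. The match depends entirely on the assumed graph; with a direct sport -> exam arrow it would not hold.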
Estimating Causal Effects
- Rules to convert causal to statistical questions
  - Generalises e.g. stratification, potential outcomes
  - Assumptions: a causal model
- Causal model
  - Some assumptions may be testable
  - Some variables observed, others not measured
  - Some causal effects identifiable
- Challenges
  - Causal models for complex applications
  - Statistical implications
Example Application
- Royal London trauma service
  - Criteria for activation of the trauma team
  - Aim to prevent unnecessary trauma team calls
  - Extensive records of trauma patient outcomes
- US study of 1495 admissions proposed new ‘triage’ criteria
  - Significant decrease in overtriage: 51% to 29%
  - Insignificant increase in undertriage: 1% to 3%
  - None of the patients undertriaged by new criteria died
- Does this show safety of new criteria?
DIADEM Project
Digital Economy in Healthcare
- Data Information and Analysis for clinical DEcision Making
- EPSRC Digital Economy Cluster
  - Partnership between solution providers and clinical data analysis problem holders
  - Summarise unsolved data analysis needs, in relation to the analysis techniques available
- Join the DIADEM cluster
Cluster Activities and Outcomes
- Engage stakeholders and build a community:
  - Creation of a community web-site and forum
  - Meetings with potential ‘problem holders’
  - Workshops
- A road map: data and information
- Follow-up proposal
- A self-sustaining website: health data analytics
Summary
- Bayesian networks
  - Prediction and diagnosis
- Causal induction
  - Identify (some) causal relationships from (lots of) data
- Causal effects
  - Experimental results from …
  - … non-experimental data
  - … assumptions (causal model)
- Join the DIADEM cluster