Taming Uncertainty in Forensic DNA Evidence ENFSI Meeting

Taming Uncertainty in Forensic DNA Evidence
ENFSI Meeting, April 2011, Brussels, Belgium
Mark W Perlin, PhD, MD, PhD
Cybergenetics, Pittsburgh, PA
Cybergenetics © 2003-2011

Uncertainty
  • people (and the law) want certainty
  • scientific data are uncertain
  • probability describes uncertainty
  • information is a change in probability
  • DNA identification is an information science
  • goal is maximal objective information

Genotype? Unknown person.
Prior probability (population) over candidate genotypes: 10,10  10,11  10,12  10,13  11,11  11,12  12,13  13,13 …

Genotype? Pedigree labels: mother, biological father, child.
Candidate genotypes: 10,10  10,11  10,12  10,13
Likelihood function

Genotype? Pedigree labels: mother, biological father, child.
Candidate genotypes: 10,10  10,11  10,12  10,13
Posterior probability
Objective Bayesian inference: posterior = prior x likelihood
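The rule "posterior = prior x likelihood" can be sketched in a few lines of Python. This is only an illustration: the prior and likelihood values below are invented for the example (the slide gives no numbers), the function name `posterior` is hypothetical, and the product is rescaled so the probabilities sum to 1.

```python
def posterior(prior, likelihood):
    """Multiply prior by likelihood per genotype, then normalize to sum to 1."""
    unnormalized = {g: prior[g] * likelihood.get(g, 0.0) for g in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("likelihood rules out every genotype in the prior")
    return {g: v / total for g, v in unnormalized.items()}


# Hypothetical values, chosen only for illustration (not from the slides).
prior = {"10,10": 0.1, "10,11": 0.2, "10,12": 0.3, "10,13": 0.4}
likelihood = {"10,10": 0.5, "10,11": 1.0, "10,12": 0.5, "10,13": 0.0}

post = posterior(prior, likelihood)
for genotype, probability in post.items():
    print(genotype, round(probability, 3))
```

Under these assumed inputs, the data concentrate probability on the genotypes the likelihood supports, and a genotype with zero likelihood drops out of the posterior.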

Match? Pedigree labels: mother, biological father, child.
Candidate genotypes: 10,10  10,11  10,12  10,13
Putative father: 10,11
Pr(matching genotype) = 50%

Match? Pedigree labels: mother, biological father, child.
Candidate genotypes: 10,10  10,11  10,12  10,13
Putative father: 10,11
Pr(matching genotype) = 50%
Population genotypes: 10,10  10,11  10,12  10,13  11,11  11,12  12,13  13,13 …
Pr(coincidental genotype) = 5%

MW Perlin. Explaining the likelihood ratio in DNA mixture interpretation. Proceedings of Promega's 21st International Symposium on Human Identification, 2010.
Information
Likelihood ratio: LR = Pr(matching genotype) / Pr(coincidental genotype) = 50% / 5% = 10
log(LR) is a standard measure of information: log10(10) = 1 information unit
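The arithmetic on this slide can be checked with a short Python sketch. The function names `likelihood_ratio` and `information` are hypothetical, chosen for the example; the 50% and 5% inputs are the slide's own values.

```python
import math


def likelihood_ratio(p_match, p_coincidence):
    """LR = Pr(matching genotype) / Pr(coincidental genotype)."""
    if not 0 < p_coincidence <= 1:
        raise ValueError("coincidence probability must be in (0, 1]")
    return p_match / p_coincidence


def information(lr):
    """Information in base-10 log units: log10(LR)."""
    return math.log10(lr)


lr = likelihood_ratio(0.50, 0.05)  # slide values: 50% and 5%
print(round(lr, 6), round(information(lr), 6))
```

Working in log10 units makes match strengths additive, so each extra unit corresponds to another factor of ten in the LR.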

Kinship? Pedigree labels: father, missing person, spouse, son, daughter.
Candidate genotypes: 10,10  10,11  10,12  10,13
Biological remains: 10,11
Pr(matching genotype) = 90%
Population genotypes: 10,10  10,11  10,12  10,13  11,11  11,12  12,13  13,13 …
Pr(coincidental genotype) = 5%

Phenotype evidence + reality

Phenotype evidence Lab STR data + 10 11 12 reality observation

Phenotype: evidence
Lab STR data: alleles 10, 11, 12
Infer genotype: 10,10 @ 30%; 10,11 @ 50%; 10,12 @ 20%
[Figure labels: reality, observation, model]

Phenotype? Evidence (STR data: alleles 10, 11, 12).
Candidate genotypes: 10,10  10,11  10,12  10,13
Suspect: 10,11
Pr(matching genotype) = 50%
Population genotypes: 10,10  10,11  10,12  10,13  11,11  11,12  12,13  13,13 …
Pr(coincidental genotype) = 5%
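As a rough sketch of how the match number on this slide arises, the Python below looks up the suspect's genotype in the inferred genotype probabilities quoted in the talk (10,10 @ 30%, 10,11 @ 50%, 10,12 @ 20%) and divides by the 5% population probability. The helper name `match_probability` is hypothetical, and this simple lookup is an illustration rather than the full calculation used in casework.

```python
def match_probability(genotype_probs, reference):
    """Probability that the inferred evidence genotype equals the reference genotype."""
    return genotype_probs.get(reference, 0.0)


# Inferred evidence genotype probabilities, as quoted in the talk.
evidence = {"10,10": 0.30, "10,11": 0.50, "10,12": 0.20}
suspect = "10,11"
population_probability = 0.05  # Pr(coincidental genotype) from the slide

p_match = match_probability(evidence, suspect)
lr = p_match / population_probability
print(p_match, round(lr, 6))
```

Because the genotype is inferred before any comparison is made, the suspect's genotype enters only at the lookup step.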

Computer Interpretation
  • quantitative computer interpretation
  • statistical search of probability model
  • preserve all identification information
  • objectively infer genotype, then match
  • any number of mixture contributors
  • stutter, imbalance, degraded DNA
  • calculate uncertainty of every peak

Quantitative Data

Perlin MW, Szabady B. Linear mixture analysis: a mathematical approach to resolving mixed DNA samples. Journal of Forensic Sciences, 2001. Quantitative Interpretation

MW Perlin, A Sinelnikov. An information gap in DNA evidence interpretation. PLoS ONE, 2009.
Calculate Peak Uncertainty

MW Perlin, MM Legler, CE Spencer, JL Smith, WP Allan, JL Belrose, BW Duceman. Validating TrueAllele DNA mixture interpretation. Journal of Forensic Sciences, 2011.
Infer Accurate Genotype

Qualitative Thresholds
Over threshold, peaks are treated as allele events (list of included alleles).
Under threshold, alleles do not exist.
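A minimal Python sketch of the thresholding rule described on this slide follows. The peak heights and the threshold value are invented for illustration, and the function name `alleles_over_threshold` is hypothetical.

```python
def alleles_over_threshold(peaks, threshold):
    """Return the sorted alleles whose peak height exceeds the threshold."""
    return sorted(allele for allele, height in peaks.items() if height > threshold)


# Hypothetical peak heights (allele -> height), for illustration only.
peaks = {10: 900, 11: 400, 12: 45}
threshold = 50  # hypothetical cutoff

included = alleles_over_threshold(peaks, threshold)
print(included)
```

With these assumed numbers, the low peak at allele 12 is dropped entirely, and the heights of the peaks that remain play no further role once the allele list is formed.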

Qualitative Method Variation National Institute of Standards and Technology Two Contributor Mixture Data, Known Victim 213 trillion (14) 31 thousand (4)

The DNA Investigator™ Newsletter, 2009. Same Data, More Information – Murder, Match and DNA
Commonwealth v. Foley
Fingernail: 7% Mixture

  Method         Score
  inclusion      13 thousand
  use victim     23 million
  quantitative   189 billion

  • probability modeling preserves information
  • peak threshold discards information
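To compare the three match scores on a common scale, they can be converted to log10 information units, as in the Python sketch below. The pairing of scores with methods follows the order in which they appear on the slide, which is an assumption about a flattened table.

```python
import math

# Scores as listed on the slide; the method-to-score pairing is assumed from their order.
scores = {
    "inclusion": 13e3,      # 13 thousand
    "use victim": 23e6,     # 23 million
    "quantitative": 189e9,  # 189 billion
}

information = {method: math.log10(score) for method, score in scores.items()}
for method, units in information.items():
    print(f"{method}: {units:.2f} log10 units")

# Difference in information between the quantitative and inclusion scores.
gain = information["quantitative"] - information["inclusion"]
print(f"gain: {gain:.2f} log10 units")
```

On this scale the quantitative score sits roughly seven units above the inclusion score, which is a factor of about ten million in the match statistic.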

More Data, More Information (vWA locus data)
  • low template mixture
  • three DNA contributors
  • triplicate amplification
  • post-PCR enhancement
  • no match score found
  • computer interpretation
  • joint likelihood function

Genotype probability (vWA)
Information 6 x
Assume theta = 1%, and three contributors. A match between the suspect and the evidence is 3,620,000 times more probable than coincidence.

Perlin MW, Greenhalgh M. Scientific combination of DNA evidence: a handgun mixture in eight parts. Twentieth International Symposium on the Forensic Sciences of the Australian and New Zealand Forensic Science Society, Sydney, Australia. 2010.
More Data, More Information

Data: Locus D18 [figure; labels 1, 2, 4, 8]

MW Perlin, A Sinelnikov. An information gap in DNA evidence interpretation. PLoS ONE, 2009.
Preserve vs. Discard

MW Perlin, MM Legler, CE Spencer, JL Smith, WP Allan, JL Belrose, BW Duceman. Validating TrueAllele DNA mixture interpretation. Journal of Forensic Sciences, 2011.
Preserve vs. Discard
[Figure labels: Quantitative Interpretation, Inclusion method; values 6.24, 13.26, 7.03]

MW Perlin, MM Legler, CE Spencer, JL Smith, WP Allan, JL Belrose, BW Duceman. Validating TrueAllele DNA mixture interpretation. Journal of Forensic Sciences, 2011.
Validated Reproducibility 0.175

Perlin MW, Duceman BW. Profiles in productivity: greater yield at lower cost with computer DNA interpretation. Twentieth International Symposium on the Forensic Sciences of the Australian and New Zealand Forensic Science Society, Sydney, Australia. 2010. Preserve vs. Discard • quantitative interpretation preserves information - every time • peak threshold discards information - 70% of the time

Data Peak Height is a Random Variable with Mean and Variance

Accurate vs. Dispersed Genotype Probability

Perlin MW. Reliable interpretation of stochastic DNA evidence. Canadian Society of Forensic Sciences 57th Annual Meeting; Toronto, ON. 2010.
Missed Allele Error > 100%

MW Perlin, JB Kadane, RW Cotton. Match likelihood ratio for uncertain genotypes. Law, Probability and Risk, 2009.
Investigative DNA Database
Evidence genotypes: 10,10  10,11  10,12  10,13
Suspect genotypes: 10,10  10,11  10,12  10,13
LR computed between evidence and suspect genotypes
  • genotype probability representation (not alleles)
  • fully preserves DNA identification information
  • enables LR calculation with every match
  • connect crimes to criminals
  • disaster victim identification (WTC)
  • find missing people
  • automatic familial search
  • combat terrorism through DNA

The DNA Investigator™ Newsletter, 2011. DNA Intelligence and Forensic Failure – What you don't know can kill you
Societal Consequences
  • much informative DNA evidence goes unused
  • lost in "interim" qualitative interpretation methods, thresholds applied for analyst convenience
  • DNA databases (e.g., CODIS) lose information, using inclusion-based "alleles" instead of ISFG's preferred LR-based "genotype" probability
Inaccurate, uninformative DNA interpretation:
  • DNA analysts lose genotypes
  • prosecutors lose criminal cases
  • databases lose investigative leads
  • innocent law-abiding citizens lose lives

M Perlin; P Gill, J Buckleton, B Budowle, A van Daal. Low template DNA controversy. Twentieth International Symposium on the Forensic Sciences of the Australian and New Zealand Forensic Science Society, Sydney, Australia. 2010.
International Consensus
  1. DNA data is continuous, and has random variation
  2. Thresholds do not work for low template DNA
  3. Mathematical models can account for random variation
  4. The 21st century might be a good time to move away from potentially biased human review of low level (or almost any) DNA data to some sort of objective computer interpretation that can infer genotypes up to probability, without ever looking at suspects, that gives some (possibly uninformative) objective answer.

Strengthening Forensic Science
  • objective, reliable, consistent interpretation
  • preserve all available information, from the scene of crime to court
  • probabilistic genotypes
  • modern statistics and computation
  • solid mathematical foundation
  • platinum standard for preserving DNA information
  • move beyond less informative interim methods
http://www.cybgen.com/information
perlin@cybgen.com

Learning More
The science of quantitative DNA mixture interpretation
www.cybgen.com/information
  • Newsletters: gentle introduction to ideas
  • Courses: for scientists and lawyers
  • Presentations: handouts, movies, transcripts
  • Publications: abstracts, manuscripts