DNA Mixture Statistics Mark W Perlin Ph D


DNA Mixture Statistics
Mark W Perlin, PhD, MD, PhD
Cybergenetics, Pittsburgh, PA
2013 Spring Institute, Commonwealth's Attorney's Services Council
Richmond, Virginia, March 2013
Cybergenetics © 2003-2013

Child molestation case
• June 2011: Northern Virginia
• daughter's birthday slumber party
• 10-year-old girls sleeping in basement
• object sexual penetration
• aggravated sexual battery (2 counts)
Prosecutor: CDCA Nicole Wittmann

Floor plan (diagram labels): Bathroom, Storage, Television Cabinet, Laundry, Bedroom, Table, Bedroom, L-Shaped Couch, Table, Stairs, Bookcase, Closet, Door to Outside
Marked on the plan: CHILD & CHILD; VICTIM underpants; VICTIM pajama pants

DNA mixture statistics
Human review (using thresholds)
• underpants: original = 10 million, modified = 1 million
• pajama pants: original = 2 million, modified = 4
Computer interpretation requested

Prosecutor question
What is the true match information of the evidence to the suspect?

Biology
1 trillion cells

Nucleus
cell, nucleus

DNA
cell, nucleus, chromosomes

Locus
cell, nucleus, chromosomes, locus

Allele
cell, nucleus, chromosomes, locus, alleles
Short Tandem Repeat (STR)

Genotype
cell, nucleus, chromosomes, locus, alleles
Short Tandem Repeat (STR) genotype 7, 8

Identification
Evidence item
Lab: evidence data (peaks 10, 12)
Infer: evidence genotype 10, 12
Compare: known genotype 10, 12
Probability(identification) = Prob(suspect matches evidence) = 100%

Coincidence
Biological population
Lab: allele frequency data (alleles 10, 11, 12, 13, 14, 15, 16, 17)
Infer: population genotype 10, 12 @ 5%
Genotype product rule, combines alleles
Prob(10, 12) = 2 x p10 x p12
Prob(10, 10) = p10 x p10
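The product rule on this slide can be sketched in a few lines of Python. This is only an illustration: the function name and the allele frequencies are hypothetical, with the frequencies picked so that the 10, 12 genotype comes out at the slide's 5%.

```python
def genotype_frequency(allele_a: int, allele_b: int, freq: dict[int, float]) -> float:
    """Population probability of an allele pair under the genotype product rule."""
    p_a, p_b = freq[allele_a], freq[allele_b]
    if allele_a == allele_b:
        return p_a * p_b        # homozygote, e.g. Prob(10, 10) = p10 x p10
    return 2 * p_a * p_b        # heterozygote, e.g. Prob(10, 12) = 2 x p10 x p12


# Hypothetical allele frequencies (not from the presentation).
freq = {10: 0.10, 12: 0.25}

print(f"Prob(10, 12) = {genotype_frequency(10, 12, freq):.2%}")
print(f"Prob(10, 10) = {genotype_frequency(10, 10, freq):.2%}")
```

The factor of 2 for a heterozygote reflects the two ways the pair can be inherited (10 from one parent and 12 from the other, or the reverse).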

Coincidence
Biological population
Lab: allele frequency data (alleles 10, 11, 12, 13, 14, 15, 16, 17)
Infer: population genotype 10, 12 @ 5%
Compare: known genotype 10, 12
Probability(coincidence) = Prob(coincidental match) = 5%

Identification information
At the suspect's genotype, identification vs. coincidence?
Ratio: Prob(identification) [after data: evidence] / Prob(coincidence) [before data: population]
Evidence changes our belief

Perlin MW. Explaining the likelihood ratio in DNA mixture interpretation. Promega's Twenty First International Symposium on Human Identification, 2010; San Antonio, TX.

Match statistic
At the suspect's genotype, identification vs. coincidence?
Ratio: Prob(evidence matches suspect) [after data: evidence] / Prob(coincidental match) [before data: population]
= 100% / 5% = 20

Bayes theorem
Calculate probability (pictured: Rev Bayes, 1763; Computers, 1985)
Belief in hypothesis after having seen data is proportional to how well hypothesis explains the data times our initial belief.
All hypotheses must be considered. Need computers to do this properly.
Hypothesis: Defendant contributed to DNA evidence
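The proportionality stated on this slide can be shown as a short Python sketch. The hypothesis labels, priors, and likelihoods below are hypothetical values chosen for illustration only. The point is that every hypothesis is scored and the results are normalized to sum to one.

```python
def posterior(prior: dict[str, float], likelihood: dict[str, float]) -> dict[str, float]:
    """Belief after seeing data: proportional to likelihood times prior, over all hypotheses."""
    unnormalized = {h: likelihood[h] * prior[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


# Hypothetical numbers (not from the presentation).
prior = {"H1": 0.2, "H2": 0.3, "H3": 0.5}        # initial belief in each hypothesis
likelihood = {"H1": 0.5, "H2": 0.2, "H3": 0.1}   # how well each hypothesis explains the data

for hypothesis, belief in posterior(prior, likelihood).items():
    print(f"{hypothesis}: {belief:.3f}")
```

Leaving a hypothesis out changes the normalizing total and therefore every posterior value, which is one reason all hypotheses must be considered.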

Mixture interpretation varies
National Institute of Standards and Technology
Two Contributor Mixture Data, Known Victim
213 trillion (14)
31 thousand (4)

DNA mixture
Evidence item +

Uncertainty
Evidence item +
Lab: evidence data (peaks 10, 11, 12)
Infer: evidence genotype
  10, 12 @ 50%
  11, 12 @ 30%
  12, 12 @ 20%
Compare: known genotype 10, 12
Probability(identification) = Prob(suspect matches evidence) = 50%

Identification information
At the suspect's genotype, identification vs. coincidence?
Ratio: Prob(identification) [after data: evidence] / Prob(coincidence) [before data: population]
Numerator decreases
Denominator unchanged
Less weight of evidence, less change in our belief

Match statistic
At the suspect's genotype, identification vs. coincidence?
Ratio: Prob(evidence matches suspect) [after data: evidence] / Prob(coincidental match) [before data: population]
= 50% / 5% = 10
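The arithmetic behind the match statistic can be written as a small Python sketch. The function name is hypothetical. The genotype probabilities (50%, 30%, 20%) and the 5% population probability are the values shown on the slides. A fully resolved genotype is the special case in which the evidence distribution puts 100% on one allele pair.

```python
Genotype = tuple[int, int]


def match_statistic(evidence: dict[Genotype, float],
                    suspect: Genotype,
                    population_prob: float) -> float:
    """Ratio of evidence genotype probability to population probability at the suspect's genotype."""
    key = tuple(sorted(suspect))                  # (12, 10) and (10, 12) are the same pair
    return evidence.get(key, 0.0) / population_prob


single_source = {(10, 12): 1.00}
mixture = {(10, 12): 0.50, (11, 12): 0.30, (12, 12): 0.20}

print(match_statistic(single_source, (10, 12), 0.05))   # 100% / 5%
print(match_statistic(mixture, (10, 12), 0.05))         # 50% / 5%
```

A suspect whose genotype receives no probability in the evidence distribution gets a ratio of zero in this sketch.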

TrueAllele operator
STR evidence data: .fsa genetic analyzer files
• Replicate computer runs for each item
• 2 or 3 unknown mixture contributors
• Victim genotype was considered
Evidence genotypes: probability distributions

DNA mixture data
Quantitative peak heights at a locus
(Figure axes: peak size, peak height)

TrueAllele® Casework
(System diagram labels: ViewStation User Client; Visual User Interface VUIer™ Software; Database Server; Interpret/Match Expansion; Parallel Processing Computers)

Mixture weight
Separate mixture data into two contributor components: 25% and 75%

Genotype inference
Thorough: consider every possible genotype solution
Objective: does not know the comparison genotype
Explain the peak pattern (victim's allele pair, another person's allele pair)
Better explanation has a higher likelihood

Genotype inference
Explain the peak pattern (victim's allele pair, another person's allele pair)
Worse explanation has a lower likelihood
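The idea that a better explanation of the peak pattern has a higher likelihood can be illustrated with a toy Python model. This is a simplified sketch, not the TrueAllele model. It assumes relative peak heights, a Gaussian noise term, a known victim genotype, and a 25% / 75% mixture weight, and every numeric input is hypothetical.

```python
from itertools import combinations_with_replacement

ALLELES = [10, 11, 12]


def expected_heights(genotypes, weights):
    """Expected relative peak height per allele.

    Each contributor spreads its mixture weight evenly over its two allele copies.
    """
    return [
        sum(w * g.count(a) / 2 for g, w in zip(genotypes, weights))
        for a in ALLELES
    ]


def log_likelihood(observed, expected, sigma=0.05):
    """Gaussian log-likelihood up to an additive constant (assumed noise model)."""
    return -sum((o - e) ** 2 for o, e in zip(observed, expected)) / (2 * sigma**2)


# Hypothetical inputs: relative peak heights at alleles 10, 11, 12,
# a known victim genotype, and weights of 25% (victim) and 75% (other person).
observed = [0.38, 0.13, 0.49]
victim = (11, 12)
weights = (0.25, 0.75)

# Consider every possible allele pair for the other contributor.
scores = {
    other: log_likelihood(observed, expected_heights([victim, other], weights))
    for other in combinations_with_replacement(ALLELES, 2)
}
best = max(scores, key=scores.get)
print("best explanation for the other contributor:", best)
```

Converting the scores into genotype probabilities would also require a prior over allele pairs and normalization across all candidates, which this sketch leaves out.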

Genotype separation
minor contributor, major contributor

Genotype concordance

TrueAllele report
Genotype probability distributions
• Evidence genotype
• Suspect genotype
• Population genotype
Likelihood ratio (LR): DNA match statistic

DNA match statistic
Probability(evidence match) / Probability(coincidental match) = 98% / 3%, about 30 x

Match statistic at 15 loci
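The slide does not spell out how per-locus values are combined. A standard assumption is that loci are independent, so likelihood ratios multiply across loci and their logarithms add. The Python sketch below illustrates this with hypothetical per-locus values and only three loci rather than 15.

```python
import math


def combined_log_lr(locus_lrs: list[float]) -> float:
    """Joint log10(LR), assuming independent loci so that per-locus LRs multiply."""
    return sum(math.log10(lr) for lr in locus_lrs)


# Hypothetical per-locus likelihood ratios (not from the presentation).
locus_lrs = [30.0, 10.0, 20.0]

log_lr = combined_log_lr(locus_lrs)
print(f"log10(LR) = {log_lr:.2f}, LR = {10 ** log_lr:,.0f}")
```

Working in logarithms keeps very large joint values manageable when many loci are combined.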

TrueAllele DNA match
LR match to Defendant

Item          Black             Caucasian         Hispanic
Underpants    36.6 quintillion  20.7 quadrillion  212 quadrillion
Pajama pants  319 thousand      3.86 thousand     32.9 thousand

Powers of Ten
1 000 … 000
thousand, million, billion, trillion, quadrillion, quintillion
number of zeros: -21 -18 -15 -12 -9 -6 -3 0 +3 +6 +9 +12 +15 +18 +21
logarithmic scale
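As an illustration of the logarithmic scale, the Python sketch below converts between a number, its count of zeros (log base 10), and its name. The helper name is hypothetical. The two example values are likelihood ratios reported elsewhere in this presentation.

```python
import math

NAMES = {3: "thousand", 6: "million", 9: "billion",
         12: "trillion", 15: "quadrillion", 18: "quintillion"}


def describe(value: float) -> str:
    """Name a positive number by its power-of-ten group, e.g. 3.19e5 -> '319 thousand'."""
    exponent = math.floor(math.log10(value))    # whole number of zeros
    group = min(3 * (exponent // 3), 18)        # largest named group handled here
    if group < 3:
        return f"{value:.3g}"
    return f"{value / 10**group:.3g} {NAMES[group]}"


for lr in (3.66e19, 3.19e5):
    print(f"{describe(lr)}: log10 = {math.log10(lr):.2f}")
```

On this scale, the difference between a quintillion-level and a thousand-level statistic is a matter of adding about 15 zeros.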

Trial preparation
• discuss case report
• direct examination
• curriculum vitae
• PowerPoint slides
• background reading
• answer questions

Computer Interpretation of Quantitative DNA Evidence
Commonwealth v Defendant
April 2012, Arlington, Virginia
Mark W Perlin, PhD, MD, PhD
Cybergenetics, Pittsburgh, PA
Cybergenetics © 2003-2012

DNA genotype
A genetic locus has two DNA sentences, one from each parent.
mother allele: 1 2 3 4 5 6 7 8 (ACGT repeated word)
father allele: 1 2 3 4 5 6 7 8 9
An allele is the number of repeated words.
A genotype at a locus is a pair of alleles: 8, 9
Many alleles allow for many allele pairs.
A person's genotype is relatively unique.

DNA evidence interpretation
Evidence item
Lab: evidence data (peaks 10, 11, 12)
Infer: evidence genotype
  10, 12 @ 50%
  11, 12 @ 30%
  12, 12 @ 20%
Compare: known genotype 10, 12

Computers can use all the data
Quantitative peak heights at locus Penta E
(Figure axes: peak size, peak height)

How the computer thinks
Consider every possible genotype solution
Explain the peak pattern (victim's allele pair, another person's allele pair)
Better explanation has a higher likelihood

Evidence genotype
Objective genotype determined solely from the DNA data.
Never sees a suspect.
Genotype probabilities shown: 98%, 1%

DNA match information
How much more does the suspect match the evidence than a random person?
Probability(evidence match) / Probability(coincidental match) = 98% / 3%, about 30 x

Match information at 15 loci

Is the suspect in the evidence?
A match between the underpants and Defendant is:
36.6 quintillion times more probable than a coincidental match to an unrelated Black person
20.7 quadrillion times more probable than a coincidental match to an unrelated Caucasian person
212 quadrillion times more probable than a coincidental match to an unrelated Hispanic person

Is the suspect in the evidence?
A match between the pajama pants and Defendant is:
319 thousand times more probable than a coincidental match to an unrelated Black person
3.86 thousand times more probable than a coincidental match to an unrelated Caucasian person
32.9 thousand times more probable than a coincidental match to an unrelated Hispanic person

Outcome
Guilty
• object sexual penetration
• two counts of aggravated sexual battery
Sentence
• 22 years imprisonment
Court of Appeals
• DNA chain of custody
• appeal denied

TrueAllele mixture validation: Virginia case study
Establish the reliability of TrueAllele mixture interpretation
Mark W Perlin, PhD, MD, PhD; Kiersten Dormer, MS; and Jennifer Hornyak, MS
Cybergenetics, Pittsburgh, PA
Lisa Schiermeier-Wood, MS and Susan Greenspoon, PhD
Department of Forensic Science, Richmond, VA

Case composition
• 72 criminal cases
• 92 evidence items
• 111 genotype comparisons
Criminal offense
• 18 homicide
• 12 robbery
• 6 sexual assault
• 20 weapon

DNA mixture distribution

Data summary: "alleles"
Over threshold, peaks are labeled as allele events
All-or-none allele peaks, each given equal status
(Figure: threshold line on the peak data)
Allele pairs: 7, 7; 7, 10; 7, 12; 7, 14; 10, 10; 10, 12; 10, 14; 12, 12; 12, 14; 14, 14
Probability shown at 10, 12: 10%

CPI information
Combined probability of inclusion
(Figure labels: Nothing reported; 2.26; 25; 6.70; CPI)
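For readers unfamiliar with the statistic, the combined probability of inclusion is commonly computed as the product, over loci, of the squared sum of the population frequencies of the alleles observed above threshold. The Python sketch below follows that textbook definition. The allele frequencies are hypothetical, and the code is not drawn from the presentation.

```python
def combined_probability_of_inclusion(loci: list[list[float]]) -> float:
    """CPI: product over loci of (sum of included allele frequencies) squared."""
    cpi = 1.0
    for allele_freqs in loci:
        cpi *= sum(allele_freqs) ** 2    # chance a random person has only included alleles
    return cpi


# Hypothetical frequencies of the alleles seen above threshold at two loci.
loci = [
    [0.10, 0.20, 0.20],   # locus 1: three observed alleles
    [0.25, 0.25],         # locus 2: two observed alleles
]

cpi = combined_probability_of_inclusion(loci)
print(f"CPI = {cpi:.4f}, reported as 1 in {1 / cpi:.0f}")
```

Because every above-threshold allele counts equally, the calculation ignores peak heights, which is the all-or-none treatment described on the preceding slide.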

SWGDAM 2010 guidelines
Higher threshold for human review
Under threshold, alleles less used
(Figure: threshold line on the peak data)
Allele pairs: 7, 7; 7, 10; 7, 12; 7, 14; 10, 10; 10, 12; 10, 14; 12, 12; 12, 14; 14, 14
Probability shown at 10, 12: 0%

Modified CPI information
(Figure labels: Nothing reported; 2.26; 56; 1.75; 25; 2.12; 6.70; methods compared: CPI, mCPI)

SWGDAM 2010 guidelines
3.2.2. If a stochastic threshold based on peak height is not used in the evaluation of DNA typing results, the laboratory must establish alternative criteria (e.g., quantitation values or use of a probabilistic genotype approach) for addressing potential stochastic amplification. The criteria must be supported by empirical data and internal validation and must be documented in the standard operating procedures.
Use TrueAllele® Casework for DNA mixture statistics

Validated genotyping method
Perlin MW, Sinelnikov A. An information gap in DNA evidence interpretation. PLoS ONE. 2009;4(12):e8327.
Perlin MW, Legler MM, Spencer CE, Smith JL, Allan WP, Belrose JL, Duceman BW. Validating TrueAllele® DNA mixture interpretation. Journal of Forensic Sciences. 2011;56(6):1430-47.
Perlin MW, Belrose JL, Duceman BW. New York State TrueAllele® Casework validation study. Journal of Forensic Sciences. 2013;58(6): in press.

TrueAllele reinterpretation
Virginia reevaluates DNA evidence in 375 cases (July 16, 2011)
“Mixture cases are their own little nightmare,” says William Vosburgh, director of the D.C. police’s crime lab. “It gets really tricky in a hurry.”
“If you show 10 colleagues a mixture, you will probably end up with 10 different answers”
Dr. Peter Gill, Human Identification E-Symposium, 2005

Mixture weight
Separate mixture data into two contributor components: 25% and 75%

Genotype inference
Thorough: consider every possible genotype solution
Objective: does not know the comparison genotype
Explain the peak pattern (victim's allele pair, another person's allele pair)
Better explanation has a higher likelihood
Allele pairs: 7, 7; 7, 10; 7, 12; 7, 14; 10, 10; 10, 12; 10, 14; 12, 12; 12, 14; 14, 14
Probability shown at 10, 12: 98%

TrueAllele sensitivity
(Figure labels: Nothing reported; 2.26; 56; 1.75; 5.52; 25; 9; 2.12; 6.70; 10.93; methods compared: CPI, mCPI, TrueAllele)

TrueAllele specificity
True exclusions, without false inclusions
(Figure label: -19.69)

TrueAllele reproducibility
Concordance in two independent computer runs
(Scatter plot axes: log(LR1), log(LR2))
standard deviation (within-group): 0.305
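One common way to compute a within-group standard deviation from duplicate runs is to pool the variance of each pair, which for pairs reduces to the square root of the sum of squared differences divided by twice the number of pairs. The Python sketch below uses that formula with hypothetical log(LR) pairs. The presentation does not say that this exact calculation produced the 0.305 figure.

```python
import math


def within_group_sd(pairs: list[tuple[float, float]]) -> float:
    """Pooled within-pair standard deviation for duplicate measurements."""
    squared_diffs = sum((a - b) ** 2 for a, b in pairs)
    return math.sqrt(squared_diffs / (2 * len(pairs)))


# Hypothetical (log(LR1), log(LR2)) values from two independent runs per comparison.
pairs = [(10.0, 10.4), (5.2, 5.0), (15.1, 14.9)]

print(f"within-group standard deviation = {within_group_sd(pairs):.3f} log units")
```

A smaller value means the two runs agree more closely, independent of how large the match statistics themselves are.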

Validation results
TrueAllele® Casework DNA mixture interpretation is a reliable method:
• sensitive
• specific
• reproducible
TrueAllele computer genotyping is more effective than human review

TrueAllele Virginia outcomes
144 cases analyzed, 72 case reports, 10 trials

City             Court     Charge         Sentence
Richmond         Federal   Weapon         50 years
Alexandria       Federal   Bank robbery   90 years
Quantico         Military  Rape
Chesapeake       State     Robbery        26 years
Arlington        State     Molestation    22 years
Richmond         State     Homicide       35 years
Fairfax          State     Abduction      33 years
Norfolk          State     Homicide       8 years
Charlottesville  State     Homicide       15 years
Hampton          State     Home invasion  3 years 5 years

TrueAllele in Virginia
• Department of Forensic Science has its own TrueAllele system
• Training, validation, approvals
• Services centralized in Richmond
• DFS will provide DNA mixture statistics and court testimony

TrueAllele in the United States
(Map legend: Casework system; Interpretation services)

More information
http://www.cybgen.com/information
• Courses
• Newsletters
• Newsroom
• Presentations
• Publications
perlin@cybgen.com