Creating and Maintaining Databases Dr Pushkin Kachroo Enrollment
Enrollment • Collect private information, e.g. a fingerprint • Follow an “enrollment policy” • The policy should be: – Acceptable to the public – Clear on how, where, and when the private information will be used
Enrollment Steps • Positive Enrollment: – Trusted individuals – Enrollment policy E_M – Authentication through: • Seed documents (birth certificate, passport) – Store the machine representation of the enrolled subject in Verification Database M
Enrollment Steps • Negative Enrollment: – Criminal identification – Enrollment policy E_N – Store the machine representation of the enrolled subject in Screening Database N
General Enrollment • Target Population: World W • Ground Truth: legacy databases: – Criminal or civil – Can contain Fake and Duplicate Identities
Fake Identity • Created Identity – Non-existent person – Biometric screening against criminal databases might catch the “fake” • Stolen Identity
The Zoo • Sheep: – Real-world biometric is distinctive and stable • Goats: – Difficult to authenticate • Lambs: – Enrolled subjects who are easy to imitate (cause passive FA) • Wolves: – Good at imitating others (cause active FA) • Chameleons: – Easy to imitate and also good at imitating
Sample Quality Control • Random false rejects/accepts are caused by adverse signal acquisition • Solutions: – Better user interface – Better probabilistic modeling of adverse conditions in feature extraction/matching – Interactively improve the input
Quality Control • Define “desirable” • Quality is related to processability • Quantify quality so that an action can be chosen based on the quality level, e.g. present information differently, apply image enhancement, etc. • Compromise between convenience and quality – Affects FTE (failure to enroll), and also FA (false accept) and FR (false reject) • The ROC can be improved by eliminating poor data
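The effect of eliminating poor data can be sketched with a few lines of Python. This is a minimal illustration, not the method from the slides: the (score, quality) pairs, the decision threshold, and the quality cutoff are all made-up values, and `error_rates` and `filter_by_quality` are hypothetical helper names.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FMR, FNMR) for similarity scores at a decision threshold."""
    fnmr = sum(s < threshold for s in genuine) / len(genuine)
    fmr = sum(s >= threshold for s in impostor) / len(impostor)
    return fmr, fnmr


def filter_by_quality(samples, min_quality):
    """Keep only the scores of samples whose quality meets the minimum."""
    return [score for score, quality in samples if quality >= min_quality]


# Hypothetical (score, quality) pairs; every value is made up for illustration.
genuine = [(0.9, 0.8), (0.85, 0.9), (0.4, 0.2), (0.7, 0.7), (0.3, 0.1)]
impostor = [(0.2, 0.9), (0.1, 0.8), (0.6, 0.2), (0.3, 0.7), (0.15, 0.6)]

THRESHOLD = 0.5    # assumed match threshold
MIN_QUALITY = 0.5  # assumed quality cutoff

# Error rates on all data, then on the data that passes quality control.
before = error_rates([s for s, _ in genuine], [s for s, _ in impostor], THRESHOLD)
after = error_rates(filter_by_quality(genuine, MIN_QUALITY),
                    filter_by_quality(impostor, MIN_QUALITY), THRESHOLD)

# The price of the improvement: samples that had to be rejected.
rejected = sum(q < MIN_QUALITY for _, q in genuine + impostor)

print("before:", before, "after:", after, "rejected:", rejected)
```

In this toy data both error rates drop once low-quality samples are discarded, but three of the ten samples are rejected, which is the convenience cost the slide refers to.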
[Figure: ROC-Quality Control. Axes: FMR (False Match Rate) and FNMR (False Non-match Rate). Annotation: throw out bad data.]
Training • Like machine learning • Relate scores to the probability that the biometric matches someone or doesn’t [Figure labels: Training, Testing]
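One way to relate scores to a match probability is sketched below. It assumes, purely for illustration, that genuine and impostor scores are each roughly Gaussian and that the prior probability of a match is known; neither assumption comes from the slides, and the training scores are invented.

```python
import math
from statistics import mean, stdev


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def train(genuine_scores, impostor_scores):
    """Fit one Gaussian (mean, standard deviation) per class."""
    return {
        "genuine": (mean(genuine_scores), stdev(genuine_scores)),
        "impostor": (mean(impostor_scores), stdev(impostor_scores)),
    }


def match_probability(score, model, prior_match=0.5):
    """Posterior probability of a match given a score, via Bayes' rule."""
    p_gen = gaussian_pdf(score, *model["genuine"]) * prior_match
    p_imp = gaussian_pdf(score, *model["impostor"]) * (1.0 - prior_match)
    return p_gen / (p_gen + p_imp)


# Hypothetical training scores, made up for illustration.
model = train(genuine_scores=[0.80, 0.90, 0.85, 0.75, 0.95],
              impostor_scores=[0.20, 0.30, 0.25, 0.10, 0.35])

# Held-out test scores: high scores should map near 1, low scores near 0.
test_probs = [match_probability(s, model) for s in (0.9, 0.5, 0.2)]
print(test_probs)
```

The training/testing split matters here: the Gaussians are fit on one set of scores and the resulting probabilities should be checked on scores the model has not seen.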
Enrollment as System Training • Assigning IDs to subjects • Three possibilities: – Correct – Someone already enrolled faking a new identity (duplicate) – Someone unenrolled faking an identity (fake) • P_D = Prob(duplicate) • P_F = Prob(fake)
Database Integrity • How well the database reflects the ground truth • Database duplication: purge detected duplicates • P_D = FNMR_E × P_DEA – Probability of a duplicate = (match between two samples not detected) × (probability of a double-enrollment attack) • P_F = FMR_E × P_IA – Probability of a fake enrollment = (match between two samples falsely detected) × (probability of an impersonation attack)
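The two products above are simple to evaluate. The sketch below plugs in hypothetical rates (none of the numbers come from the slides) and converts the probabilities into expected counts for an assumed database size.

```python
def prob_duplicate(fnmr_e, p_dea):
    """P_D = FNMR_E x P_DEA: a double-enrollment attempt that goes undetected."""
    return fnmr_e * p_dea


def prob_fake(fmr_e, p_ia):
    """P_F = FMR_E x P_IA: an impersonation attempt that is falsely matched."""
    return fmr_e * p_ia


# Hypothetical rates, chosen only to make the arithmetic concrete.
FNMR_E = 0.02   # enrollment-time false non-match rate
P_DEA = 0.001   # probability that an enrollment is a double-enrollment attack
FMR_E = 0.001   # enrollment-time false match rate
P_IA = 0.01     # probability that an enrollment is an impersonation attack

p_d = prob_duplicate(FNMR_E, P_DEA)
p_f = prob_fake(FMR_E, P_IA)

# Expected counts in a database of N enrollments under these assumptions.
N = 1_000_000
expected_duplicates = p_d * N
expected_fakes = p_f * N

print(f"P_D = {p_d:.2e}, P_F = {p_f:.2e}")
print(f"expected duplicates: {expected_duplicates:.1f}, fakes: {expected_fakes:.1f}")
```

Because FNMR_E and FMR_E move in opposite directions as the enrollment threshold changes, lowering P_D by tightening one raises P_F through the other.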
[Figure: P_D-P_F trade-off curve. Axes: FMR (P_F) and FNMR (P_D).]
Probabilistic Enrollment • Enrollment process goal: – Build access control for the subjects d_i from the world W that are authorized – Likelihood of d_i given stored token B_i
Probabilistic Enrollment • Enrollment process goal: – Machine representation of the “real” biometric • Assumption about the score: likelihood that we have the same subject – True if … – equivalently …
Probabilistic Enrollment (continued) • For realistic assumptions we need to model the world • Probability can be approximated unrealistically by … • We need … (given biometric data collected during enrollment, O)
Modeling the World-1 • Prior probability that subject d_i is present • Prior probability that this observation will occur • Modeling the numerator on the right is a matter of fitting a model to data; the rest is impractical or impossible
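The two priors named on this slide are the terms that appear in Bayes' rule, so the missing relation is plausibly the one below. This is a reconstruction under that assumption, not the slide's own formula; the notation P(·) is introduced here, with d_i the subject and O the enrollment observation.

```latex
\[
P(d_i \mid O) \;=\; \frac{P(O \mid d_i)\, P(d_i)}{P(O)}
\]
```

Read this way, the likelihood P(O | d_i) is the part that can be obtained by fitting a model to the enrollment data, while P(d_i), the prior that subject d_i is present, and P(O), the prior that the observation occurs, are the parts that are hard to estimate without a model of the whole world W.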
Modeling the World-2 • Cohorts – Models of most similar subjects • World Modeling: – Reduce cohorts to a single model
Modeling the World-3 For Cohort Modeling
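Since the formulas on these slides did not survive, the sketch below shows one common way cohort and world models are used: as a denominator that normalizes the claimed subject's likelihood. It assumes one-dimensional Gaussian subject models and invented parameters; `cohort_ratio` and `world_ratio` are hypothetical names, not taken from the slides.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def cohort_ratio(obs, claimed, cohort):
    """Claimed subject's likelihood relative to the mean cohort likelihood."""
    numer = gaussian_pdf(obs, *claimed)
    denom = sum(gaussian_pdf(obs, *m) for m in cohort) / len(cohort)
    return numer / denom


def world_ratio(obs, claimed, world):
    """Claimed subject's likelihood relative to a single pooled world model."""
    return gaussian_pdf(obs, *claimed) / gaussian_pdf(obs, *world)


# Hypothetical (mean, standard deviation) models of a 1-D feature.
claimed = (1.0, 0.2)                             # enrolled subject d_i
cohort = [(1.4, 0.2), (0.6, 0.2), (1.8, 0.2)]    # most similar other subjects
world = (1.2, 0.5)                               # cohorts reduced to one model

# A ratio above 1 favors the claimed subject; below 1 favors someone else.
for obs in (1.0, 1.4):
    print(obs, cohort_ratio(obs, claimed, cohort), world_ratio(obs, claimed, world))
```

The cohort version needs a separate set of models per enrolled subject, while the world version trades some discrimination for a single shared denominator.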
Updating Probabilities
Use of Probabilities • Accuracy improvements • Define measure of biometric integrity • Integrity of different biometrics can be combined etc.
- Slides: 22