Biometric Databases
Overview
• Problems associated with biometric databases
• Some practical solutions
• Some existing DBMS
Problems
• Maintaining a huge biometric database may cause scalability problems
• Matching time increases as the database grows
• Biometric data has no natural ordering
• Matching should be fast for a real-time system
Need for a DBMS in Biometrics
• Every large-scale biometrics solution requires an RDBMS for efficient storage and access of data
• Examples:
  – AIFS: contains 400 million fingerprints
  – Point-of-sale biometric identification system (100 million entries)
Indexing
• Why index data?
  – To accelerate query execution
  – To reduce the number of disk accesses
• Many solutions to speed up query processing:
  – Summary tables (not good for ad-hoc queries)
  – Parallel machines (additional hardware, hence additional cost)
  – Indexes (the key to achieving this objective)
• Strong demand for efficient processing of complex queries on huge databases
Indexing Issues (contd.)
• Factors used to determine which indexing technique should be built on a column:
  – Characteristics of the indexed column
    o Cardinality
    o Data distribution
    o Value range
  – Understanding the data and the usage
• Developing a new indexing technique for data warehouse queries
  – The index should be small and utilize space efficiently
  – The index should be able to operate with other indexes
  – The index should support ad-hoc and complex queries and speed up join operations
  – The index should be easy to build, implement, and maintain
Binning
• Originates from network information theory
• It is the division of a set of code words (or templates) into subsets (“bins”) such that each bin satisfies some properties depending upon the application
• It is a way to segment the biometric templates, e.g.,
  – Male/Female
  – Particular finger
  – Loop vs. whorl vs. arch
  – May be another biometric
Binning (contd.)
• Increases search performance, but may reduce search accuracy (increases the false non-match rate)
• Search for a matching template may fail owing to an incorrect bin placement
• May have to include the same template in different bins
• Bin error rate is related to confidence in the binning strategy
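The trade-off on the two binning slides can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real matcher: the templates are toy feature tuples, `similarity` is a hypothetical stand-in for a vendor matching function, and the bin label (a fingerprint pattern class) is assumed to come from some upstream classifier.

```python
from collections import defaultdict


def similarity(a, b):
    """Toy score in [0, 1]: fraction of positions where two templates agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


class BinnedGallery:
    """Hypothetical gallery that partitions templates into named bins."""

    def __init__(self):
        self.bins = defaultdict(list)

    def enroll(self, person, template, labels):
        # An ambiguous template may be stored in more than one bin.
        for label in labels:
            self.bins[label].append((person, template))

    def search(self, probe, label, threshold=0.8):
        # Only the probe's bin is compared, which is what makes the search fast.
        hits = []
        for person, template in self.bins.get(label, []):
            s = similarity(probe, template)
            if s >= threshold:
                hits.append((person, s))
        return hits


gallery = BinnedGallery()
gallery.enroll("alice", (1, 0, 1, 1, 0), ["loop"])
gallery.enroll("bob", (0, 0, 1, 0, 1), ["whorl"])
gallery.enroll("carol", (1, 1, 1, 0, 0), ["loop", "arch"])  # ambiguous class

probe = (1, 0, 1, 1, 0)  # same features as alice's template
correct_bin = gallery.search(probe, "loop")
wrong_bin = gallery.search(probe, "whorl")  # probe misclassified upstream
print(correct_bin, wrong_bin)
```

Searching the correct bin finds the enrolled template, while the misclassified probe misses it entirely: that miss is the false non-match the slide warns about, and enrolling ambiguous templates in several bins is one way to reduce it.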
Architecture Details: Loose to Tight Integration
Using the RDBMS
• Loose Integration
  – Use the RDBMS only for storage of templates
  – Match performed against in-memory structures created from the stored templates
  – Users use a biometric vendor-specific API or BioAPI
• Tight Integration
  – Use the RDBMS for storage of templates as well as for performing the match
  – Users use SQL queries directly against database tables
Loose Integration
• Biometric data is loaded from a database table into memory
• Matching is done on custom-built memory-based structures
  – (+) Results in fast matching
  – (-) The solution is memory-bound
• Further scalability is achieved by using server farms
  – (-) Vendor-centric solution
  – (-) Cannot be easily extended to support multimodal systems
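A rough sketch of loose integration, assuming SQLite as a stand-in for the RDBMS: the database only stores the templates, and matching runs over a plain in-memory dictionary. The table layout, the bit-level `hamming_similarity` score, and the 0.9 threshold are illustrative assumptions, not part of any vendor API.

```python
import sqlite3


def load_gallery(conn):
    """Pull every stored template into memory (the memory-bound step)."""
    rows = conn.execute(
        "SELECT employee_id, fingerprint_template FROM employees"
    )
    return {emp_id: bytes(tpl) for emp_id, tpl in rows}


def hamming_similarity(a, b):
    """Toy score: fraction of equal bits; assumes equal-length templates."""
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return 1.0 - differing / (8 * len(a))


def identify(gallery, probe, threshold=0.9):
    """Exhaustive in-memory match; the database is not involved here."""
    hits = [(emp_id, hamming_similarity(tpl, probe))
            for emp_id, tpl in gallery.items()]
    return sorted((h for h in hits if h[1] >= threshold),
                  key=lambda h: -h[1])


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER PRIMARY KEY, fingerprint_template BLOB)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, bytes([0b10101010] * 4)), (2, bytes([0b11110000] * 4))],
)

gallery = load_gallery(conn)
# Probe differs from employee 1's template in a single bit.
matches = identify(gallery, bytes([0b10101010] * 3 + [0b10101011]))
print(matches)
```

Once `load_gallery` has run, relational predicates, security, and parallelism are no longer available to the matcher, which is the cost this slide lists against the speed of in-memory matching.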
Tight Integration
• Template matching is implemented within the RDBMS and performed using SQL
• Allows the biometric vendor to exploit the full capabilities of the RDBMS, including
  – Security
  – Scalability and availability
  – Parallelism
Tight Integration – Template Storage
• A biometric template can be stored in a table column as
  – RAW data type
  – Simple Object data type
  – XML data type
  – Full Common Biometric Exchange File Format (CBEFF)-compliant data type
Tight Integration – A Basic Approach
• Biometric vendors define SQL operators
  – IdentifyMatch(): given an input template, returns all the templates which match the input within a certain threshold (defined as the primary operator)
  – Score(): returns the degree of match of the input template with a stored template (defined as ancillary to the IdentifyMatch operator)
• Biometric vendors define implementations for these operators which are specific to their biometric
Tight Integration – Indexing
• Biometric vendors define an indexing scheme (indextype) for fast evaluation of the IdentifyMatch() operator
• Defining an indexing scheme involves
  – Developing one or more filters which quickly eliminate a large number of non-matching templates
  – Performing an exact match against the resulting (smaller) set of templates
A Fingerprint Example
• Create a table to store employee data along with their fingerprint template
    CREATE TABLE Employees (
      name                 VARCHAR2(128),
      employee_id          INTEGER,
      dept                 VARCHAR2(30),
      fingerprint_template RAW(1024));
• Index the column storing fingerprint data, for faster access
    CREATE INDEX FingerprintIndex ON Employees (fingerprint_template)
      INDEXTYPE IS FingerprintIndexType;
• Retrieve the names and match scores for all employees whose fingerprint matches the input fingerprint
    SELECT name, Score(1)
    FROM Employees
    WHERE IdentifyMatch(fingerprint_template, <input>, 1) > 0;
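The idea of running the match inside a SQL query can be tried out with SQLite's user-defined functions. This is only an analogy under stated assumptions: `score` here is a hypothetical Python function registered through `sqlite3`, not the vendor-defined `Score()` / `IdentifyMatch()` operators from the slides, and no custom indextype is involved, so every row is still scanned.

```python
import sqlite3


def score(stored, probe):
    """Toy degree of match: fraction of equal bits (equal lengths assumed)."""
    differing = sum(bin(x ^ y).count("1") for x, y in zip(stored, probe))
    return 1.0 - differing / (8 * len(stored))


conn = sqlite3.connect(":memory:")
# Make the Python function callable from SQL as score(stored, probe).
conn.create_function("score", 2, score)

conn.execute(
    "CREATE TABLE employees ("
    "name TEXT, employee_id INTEGER, dept TEXT, fingerprint_template BLOB)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        ("Ana", 1, "R&D", bytes([0xAA] * 4)),
        ("Raj", 2, "Ops", bytes([0xF0] * 4)),
    ],
)

probe = bytes([0xAA, 0xAA, 0xAA, 0xAB])  # one bit away from Ana's template
rows = conn.execute(
    "SELECT name, score(fingerprint_template, :probe) "
    "FROM employees "
    "WHERE score(fingerprint_template, :probe) >= :threshold",
    {"probe": probe, "threshold": 0.9},
).fetchall()
print(rows)
```

Because the match is an ordinary SQL expression, it can sit next to relational predicates such as `dept = 'R&D'` in the same `WHERE` clause.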
Fingerprint Indexing
• A possible indexing approach involves
  – Classifying the fingerprints into types (Left Loop, Right Loop, Whorl, and Other)
• A query involves
  – Classifying the input fingerprint into one of these classes
  – Performing exact matches against fingerprints of that class
Basic Indexing Approach
• Build an auxiliary structure (table) that stores extracted portions of the template information along with the unique row identifiers of the base table
• Build native bitmap or B-tree indexes on the auxiliary structure
• A query on this table models the filter that returns a set of row identifiers for which the pair-wise match is performed
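The auxiliary-table approach can be sketched with SQLite, whose ordinary indexes are B-trees. Everything specific here is an assumption made for illustration: the extracted features (`pattern_class`, `ridge_count`), the tolerance window, the toy `similarity` score, and the sample rows.

```python
import sqlite3


def similarity(a, b):
    """Toy pair-wise match: fraction of equal bits (equal lengths assumed)."""
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return 1.0 - differing / (8 * len(a))


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY,
    name TEXT,
    fingerprint_template BLOB);
-- Auxiliary structure: extracted features plus the base-table identifier.
CREATE TABLE fp_features (
    employee_id INTEGER REFERENCES employees(employee_id),
    pattern_class TEXT,
    ridge_count INTEGER);
-- Native B-tree index on the auxiliary structure.
CREATE INDEX fp_features_idx ON fp_features (pattern_class, ridge_count);
""")

people = [
    (1, "Ana", bytes([0xAA] * 4), "whorl", 12),
    (2, "Raj", bytes([0xF0] * 4), "whorl", 25),
    (3, "Lee", bytes([0xAA] * 4), "left_loop", 12),
]
for emp_id, name, tpl, cls, ridges in people:
    conn.execute("INSERT INTO employees VALUES (?, ?, ?)", (emp_id, name, tpl))
    conn.execute("INSERT INTO fp_features VALUES (?, ?, ?)",
                 (emp_id, cls, ridges))


def identify(probe, probe_class, probe_ridges, tolerance=2, threshold=0.9):
    # Filter step: a relational query over the indexed feature table.
    candidates = conn.execute(
        "SELECT e.name, e.fingerprint_template "
        "FROM fp_features f JOIN employees e "
        "ON e.employee_id = f.employee_id "
        "WHERE f.pattern_class = ? AND f.ridge_count BETWEEN ? AND ?",
        (probe_class, probe_ridges - tolerance, probe_ridges + tolerance),
    ).fetchall()
    # Refine step: pair-wise match only on the surviving candidates.
    scored = [(name, similarity(tpl, probe)) for name, tpl in candidates]
    return len(candidates), [m for m in scored if m[1] >= threshold]


n_candidates, matches = identify(bytes([0xAA, 0xAA, 0xAA, 0xAB]), "whorl", 13)
print(n_candidates, matches)
```

Only one of the three rows reaches the pair-wise match. Note that Lee's template is identical to Ana's but sits in a different class, so a misclassified probe or enrollment would be filtered out before matching, the same risk as incorrect bin placement.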
Indexing Challenges
• It may not always be possible to develop filters that reduce the search space
• It might be difficult to beat an in-memory matching algorithm
Supporting Multi-Biometric Applications
• Why multi-modal biometrics?
  – Accuracy of a single biometric may be less than desired
  – If one of the traits is altered, the user can still be recognized based on other traits
Combining Scores in Multi-Biometrics
    CREATE TABLE Employees (
      id                   INTEGER,
      fingerprint_template RAW(1024),
      face_template        RAW(1024));

    -- Both modalities must match
    SELECT Score(1), Score(2)
    FROM Employees
    WHERE IdentifyMatch(fingerprint_template, <input-fp>, 1) > 0
      AND IdentifyMatch(face_template, <input-face>, 2) > 0;

    -- Either modality may match, provided the combined score exceeds 1
    SELECT Score(1), Score(2)
    FROM Employees
    WHERE (IdentifyMatch(fingerprint_template, <input-fp>, 1) > 0
           OR IdentifyMatch(face_template, <input-face>, 2) > 0)
      AND Score(1) + Score(2) > 1;
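The two combination rules expressed by these queries can be compared side by side in a short Python sketch. It assumes each modality yields a score in [0, 1] and that "matches" means the score exceeds a per-modality threshold; the 0.5 thresholds and the sample scores are made up for illustration.

```python
def accept_and(fp_score, face_score, fp_threshold=0.5, face_threshold=0.5):
    """First query: both the fingerprint and the face must match."""
    return fp_score > fp_threshold and face_score > face_threshold


def accept_or_sum(fp_score, face_score, fp_threshold=0.5,
                  face_threshold=0.5, sum_threshold=1.0):
    """Second query: either modality matches and the summed score is high."""
    either = fp_score > fp_threshold or face_score > face_threshold
    return either and (fp_score + face_score) > sum_threshold


# Hypothetical (fingerprint, face) score pairs.
candidates = {
    "strong_both": (0.9, 0.8),
    "strong_fp_only": (0.9, 0.3),
    "weak_face_only": (0.2, 0.6),
}
decisions = {
    name: (accept_and(*scores), accept_or_sum(*scores))
    for name, scores in candidates.items()
}
for name, (by_and, by_or_sum) in decisions.items():
    print(f"{name}: AND={by_and}, OR+sum={by_or_sum}")
```

The OR-plus-sum rule accepts a candidate with one strong and one weak trait, which is how a multi-modal system can still recognize a user whose other trait has been altered, while the sum condition rejects a candidate with only a marginal match on a single trait.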
Loose vs. Tight Integration
Loose
• Memory-based solution; can be fairly efficient and make use of pointers
• Memory bound
• Must custom-build features for large-scale handling
• Does not need to know about additional DBMS features
Tight
• Caching tables/indexes can help; however, this incurs buffer cache overhead
• Not memory bound
• Can exploit the features of the RDBMS, such as partitioning, parallelism, and security
• Requires understanding of DBMS functionality and extensibility
Loose vs. Tight Integration (cont.)
Loose
• Index structures can be pure memory-based
• Difficult to combine with relational predicates
• Difficult to support multimodal applications
Tight
• Coming up with index structures can be challenging
• Can combine with relational predicates
• Easily extends to handle multi-modal applications