CERN/IT/DB
Brief Introduction to Bitmap Indices for Scientific Data
Kurt Stockinger
CERN, IT-Division, Database Group, Geneva, Switzerland
Database Workshop, July 11-13, Geneva, Switzerland

Features of Bitmap Indices
• Multi-dimensional index data structure optimised for read-only data
• "Good" performance for multi-dimensional queries with low selectivity (few records result from the query)
• Applied in data warehouses and decision support systems (e.g. Oracle, Informix, Sybase)

Encoding Techniques for Discrete Attribute Values
[Figure panels: a) list of attributes, b) equality encoding, c) range encoding; attribute cardinality = 10]
• Range encoding is optimised for one-sided range queries, e.g. a0 <= 3
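
The difference between the two encodings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample column `values`, the helper names `equality_encode`, `range_encode` and `rows`, and the use of Python integers as bit vectors are all hypothetical and not taken from the slides. The range-encoded bitmap `v` is assumed to mark all records whose value is `<= v`, which is one common convention.

```python
def equality_encode(values, cardinality):
    """One bitmap per attribute value: bit i of bitmap v is set iff values[i] == v."""
    bitmaps = [0] * cardinality
    for row, v in enumerate(values):
        bitmaps[v] |= 1 << row
    return bitmaps


def range_encode(values, cardinality):
    """Bitmap v has bit i set iff values[i] <= v.

    The bitmap for the largest value would be all ones, so it is omitted
    and only cardinality - 1 bitmaps are stored.
    """
    bitmaps = [0] * (cardinality - 1)
    for row, v in enumerate(values):
        for b in range(v, cardinality - 1):
            bitmaps[b] |= 1 << row
    return bitmaps


def rows(bitmap):
    """Row numbers whose bit is set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical attribute column with cardinality 10 (values 0..9).
values = [3, 2, 1, 2, 8, 2, 9, 0, 7, 5]
eq = equality_encode(values, 10)
rg = range_encode(values, 10)

# One-sided range query "value <= 3":
le3_equality = eq[0] | eq[1] | eq[2] | eq[3]  # four bitmaps must be ORed
le3_range = rg[3]                             # a single bitmap is read

print(rows(le3_range))
```

With equality encoding the query touches one bitmap per qualifying value, while with range encoding a single bitmap answers it, which is the sense in which range encoding favours one-sided range queries.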

Pros and Cons of Bitmap Indices (BMI)
• Pros:
  - Easy to build and to maintain
  - Easy to identify records that satisfy a complex multi-attribute predicate (multi-dimensional ad-hoc queries)
  - Bit-wise operators (AND, OR, XOR, NOT) are supported very efficiently by hardware
  - Very space efficient for attributes with low cardinality (number of distinct attribute values, e.g. "Yes", "No")
• Cons:
  - Space inefficient for attributes with high cardinality
  - Commercial database systems only "efficiently" support bitmap indices for discrete attribute values
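
As a minimal sketch of why multi-attribute predicates are easy to evaluate, the following Python snippet combines per-value bitmaps with bit-wise operators. The two columns (`triggered`, `particle`), their contents, and the helpers `build_index` and `matching_rows` are made up for illustration; Python integers stand in for bit vectors.

```python
def build_index(column):
    """Map each distinct value to a bitmap with one bit per row."""
    index = {}
    for row, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def matching_rows(bitmap):
    """Row numbers whose bit is set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical low-cardinality attributes of six records.
triggered = ["Yes", "No", "Yes", "Yes", "No", "Yes"]
particle = ["e", "mu", "mu", "e", "e", "pi"]

trig_idx = build_index(triggered)
part_idx = build_index(particle)
all_rows = (1 << len(triggered)) - 1  # mask with one bit per record

# triggered == "Yes" AND (particle == "e" OR particle == "mu")
hits_or = trig_idx["Yes"] & (part_idx["e"] | part_idx["mu"])

# triggered == "Yes" AND NOT (particle == "e");
# NOT is masked so that only existing rows can be set.
hits_not = trig_idx["Yes"] & (all_rows & ~part_idx["e"])

print(matching_rows(hits_or), matching_rows(hits_not))
```

Each predicate costs a handful of word-level AND/OR/NOT operations regardless of how the attributes are combined, so no dedicated multi-column index per query shape is needed.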

Example: Bitmap Indices for HEP Data
[Figure labels: attribute indices (bit matrices), events (bit vectors), bins (bit slices)]

2-Sided Range Query
• E.g.: (pT > 25.7) && (pT < 91.8)
• Bin ranges: [0; 20) [20; 40) [40; 60) [60; 80) [80; 100) ...
[Figure steps: 1) Candidate slices, 2) Hit slices, 3) OR, 4) OR, 5) "Check"]
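
One plausible reading of these steps, sketched in Python: bins that lie completely inside the query range are hit slices, bins that are only partly covered are candidate slices, each group is ORed into one bitmap, and only the candidate rows are checked against the stored values. The sample `pt_values`, the function names and the use of Python integers as bit slices are assumptions for illustration, not the author's implementation.

```python
BIN_EDGES = [0, 20, 40, 60, 80, 100]  # bins [0;20) [20;40) ... [80;100)


def build_binned_index(values, edges):
    """One bit slice per bin: bit i is set iff values[i] falls into that bin."""
    slices = [0] * (len(edges) - 1)
    for row, v in enumerate(values):
        for b in range(len(edges) - 1):
            if edges[b] <= v < edges[b + 1]:
                slices[b] |= 1 << row
                break
    return slices


def range_query(values, slices, edges, low, high):
    """Return the rows with low < value < high."""
    hit_bits = 0        # OR of slices fully inside the range
    candidate_bits = 0  # OR of slices only partly inside the range
    for b, bits in enumerate(slices):
        lo_edge, hi_edge = edges[b], edges[b + 1]
        if hi_edge <= low or lo_edge >= high:
            continue  # bin lies entirely outside the query range
        if lo_edge > low and hi_edge <= high:
            hit_bits |= bits
        else:
            candidate_bits |= bits

    result = hit_bits
    # "Check": only candidate rows need a look at the stored value.
    for row in range(candidate_bits.bit_length()):
        if (candidate_bits >> row) & 1 and low < values[row] < high:
            result |= 1 << row
    return [i for i in range(result.bit_length()) if (result >> i) & 1]


# Hypothetical pT values, all within [0; 100).
pt_values = [12.0, 25.7, 30.1, 45.0, 79.9, 85.0, 91.8, 95.2, 60.0, 20.0]
pt_slices = build_binned_index(pt_values, BIN_EDGES)

selected = range_query(pt_values, pt_slices, BIN_EDGES, 25.7, 91.8)
print(selected)
```

For the query above, [40; 60) and [60; 80) are hit slices and need no further work, while [20; 40) and [80; 100) are candidate slices, so only the rows in those two edge bins are compared against the actual values.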