CS 405 G Introduction to Database Systems Instructor

CS 405 G: Introduction to Database Systems Instructor: Jinze Liu

Review l The unit of disk read and write is l l Block (or called Page) The disk access time is composed by l l l Seek time Rotation time Data transfer time 10/30/2021 Jinze Liu @ University of Kentucky 2

Review l A row in a table, when located on disks, is called l l A record Two types of record: l l Fixed-length Variable-length 10/30/2021 Jinze Liu @ University of Kentucky 3

Review l In an abstract sense, a file is l l In reality, a file is l l A set of disk pages Each record lives on l l A set of “records” on a disk A page Physical Record ID (RID) l A tuple of <page#, slot#> 10/30/2021 Jinze Liu @ University of Kentucky 4

Today’s Topic l l l How to locate data in a file fast? Introduction to indexing Tree-based indexes l l ISAM: Indexed sequence access method B+-tree 10/30/2021 Jinze Liu @ University of Kentucky 5

Basics l Given a value, locate the record(s) with this value SELECT * FROM R WHERE A = value; SELECT * FROM R, S WHERE R. A = S. B; l Other search criteria, e. g. l l Range search SELECT * FROM R WHERE A > value; Keyword search database indexing 10/30/2021 Jinze Liu @ University of Kentucky Search 6

Dense and sparse indexes l Dense: one index entry for each search key value l Sparse: one index entry for each block l Records must be clustered according to the search key 10/30/2021 Jinze Liu @ University of Kentucky 7

Dense versus sparse indexes l Index size l l Requirement on records l l Records must be clustered for sparse index Lookup l l l Sparse index is smaller and may fit in memory Dense index can directly tell if a record exists Update l Easier for sparse index 10/30/2021 Jinze Liu @ University of Kentucky 8

Primary and secondary indexes l Primary index l l Secondary index l l Created for the primary key of a table Records are usually clustered according to the primary key Can be sparse Usually dense SQL l PRIMARY KEY declaration automatically creates a primary index, UNIQUE key automatically creates a secondary index l Additional secondary index can be created on non-key attribute(s) CREATE INDEX Student. GPAIndex ON Student(GPA); 10/30/2021 Jinze Liu @ University of Kentucky 9

Tree-Structured Indexes: Introduction l l Tree-structured indexing techniques support both range selections and equality selections. ISAM =Indexed Sequential Access Method l l static structure; early index technology. B+ tree: dynamic, adjusts gracefully under inserts and deletes. 10/30/2021 Jinze Liu @ University of Kentucky 10

Motivation for Index l l ``Find all students with gpa > 3. 0’’ l If data file is sorted, do binary search l Cost of binary search in a database can be quite high, Why? Simple idea: Create an `index’ file. index entry P 0 Page 1 K 1 P 1 Index File k. N k 1 k 2 K 2 Page 3 K m Pm Page N Data File Can do binary search on (smaller) index file! 10/30/2021 Jinze Liu @ University of Kentucky 11

ISAM l What if an index is still too big? l F Put a another (sparse) index on top of that! ISAM (Index Sequential Access Method), more or less Example: look up 197 100, 200, …, 901 Index blocks 100, 123, …, 192 200, … … 901, …, 996 100, 108, 123, 129, 192, 197, 200, 202, 901, 907, 996, 997, … … …… 119, 121 … … Data blocks 10/30/2021 Jinze Liu @ University of Kentucky 12

Updates with ISAM Example: insert 107 Example: delete 129 100, 200, …, 901 Index blocks 100, 123, …, 192 200, … … 901, …, 996 100, 108, 123, 129, 192, 197, 200, 202, 901, 907, 996, 997, … … …… 119, 121 … … Data blocks 107 Overflow block l Overflow chains and empty data blocks degrade performance l Worst case: most records go into one long chain 10/30/2021 Jinze Liu @ University of Kentucky 13

A Note of Caution l ISAM is an old-fashioned idea l l l B+-trees are usually better, as we’ll see But, ISAM is a good place to start to understand the idea of indexing Upshot l l 10/30/2021 Don’t brag about being an ISAM expert on your resume Do understand how they work, and tradeoffs with B+trees Jinze Liu @ University of Kentucky 14

B+-tree Max fan-out: 4 10/30/2021 120 130 101 110 30 35 3 5 11 30 120 150 180 100 l Jinze Liu @ University of Kentucky 180 200 l A hierarchy of intervals Balanced (more or less): good performance guarantee Disk-based: one node per block; large fan-out 150 156 179 l 15

Sample B+-tree nodes Non-leaf 120 150 180 to keys 100 · k Max fan-out: 4 Leaf 120 130 to keys 100 <=k < 120 <=k < 150 <= k < 180 to keys 180 <= k to next leaf node in sequence to records with these k values; or, store records directly in leaves 10/30/2021 Jinze Liu @ University of Kentucky 16

B+-tree balancing properties l l Height constraint: all leaves at the same lowest level Fan-out constraint: all nodes at least half full (except root) Max # pointers keys Non-leaf f f– 1 Root f f– 1 Leaf f f– 1 10/30/2021 Min # active pointers keys 2 Jinze Liu @ University of Kentucky 1 17

Lookups SELECT * FROM R WHERE k = 179; SELECT * FROM R WHERE k = 32; 30 120 150 180 100 Max fan-out: 4 10/30/2021 Jinze Liu @ University of Kentucky 180 200 156 179 120 130 101 110 30 35 3 5 11 Not found 18

Range query SELECT * FROM R WHERE k > 32 AND k < 179; 30 120 150 180 100 Max fan-out: 4 180 200 156 179 120 130 101 110 30 35 3 5 11 Look up 32… And follow next-leaf pointers 10/30/2021 Jinze Liu @ University of Kentucky 19

Insertion Insert a record with search key value 32 Max fan-out: 4 180 200 156 179 120 150 180 120 130 101 110 Look up where the inserted key should go… 30 32 35 3 5 11 30 100 l And insert it right there 10/30/2021 Jinze Liu @ University of Kentucky 20

Another insertion example Insert a record with search key value 152 Max fan-out: 4 180 200 152 156 179 120 130 120 150 180 100 101 110 l Oops, node is already full! 10/30/2021 Jinze Liu @ University of Kentucky 21

Node splitting 10/30/2021 Jinze Liu @ University of Kentucky 180 200 Yikes, this node is also already full! 156 179 150 152 120 130 101 110 120 156 180 100 Max fan-out: 4 22

More node splitting l 180 200 156 179 150 152 120 130 101 110 120 150 180 100 156 Max fan-out: 4 In the worst case, node splitting can “propagate” all the way up to the root of the tree (not illustrated here) l Splitting the root introduces a new root of fan-out 2 and causes the tree to grow “up” by one level Jinze Liu @ University of Kentucky 10/30/2021 23

Insertion l B+-tree Insert l Find correct leaf L. Put data entry onto L. l If L has enough space, done! l Else, must split L (into L and a new node L 2) l Distribute entries evenly, copy up middle key. l Insert index entry pointing to L 2 into parent of L. This can happen recursively Tree growth: gets wider and (sometimes) one level taller at top. l l l 10/30/2021 Jinze Liu @ University of Kentucky 24

Deletion Delete a record with search key value 130 180 200 120 130 If a sibling has more than enough keys, steal one! 150 156 179 120 150 180 100 Max fan-out: 4 Look up the key to be deleted… 100 101 110 l And delete it Oops, node is too empty! 10/30/2021 Jinze Liu @ University of Kentucky 25

Stealing from a sibling 10/30/2021 Jinze Liu @ University of Kentucky 180 200 156 179 120 150 101 110 Remember to fix the key in the least common ancestor 120 156 180 100 Max fan-out: 4 26

Another deletion example Delete a record with search key value 179 Max fan-out: 4 180 200 156 179 120 150 120 156 180 100 101 110 l Cannot steal from siblings Then coalesce (merge) with a sibling! 10/30/2021 Jinze Liu @ University of Kentucky 27

Coalescing l 156 180 200 120 150 101 110 Remember to delete the appropriate key from parent 120 156 180 100 Max fan-out: 4 Deletion can “propagate” all the way up to the root of the tree (not illustrated here) l When the root becomes empty, the tree “shrinks” by one level 10/30/2021 Jinze Liu @ University of Kentucky 28

Deletion l B+-tree Delete l Start at root, find leaf L where entry belongs. l Remove the entry. l If L is at least half-full, done! l If L has only d-1 entries, l Try to redistribute, borrowing from sibling (adjacent node with same parent as L). l If re-distribution fails, merge L and sibling. l If merge occurred, must delete entry (pointing to L or sibling) from parent of L. l Tree shrink: gets narrower and (sometimes) one level lower at top. 10/30/2021 Jinze Liu @ University of Kentucky 29

Example B+ Tree - Inserting 8* Root 17 13 5 2* 3* 7* 24 30 24 13 5* 5* 17 7* 8* 14* 16* 19* 20* 22* 30 33* 34* 38* 39* 24* 27* 29* 33* 34* 38* 39* Notice that root was split, leading to increase in height. In this example, we can avoid split by re-distributing entries; however, this is usually not done in practice. 10/30/2021 Jinze Liu @ University of Kentucky 30

Example Tree (including 8*) Delete 19* and 20*. . . Root 13 5 2* 2* 3* 3* 10/30/2021 5* 7* 17 17 24 30 24 13 5* 14* 16* 7* 8* 14* 16* 30 24* 27* 29* 19* 20* 22* 24* 27* 29* Jinze Liu @ University of Kentucky 33* 34* 38* 39* 31

Example Tree (including 8*) Delete 19* and 20*. . . Root 17 17 5 5 2* 2* 3* 3* l l 10/30/2021 27 24 13 13 5* 5* 7* 7* 8* 8* 14* 16* 22* 19* 24* 20* 22* 30 30 27* 24* 29* 27* 29* 38* 39* 33* 34* 38* 39* Deleting 19* is easy. Deleting 20* is done with re-distribution. Notice how middle key is copied up. Jinze Liu @ University of Kentucky 32

. . . And Then Deleting 24* l l Must merge. Observe `toss’ of index entry (key 27 on right), and `pull down’ of index entry (below). 30 22* 27* 29* 33* 34* 38* 39* Root 5 2* 3* 10/30/2021 5* 7* 8* 13 14* 16* 17 30 22* 27* 29* Jinze Liu @ University of Kentucky 33* 34* 38* 39* 33

Performance analysis l How many I/O’s are required for each operation? l l l h, the height of the tree (more or less) Plus one or two to manipulate actual records Plus O(h) for reorganization (should be very rare if f is large) Minus one if we cache the root in memory How big is h? l l Roughly logfan-out N, where N is the number of records B+-tree properties guarantee that fan-out is least f / 2 for all non-root nodes Fan-out is typically large (in hundreds)—many keys and pointers can fit into one block A 4 -level B+-tree is enough for typical tables 10/30/2021 Jinze Liu @ University of Kentucky 34

B+-tree in practice l Complex reorganization for deletion often is not implemented (e. g. , Oracle, Informix) l l Leave nodes less than half full and periodically reorganize Most commercial DBMS use B+-tree instead of hashingbased indexes because B+-tree handles range queries 10/30/2021 Jinze Liu @ University of Kentucky 35

The Halloween Problem l Story from the early days of System R… UPDATE Payroll SET salary = salary * 1. 1 WHERE salary >= 100000; l l l There is a B+-tree index on Payroll(salary) The update never stopped (why? ) Solutions? l l Scan index in reverse Before update, scan index to create a complete “to-do” list During update, maintain a “done” list Tag every row with transaction/statement id 10/30/2021 Jinze Liu @ University of Kentucky 36

B+-tree versus ISAM l l ISAM is more static; B+-tree is more dynamic ISAM is more compact (at least initially) l l Fewer levels and I/O’s than B+-tree Overtime, ISAM may not be balanced l Cannot provide guaranteed performance as B+-tree does 10/30/2021 Jinze Liu @ University of Kentucky 37

B+-tree versus B-tree l B-tree: why not store records (or record pointers) in nonleaf nodes? l l These records can be accessed with fewer I/O’s Problems? l l l Storing more data in a node decreases fan-out and increases h Records in leaves require more I/O’s to access Vast majority of the records live in leaves! 10/30/2021 Jinze Liu @ University of Kentucky 38

Beyond ISAM, B-, and B+-trees l l Other tree-based indexes: R-trees and variants, Gi. ST, etc. Hashing-based indexes: extensible hashing, linear hashing, etc. Text indexes: inverted-list index, suffix arrays, etc. Other tricks: bitmap index, bit-sliced index, etc. l How about indexing subgraph search? 10/30/2021 Jinze Liu @ University of Kentucky 39

Summary l Two types of queries l l l Key-search Range-query B+-tree operations l l Search Insert l l Delete l l Split child Redistribution B+-tree sorting l Next: disk-based sorting algorithms 10/30/2021 Jinze Liu @ University of Kentucky 40