Memory Management DBMS Application SETORIENTED Database Access Methods

Memory Management. DBMS Application . SET-ORIENTED Database Access Methods TUPLEORIENTED Main Online External Nearline External Buffer Manages BLOCK-ORIENTED Manages YV - B+ Trees and Physical Design Logging and Recovery File Manager Transaction Programs Tuple Management Associative Access Record Management Buffer Management File Management Archive Manager 327

Index Structures n n n A (single-level) index is an auxiliary file structure that makes it more efficient to search for a record in the file The index is usually specified on one field of the file The index is often called access path on the field An index file usually occupies much less space than the actual data file (the entries are fewer and smaller) A pointer to the block where a record is stored is found with binary search on the index FORM of a record in the index file: key value YV - B+ Trees and Physical Design pointer to record’s block 328

Indexes n Single-level Indexes can be distinguished in: – Primary Index: defined on a file which is ordered on the key field. It includes one entry for each file block. The entry has the key field value for the first record in the block. (also called, sparse index) – Clustering Index: defined on a file which is ordered on a non-key field. It includes one index entry for each distinct value of the field. The entry points to the first data block that contain records with that field value. – Secondary Index: defined on a file which is unordered. Contains one entry per each record in the data file (also called, dense index) YV - B+ Trees and Physical Design 329

Multi-Level Indexes n Index files are simply files, therefore they can also be indexed. n We thus end-up with a hierarchy of index structures (first level, second level, etc. ) n Every level of the index is an ordered file, therefore, deletion and insertion require much work (maintain the index) n A multi-level index forms a search tree and it is assumed that the top (first) level index fits in one disk block YV - B+ Trees and Physical Design 330

Indexed Sequential Access Method (ISAM) n n n ISAM is a multi-level index structure (tree) for direct access to tuples in a relation sorted on some attribute Each node of the tree is a page (block) on the disk The nodes of the tree hold <key value, pointer > pairs, sorted on key value. The internal nodes point to lower nodes in the tree (a lower level index), while the leaf nodes point to pages in the relation. A pointer points to a subtree with key values greater or equal the corresponding key value and less than the key value for the next pointer 9 YV - B+ Trees and Physical Design 18 55 60 78 90 331

ISAM Example n ISAM on Salary 10 20 30 10 40 40 50 60 Gary 10 Mike 15 LIsa 20 Shirley 25 Bob 30 Robin 35 Ivan Bill 40 45 Ron Jill 50 55 Scott 43 OVERFLOW Keith 60 Dan 65 YV - B+ Trees and Physical Design 332

Example ISAM Tree n Each node can hold 2 entries Root 40 10* 15* 20 33 20* 27* YV - B+ Trees and Physical Design 51 33* 37* 40* 46* 51* 63 55* 63* 97* 333

After Inserting 23*, 48*, 41*, 42*. . . Root 40 Index Pages 20 33 20* 27* 51 63 Primary Leaf 10* 15* 33* 37* 40* 46* 48* 41* Pages Overflow 23* 51* 55* 63* 97* Pages 42* YV - B+ Trees and Physical Design 334

. . . Then Deleting 42*, 51*, 97* Root 40 20 10* 15* 20* 23* 33 27* 51 33* 37* 40* 48* 46* 63 55* 63* 41* * Note that 51* appears in index levels, but not in leaf! YV - B+ Trees and Physical Design 335

ISAM Performance n PERFORMANCE. Suppose that we have D data pages and k pointers in each node (assume that D is a power of k, say D = k. L ) LOOKUP COST: Sequential Scan: D Binary Search on Relation: log 2 D + 1 Binary Search (single index): (log 2 (D/k) + 1 Traversing the ISAM tree: logk D + 1 = L + 1 YV - B+ Trees and Physical Design 336

ISAM Summary n ADVANTAGES: – It provides a sorted (mostly) directory for a file (relation) – It is very good for exact queries (e. g. , Salary = 40000) – ISAM facilitates the execution of range queries (e. g. , Salary between 35000 and 60000) n DISADVANTAGES – It is a static structure and may easily get unbalanced – If there are many updates (volatile data), then the relation does not remain “mostly” sorted – The index does consume some space which may be valuable YV - B+ Trees and Physical Design 337

B+ - Trees n n n n A B+-tree is a multilevel tree-structured directory to index a sorted file Leaf nodes contain tuples in sorted order, the other internal nodes have a special form Nodes correspond to a block (page) Each node is kept between half- and completely-full Insertions in nodes that are not full are very efficient; if a node is full, a split occurs Deletions are very efficient if the node does not become less than half-full The tree structure remains balanced at all times YV - B+ Trees and Physical Design 338

B+ - Trees -- Summary n Form of internal nodes: P 1 K 1 P 2 K 2. . . Pn-1 Kn-1 Pn K 1 <. . . Kn-1 P 1 points to a node containing key values n, n< K 1 P 2 points to a node containing key values n, K 1 < = n < K 1 n Variations of B+-trees: – B-trees : They are like B+-trees, but the internal nodes also contain pointers to data. They can get deeper and are difficult to implement, but occasionally, they are faster. – B*-trees : They are like B+-trees, but keep each node 2/3 full (at least). Shorter trees, faster, but worse for updates YV - B+ Trees and Physical Design 339

B+ Tree: The Most Widely Used Index n n n Insert/delete at log F N cost; keep tree height-balanced. (F = fanout, N = # leaf pages) Minimum 50% occupancy (except for root). Each node contains d <= m <= 2 d entries. The parameter d is called the order of the tree. Supports equality and range-searches efficiently. Index Entries (Direct search) Data Entries ("Sequence set") YV - B+ Trees and Physical Design 340

Example B+ Tree n n Search begins at root, and key comparisons direct it to a leaf (as in ISAM). Search for 5*, 15*, all data entries >= 24*. . . Root 13 2* 3* 5* 7* 14* 16* 17 24 19* 20* 22* 30 24* 27* 29* 33* 34* 38* 39* * Based on the search for 15*, we know it is not in the tree! YV - B+ Trees and Physical Design 341

B+ Trees in Practice n Typical order: 100. Typical fill-factor: 67%. – average fanout = 133 n Typical capacities: – Height 4: 1334 = 312, 900, 700 records – Height 3: 1333 = 2, 352, 637 records n Can often hold top levels in buffer pool: – Level 1 = 1 page = 8 Kbytes – Level 2 = 133 pages = 1 Mbyte – Level 3 = 17, 689 pages = 133 MBytes YV - B+ Trees and Physical Design 342

Inserting a Data Entry into a B+ Tree n n Find correct leaf L. Put data entry onto L. – If L has enough space, done! – Else, must split L (into L and a new node L 2) » Redistribute entries evenly, copy up middle key. » Insert index entry pointing to L 2 into parent of L. n This can happen recursively – To split index node, redistribute entries evenly, but push up middle key. (Contrast with leaf splits. ) n Splits “grow” tree; root split increases height. – Tree growth: gets wider or one level taller at top. YV - B+ Trees and Physical Design 343

Inserting 8* into Example B+ Tree n Observe how minimum occupancy is guaranteed in both leaf and index pg splits. Note difference between copy-up and push-up; be sure you understand the reasons for this. YV - B+ Trees and Physical Design Entry to be inserted in parent node. (Note that 5 iss copied up and continues to appear in the leaf. ) 5 2* 3* 5* 13 8* Entry to be inserted in parent node. (Note that 17 is pushed up and only appears once in the index. Contrast this with a leaf split. ) 17 5 7* 24 30 344

Example B+ Tree After Inserting 8* Root 17 5 2* 3* 24 13 5* 7* 8* 14* 16* 19* 20* 22* 30 24* 27* 29* 33* 34* 38* 39* v Notice that root was split, leading to increase in height. v In this example, we can avoid split by re-distributing entries; however, this is usually not done in practice. YV - B+ Trees and Physical Design 345

Deleting a Data Entry from a B+ Tree n n Start at root, find leaf L where entry belongs. Remove the entry. – If L is at least half-full, done! – If L has only d-1 entries, » Try to re-distribute, borrowing from sibling (adjacent node with same parent as L). » If re-distribution fails, merge L and sibling. n n If merge occurred, must delete entry (pointing to L or sibling) from parent of L. Merge could propagate to root, decreasing height. YV - B+ Trees and Physical Design 346

Example Tree After (Inserting 8*, Then) Deleting 19* and 20*. . . Root 17 5 2* 3* n n 27 13 5* 7* 8* 14* 16* 22* 24* 30 27* 29* 33* 34* 38* 39* Deleting 19* is easy. Deleting 20* is done with re-distribution. Notice how middle key is copied up. YV - B+ Trees and Physical Design 347

. . . And Then Deleting 24* n n Must merge. Observe `toss’ of index entry (on right), and `pull down’ of index entry (below). 30 22* 27* 29* 33* 34* 38* 39* Root 5 2* 3* 5* 7* YV - B+ Trees and Physical Design 8* 13 14* 16* 17 30 22* 27* 29* 33* 34* 38* 39* 348

Example of Non-leaf Re-distribution n n Tree is shown below during deletion of 24*. (What could be a possible initial tree? ) In contrast to previous example, can re-distribute entry from left child of root to right child. Root 22 5 2* 3* 5* 7* 8* YV - B+ Trees and Physical Design 13 14* 16* 17 30 20 17* 18* 20* 21* 22* 27* 29* 33* 34* 38* 39* 349

After Re-distribution n n Intuitively, entries are re-distributed by `pushing through’ the splitting entry in the parent node. It suffices to re-distribute index entry with key 20; we’ve redistributed 17 as well for illustration. Root 17 5 2* 3* 5* 7* 8* YV - B+ Trees and Physical Design 13 14* 16* 20 17* 18* 20* 21* 22 30 22* 27* 29* 33* 34* 38* 39* 350

Prefix Key Compression n n Important to increase fan-out. (Why? ) Key values in index entries only `direct traffic’; can often compress them. – E. g. , If we have adjacent index entries with search key values Dannon Yogurt, David Smith and Devarakonda Murthy, we can abbreviate David Smith to Dav. (The other keys can be compressed too. . . ) » Is this correct? Not quite! What if there is a data entry Davey Jones? (Can only compress David Smith to Davi) » In general, while compressing, must leave each index entry greater than every key value (in any subtree) to its left. n Insert/delete must be suitably modified. YV - B+ Trees and Physical Design 351

Bulk Loading of a B+ Tree n n n If we have a large collection of records, and we want to create a B+ tree on some field, doing so by repeatedly inserting records is very slow. Bulk Loading can be done much more efficiently. Initialization: Sort all data entries, insert pointer to first (leaf) page in a new (root) page. Root 3* 4* Sorted pages of data entries; not yet in B+ tree 6* 9* YV - B+ Trees and Physical Design 10* 11* 12* 13* 20* 22* 23* 31* 35* 36* 38* 41* 44* 352

Summary of Bulk Loading n Option 1: multiple inserts. – Slow. – Does not give sequential storage of leaves. n Option 2: Bulk Loading – – Has advantages for concurrency control. Fewer I/Os during build. Leaves will be stored sequentially (and linked, of course). Can control “fill factor” on pages. YV - B+ Trees and Physical Design 353

A Note on `Order’ n Order (d) concept replaced by physical space criterion in practice (`at least half-full’). – Index pages can typically hold many more entries than leaf pages. – Variable sized records and search keys mean differnt nodes will contain different numbers of entries. – Even with fixed length fields, multiple records with the same search key value (duplicates) can lead to variable-sized data entries (if we use Alternative (3)). YV - B+ Trees and Physical Design 354

Summary n n Tree-structured indexes are ideal for range-searches, also good for equality searches. ISAM is a static structure. – Only leaf pages modified; overflow pages needed. – Overflow chains can degrade performance unless size of data set and data distribution stay constant. n B+ tree is a dynamic structure. – Inserts/deletes leave tree height-balanced; log F N cost. – High fanout (F) means depth rarely more than 3 or 4. – Almost always better than maintaining a sorted file. YV - B+ Trees and Physical Design 355

Summary (Contd. ) – Typically, 67% occupancy on average. – Usually preferable to ISAM, modulo locking considerations; adjusts to growth gracefully. – If data entries are data records, splits can change rids! n n n Key compression increases fanout, reduces height. Bulk loading can be much faster than repeated inserts for creating a B+ tree on a large data set. Most widely used index in database management systems because of its versatility. One of the most optimized components of a DBMS. YV - B+ Trees and Physical Design 356

Overview of Physical Design n After ER design, schema refinement, and the definition of views, we have the conceptual and external schemas for our database. The next step is to choose indexes, make clustering decisions, and to refine the conceptual and external schemas (if necessary) to meet performance goals. We must begin by understanding the workload: – The most important queries and how often they arise. – The most important updates and how often they arise. – The desired performance for these queries and updates. YV - B+ Trees and Physical Design 357

Understanding the Workload n For each query in the workload: – Which relations does it access? – Which attributes are retrieved? – Which attributes are involved in selection/join conditions? How selective are these conditions likely to be? n For each update in the workload: – Which attributes are involved in selection/join conditions? How selective are these conditions likely to be? – The type of update (INSERT/DELETE/UPDATE), and the attributes that are affected. YV - B+ Trees and Physical Design 358

Decisions to Make n What indexes should we create? – Which relations should have indexes? What field(s) should be the search key? Should we build several indexes? n For each index, what kind of an index should it be? – Clustered? Hash/tree? Dynamic/static? Dense/sparse? n Should we make changes to the conceptual schema? – Consider alternative normalized schemas? (Remember, there are many choices in decomposing into BCNF, etc. ) – Should we ``undo’’ some decomposition steps and settle for a lower normal form? (Denormalization. ) – Horizontal partitioning, replication, views. . . YV - B+ Trees and Physical Design 359

Choice of Indexes n n One approach: consider the most important queries in turn. Consider the best plan using the current indexes, and see if a better plan is possible with an additional index. If so, create it. Before creating an index, must also consider the impact on updates in the workload! – Trade-off: indexes can make queries go faster, updates slower. Require disk space, too. YV - B+ Trees and Physical Design 360

Issues to Consider in Index Selection n Attributes mentioned in a WHERE clause are candidates for index search keys. – Exact match condition suggests hash index. – Range query suggests tree index. » Clustering is especially useful for range queries, although it can help on equality queries as well in the presence of duplicates. n Try to choose indexes that benefit as many queries as possible. Since only one index can be clustered per relation, choose it based on important queries that would benefit the most from clustering. YV - B+ Trees and Physical Design 361

Issues in Index Selection (Contd. ) n Multi-attribute search keys should be considered when a WHERE clause contains several conditions. – If range selections are involved, order of attributes should be carefully chosen to match the range ordering. – Such indexes can sometimes enable index-only strategies for important queries. » For index-only strategies, clustering is not important! n When considering a join condition: – Hash index on inner is very good for Index Nested Loops. » Should be clustered if join column is not key for inner, and inner tuples need to be retrieved. – Clustered B+ tree on join column(s) good for Sort-Merge. YV - B+ Trees and Physical Design 362

Example n Hash index on D. dname supports ‘Toy’ selection. – Given this, index on D. dno is not needed. n n Hash index on E. dno allows us to get matching (inner) Emp tuples for each selected (outer) Dept tuple. What if WHERE included: ``. . . AND E. age=25’’ ? – Could retrieve Emp tuples using index on E. age, then join with Dept tuples satisfying dname selection. Comparable to strategy that used E. dno index. – So, if E. age index is already created, this query provides much less motivation for adding an E. dno index. SELECT E. ename, D. mgr FROM Emp E, Dept D WHERE D. dname=‘Toy’ AND YV - B+ Trees and Physical Design E. dno=D. dno 363

Examples of Clustering B+ tree index on E. age can be used to get qualifying tuples. – How selective is the condition? – Is the index clustered? Consider the GROUP BY query. – If many tuples have E. age > 10, using E. age index and sorting the retrieved tuples may be costly. – Clustered E. dno index may be better! Equality queries and duplicates: – Clustering on E. hobby helps! YV - B+ Trees and Physical Design SELECT E. dno FROM Emp E WHERE E. age>40 SELECT E. dno, COUNT FROM Emp E WHERE E. age>10 GROUP BY E. dno (*) SELECT E. dno FROM Emp E WHERE E. hobby=Stamps 364

Multi-Attribute Index Keys n To retrieve Emp records with age=30 AND sal=4000, an index on <age, sal> would be better than an index on age or an index on sal. – Such indexes also called composite or concatenated indexes. – Choice of index key orthogonal to clustering etc. n If condition is: 20<age<30 AND 3000<sal<5000: – Clustered tree index on <age, sal> or <sal, age> is best. n If condition is: age=30 AND 3000<sal<5000: – Clustered <age, sal> index much better than <sal, age> index! n Composite indexes are larger, updated more often. YV - B+ Trees and Physical Design 365

Summary n Database design consists of several tasks: requirements analysis, conceptual design, schema refinement, physical design and tuning. – In general, have to go back and forth between these tasks to refine a database design, and decisions in one task can influence the choices in another task. n Understanding the nature of the workload for the application, and the performance goals, is essential to developing a good design. – What are the important queries and updates? What attributes/relations are involved? YV - B+ Trees and Physical Design 366

Summary (Contd. ) n Indexes must be chosen to speed up important queries (and perhaps some updates!). – – Index maintenance overhead on updates to key fields. Choose indexes that can help many queries, if possible. Build indexes to support index-only strategies. Clustering is an important decision; only one index on a given relation can be clustered! – Order of fields in composite index key can be important. n n Static indexes may have to be periodically re-built. Statistics have to be periodically updated. YV - B+ Trees and Physical Design 367

Relation Implementations n (a) STORE EACH RELATION IN A FILE – For small relations, a heap may suffice – For larger relations, ISAM, B-tree, or Hashing – Allow for secondary indexes on user-specified fields Example Commands: modify R to isam on A 1 index on R is S(A 1) n (b) STORE EACH RELATION AS IN DBTG – Store related tuples from different relations together – Use multilist structures YV - B+ Trees and Physical Design 368