Clustered vs Unclustered Index CLUSTERED Index entries direct
- Slides: 9
Clustered vs. Unclustered Index CLUSTERED Index entries direct search for data entries Data entries UNCLUSTERED Data entries (Index File) (Data file) Data Records 1
B+ Tree Indexes Non-leaf Pages Leaf Pages (Sorted by search key) v Index leaf pages contain data entries, and are chained (prev & next) v Index non-leaf pages have index entries; only used to direct searches: index entry P 0 K 1 P 1 K 2 P 2 K m Pm 2
Example B+ Tree Note how data entries in leaf level are sorted Root 17 Entries < 17 5 2* 3* v v Entries >= 17 27 13 5* 7* 8* 14* 16* 22* 24* 30 27* 29* 33* 34* 38* 39* Find: 29*? 28*? All > 15* and < 30* Insert/delete: Find data entry in leaf, then change it. 3
Cost Model for Our Analysis v Notes: § § We ignore CPU costs, for simplicity. Measuring number of page I/Os ignores gains of prefetching a sequence of pages Thus even I/O cost is only approximated. Average-case analysis; based on simplistic assumptions. * Good enough to show overall trends! 4
Cost Model for Our Analysis Variables : § § B: The number of data pages R: Number of records per page 5
Comparing File Organizations v Heap files (random order; insert at eof) v Sorted files, sorted on <age, sal> v Clustered B+ tree file, search key <age, sal> v Heap file with unclustered B + tree index on search key <age, sal> v Heap file with unclustered hash index on search key <age, sal> 6
Operations to Compare v v v Scan: Fetch all records from disk Equality search Range selection Insert a record Delete a record 7
Assumptions in Our Analysis v Heap Files: § v Sorted Files: § v Equality selection on key; exactly one match. Files compacted after deletions. Indexes: § data entry size/pointers = 10% size of data record § Hash: No overflow buckets. • § Tree: 67% occupancy (this is typical). • v 80% page occupancy => “File size = 1. 25 data size” “Implies file size = 1. 5 data size” Scans: § Leaf levels of a tree-index are chained. § Index data-entries plus actual file scanned for unclustered indexes. v Range searches: § We use tree indexes to restrict set of data records fetched, but ignore hash indexes. 8
Cost of Operations * Several assumptions underlie these (rough) estimates! 9
- Clustered index và non clustered index
- Unclustered
- Unclustered b+ tree
- Clustered b+ tree index cost
- Why is folk culture clustered?
- A group of islands clustered together are called
- Tree based indexing in dbms
- Error bar chart spss
- Disadvantages of nucleated settlement pattern
- The basic systems and services of a city