Indices What is an index Index an access

  • Slides: 8
Download presentation
Indices

Indices

What is an index? Index - an access structure that can provide a secondary

What is an index? Index - an access structure that can provide a secondary access path to the stored records They are used to speed queries by allowing records to be found quickly according to indexing fields. Example: ◦ ◦ ◦ CREATE TABLE students (first_name, last_name, grade); INSERT INTO. . . # Thousands of records SELECT * WHERE first_name == "Josh"; # Find records with first_name of "Josh" SELECT * WHERE first_name >= "J" AND first_name < "K"; # Find records in a range SELECT * ORDER BY grade; # Get the records in a sorted order Without indices, all of the SELECT operations require examining every record in the table. Making appropriate indices, can allow for huge speedups for many common queries.

Sorted and Unsorted Indices map search key fields to records. If an index is

Sorted and Unsorted Indices map search key fields to records. If an index is unsorted, you can use that index to find records matching the key by equality. (Later we'll be discussing using Hash Tables as unsorted indices. ) Sorted indices are more useful in that they can return records matching to key ranges and be used for ordering (sorting). http: //www. deviantart. com/art/THE-SORTING-HAT-43892091

Primary Index The primary index is the index that specifies the order of the

Primary Index The primary index is the index that specifies the order of the stored data. It uses the Primary Key as its ordering field, and thus requires a unique value for that field. For SQLite, if not specified, the Primary Key (PK) is "rowid", a hidden attribute that monotonically increases for each insertion. "rowid" represents insertion order. However, the Primary Key can be any field (or fields) that are unique for each record in the table. Every table must have a Primary Key (even if it is the implicit "rowid"). The primary index is just an index that has the Primary Key as the index's key. https: //www. sqlite. org/queryplanner. html

Dense Versus Sparse Indices Dense indices have an entry for every search key value

Dense Versus Sparse Indices Dense indices have an entry for every search key value (and hence every record) in the table. Sparse indices don't have an entry for every search key value. Example: An index to the first word of each starting letter in a dictionary. We'll be primarily concerned with dense indices for this course. Censored for your enjoyment. http: //memegenerator. net/instance/65130815

Secondary Index A secondary index provides a secondary means of accessing the data for

Secondary Index A secondary index provides a secondary means of accessing the data for which some primary access method already exists. They must be dense, as the order of the data isn't determined by the secondary index. students(rowid is PK) index using grade as ordering key rowid first_name last_name grade rowid 1 Harry Potter 0. 0 1 2 Hermione Granger 4. 0 1. 5 3 3 Ron Weasley 1. 5 2. 0 4 4 James Potter 2. 0 2. 7 6 5 Lily Potter 3. 8 5 6 Dolores Umbridge 2. 7 4. 0 2 It is easy to find the top 3 students if you have an index on grade. But what if you want students that have grades between 2. 0 and 3. 0?

Multilevel Indices Single level indices (like the example before) have some performance problems. The

Multilevel Indices Single level indices (like the example before) have some performance problems. The index can be large (too large to fit into memory) leading to expensive IO hits. If you are looking for a non-extreme value (not a maximum or minimum), you need to do binary search to find the value. Multi-level index (trees) provide a superior alternative. http: //www. tutorialspoint. com/dbms_indexing. htm

Binary Trees As Indices You can partly solve these problems by using binary search

Binary Trees As Indices You can partly solve these problems by using binary search trees. Problems: ◦ Binary trees have many internal nodes, meaning that you have to read log 2(N) nodes to find the key you want. ◦ Also, the next highest key (useful for key ranges) isn't stored in memory near the current key, meaning more IO operations. ◦ Binary trees are either unbalanced (meaning they can become very deep) or require expensive restructuring to assure it remains balanced. B+ Trees are a data structure tailored to allow for efficient lookup and lookup ranges without the problems associated with binary trees.