Database System Architecture and Implementation Query Evaluation Overview
Database System Architecture and Implementation Query Evaluation Overview 1
Orientation Web Forms Applications SQL Interface SQL Commands DBMS re! e h e e ar Executor Parser Operator Evaluator Optimizer W Transaction Manager Lock Manager Files and Index Structures Buffer Manager Recovery Manager Disk Space Manager Index and Data Files Catalog Database 2 Figure Credit: Raghu Ramakrishnan and Johannes Gehrke: “Database Management Systems”, Mc. Graw-Hill, 2003.
Outline • System Catalog • Join/Count as running example • Selection – Scan, index – When can we use an index? – One slide on cost • • • Selectivity Project – one slide – sorting/hashing Join – one slide summary Set operations – one slide summary Aggregates – a few slides 3
System Catalog • System catalog stores system-wide information – size of buffer pool, the page size, etc. – information about tables, views, and indexes. Information stored in the system catalog • Table metadata − table name, file name (or some identifier), file structure (e. g. , heap file) − attribute name and type of each attribute of the table − index name of each index on the table − integrity constrains (e. g. , primary and foreign key constraints) on the table • Index metadata − index name and structure (e. g. , B+ tree) − search key attributes • View metadata − view name and definition Slides Credit: Michael Grossniklaus – Uni-Konstanz 4
System Catalog • Additionally, the system catalog stores – statistics about tables and indexes – statistics are updated periodically, not every time tables are modified Statistics • Table statistics − cardinality: number of tuples NTuples(R) for each table R − size: number of pages NPages(R) for each table R • Index statistics − cardinality: number of distinct key values NKeys(I) for each index I − size: number of pages INPages(I) for each index (for a tree index I, INPages(I) denotes the number of leaf pages) − height: number of non-leaf levels for each tree index I − range: minimum present key value ILow(I) and the maximum present key value IHigh(I) for each index I Slides Credit: Michael Grossniklaus – Uni-Konstanz 5
System Catalog • DBMS stores system catalog itself as a collection of tables – existing techniques for implementing and managing tables are reused – same query language can be used for catalog tables as for other tables 6
Running Example A simple schema CREATE TABLE Sailors ( sid INTEGER, sname STRING, rating INTEGER, age REAL, PRIMARY KEY (sid) ) CREATE Table Reserves ( sid INTEGER, bid INTEGER, day DATE, rname STRING, PRIMARY KEY (sid, bid, day), FOREIGN KEY sid REFERENCES Sailors(sid) ) • Assumptions – relation Sailors: 50 bytes/tuple, 80 tuples/page, and 500 pages – relation Reserves: 40 bytes/tuple, 100 tuples/page, and 1000 pages Ä 4 k. B page size, 2 MB Sailors data, and 4 MB Reserves data Slides Credit: Michael Grossniklaus – Uni-Konstanz 7
System Catalog Example Tables name Tables Attributes Views Indexes Sailors Reserves Views name Captains Indexes name Boats file #tuples size Attributes … 6 1 … 23 1 … 1 1 … 0 1 … 40, 000 500 … 100, 000 1000 text SELECT * FROM Sailors WHERE… file type #keys size … B+Tree 100 1 name file #tuples size name table type pos ⁞ sid sname rating age sid bid day rname Slides Credit: Michael Grossniklaus – Uni-Konstanz table Tables Attributes ⁞ Sailors Reserves type pos string 1 string 2 integer 3 integer 4 string 1 string 2 string 3 integer 4 ⁞ ⁞ integer 1 string 2 integer 3 real 4 integer 1 integer 2 date 3 string 4 8
Relational Operator Evaluation • Each operator (select, join, group by) has several implementations – for most operators no implementation has universally superior performance – performance depends on sizes of involved tables, existing indexes and sort orders • Alternative implementations: – Select: Scan table, use clustered index, use unclustered index – Join: Hash Join, Index Nested Loops Join, Sort-Merge Join – Group By (a. k. a. Aggregate): Usually done with hashing, can be done with sorting also 9
Running Example A simple schema CREATE TABLE Students( sid INTEGER, sname STRING, age INTEGER, PRIMARY KEY (sid) ) CREATE Table S 14 Classes( cid INTEGER, cname STRING, profname STRING, dept STRING, PRIMARY KEY (cid) ) CREATE TABLE Is. Taking ( sid INTEGER, cid INTEGER, FOREIGN KEY sid REFERENCES Students, FOREIGN KEY cid REFERENCES Classes) 10
Running Example Query #1 SELECT S. sname, C. cid FROM Students S, Is. Taking I WHERE S. sid = I. sid AND S. age < 50 AND C. cid in (487, 587) JOIN cid = cid Scan, Index Select? Is index clustered? SELECT age < 50 Students Hash Join? Sort. Merge Join? Index Nested Loops? SELECT cid in (487, 587) Is. Taking Scan, Index Select? 11
Running Example Query #1 SELECT cid, count(*) FROM Is. Taking I GROUP BY cid COUNT GB cid SCAN Is. Taking Almost always hashing… Could be sorting… Scan 12
What we will learn… • Implementation options for select, join, group by/aggregate operators – Implementation algorithms (hash join, index nested-loop join, …) • Query evaluation architecture – Internal operator implementation – How operators communicate • How to pick which implementations to use in a query (a. k. a. Optimization) – Costs of accessing data (index, scan, etc. ) and costs of different implementation algorithms 13
Select Operation • Option 1: Scan the table • Always works, but is expensive • Option 2: Use an index • Works if the index “matches” the predicate / selection condition / where clause • Often cheaper than scanning a table, but not always JOIN sid = sid SELECT age < 50 Students SELECT cid in (487, 587) Is. Taking 14
When can we use an index? • Notion of “index matches a predicate” • Basically mean when can an index be used to evaluate predicates in the query 11/26/2020 Slides Credit: Jignesh Patel – University of Wisconsin 15
Index Matching • B+-tree index on <a, b, c> 1. a=5 and b= 3? 2. a > 5 and b < 3 3. b=3 4. a=7 and b=5 and c=4 and d>4 5. a=7 and c=5 n 1. yes 2. yes 3. no! 4. yes (primary conjunct) 5. yes Hash Idx 1. 2. 3. 4. no! no! yes 5. no Index matches (part of) a predicate 1. Conjunction of terms involving only attributes (no disjuctions) 2. Hash: only equality operation, predicate has all index attributes. 3. Tree: Attributes are a prefix of the search key, any ops. 11/26/2020 Slides Credit: Jignesh Patel – University of Wisconsin 16
Index Usage Examples ! B+ tree index – single value index and multi-value index Assume a B+ tree index I on the search key� cid�on table S 14 Classes • cid = 587 AND dept = ‘CS’ • dept = ‘Joe’ OR cid = 587 Assume a B+ tree index I on the search key � dept, cid�on table S 14 Classes • dept = ‘CS’ AND cid = 587 • dept = ‘CS’ • cid = 587 Slides Credit: Michael Grossniklaus – Uni-Konstanz 17
Index Usage – Additional Predicates Example: additional predicates Index Matches Exactly: Assume an B+ tree index on the search key � cid�on table S 14 Classes Given the selection condition: cid = 3 1. Scan the index to retrieve the tuples that satisfy the condition cid = 3 Additional Predicate Required: Given the selection condition: cid = 5 AND dept = ‘CS’ 1. Scan the index to retrieve the tuples that satisfy the condition sid = 5 2. The additional predicate on dept must then be applied to each retrieved tuple. 18
Index Usage – Partial Match Example: two (partially) matching indexes Assume an B+tree index on the search key� cid� and a B+ tree index on dept Selection Condition: dept = ‘CS’ AND cid > 300 Question: Which index would you use? 19
Index Usage – Partial Match Example: two (partially) matching indexes Assume an B+tree index on the search key� cid� and a B+ tree index on dept Selection Condition: dept = ‘CS’ AND cid > 300 Option 1: Pick an index, the predicate in the selection condition that are not matched by the index must be checked for each retrieved tuple • Use selectivity to pick which index to use Option 2: Use both indices, intersect RID sets • Sort RIDs, retrieve RIDs in both sets • Side-effect: tuples retrieved in the order on disk 20
Select Operation - Cost • Option 1: Scan the table • Disk I/O cost O(N) (N is # pages in relation) • Option 2: Use an index • Disk I/O Cost O(log. FN) + X = Index Height + X • X = # selected tuples / # tuples per page (clustered index) • X = # selected tuples (unclustered index) Note: really I only think about X – Index has low height and may be in memory… 21
Selectivity Definition: selectivity of an predicate • the selectivity of a predicate is the percent of tuples it retrieves • the most selective predicate is the one that retrieves the fewest tuples • Clustered Index Cost = • (selectivity) * # tuples / # tuples per page • Unclustered Index Cost = • (selectivity) * # tuples SELECT age < 50 SELECT cid in (487, 587) 22
Selectivity – Assumptions • the fraction of tuples in the table that satisfy a given conjunct is called the reduction factor. • if there are several primary conjuncts, the fraction that satisfies all of them can be approximated by the product of their reduction factors. • this approach assumes that conjuncts are independent • the conjuncts may not be independent, but this approximation is widely used in practice Selection Condition: cid between 580 and 590 AND cname like ‘%Database%’ • systems also assume uniform distribution for range queries (index can give you max/min) 23
Access Path Selectivity ! Exercise: partially matching indexes How would you estimate the fraction of tuples that satisfy a selection condition, if no index matches the entire condition? Slides Credit: Michael Grossniklaus – Uni-Konstanz 24
Access Path Selectivity ! Exercise: partially matching indexes How would you estimate the fraction of tuples that satisfy a selection condition, if no index matches the entire condition? Ä use catalog information on partially matching indexes and approximate fraction of satisfying tuples by multiplying these reduction factors Ä assume a default value for the reduction factor, typically 1/10 Ä if available use statistics (e. g. , histograms) about distribution of attribute values Slides Credit: Michael Grossniklaus – Uni-Konstanz 25
When to use a B+tree index • Consider – A relation with 1 M tuples – 100 tuples on a page – 500 (key, rid) pairs on a page # data pages = 1 M/100 = 10 K pages # leaf idx pgs = 1 M / (500 * 0. 67) ~ 3 K pages 1% Selection 10% Selection 30 + 100 300 + 1000 Non-Clustered 30 + 10, 000 300 + 100, 000 NC + Sort Rids 30 + (~ 10, 000) 300 + (~ 10, 000) Clustered ð Choice of Index access plan, consider: 1. Index Selectivity 2. Clustering ð 11/26/2020 Similar consideration for hash based indices Slide Credit: Jignesh Patel, University of Wisconsin 26
Relational Operator Evaluation • Focus on σ, π, and ⋈ operator as used in so-called SPJ queries – this week: brief overview of evaluation algorithms for these operators – Next week: in-depth look at sorting and operator evaluation • Cost analysis – as before, only I/O costs are considered – I/O costs are measured in terms of number of page I/O operations – examples, rather than rigorous cost formulas Slides Credit: Michael Grossniklaus – Uni-Konstanz 27
Select Operator Implementation • Select Implementation (i. e. WHERE cid = 587) – Scan: if there is no index on cid, table has to be scanned – Index: if one or more indexes match the predicate, an index can be used • Clustered or un-clustered – Index-only plans A rule of thumb If over 10% of the tuples of a table are to be retrieved, it is likely to be cheaper to simply scan the entire table (instead of using an unclustered index) Slides Credit: Michael Grossniklaus – Uni-Konstanz 28
Projection Algorithm • Used to project the selected attributes. Simple case: Example SELECT R. a, R. d. – Algorithm: for each tuple, only output R. a, R. d Harder case: DISTINCT clause • Example: SELECT DISTINCT R. a, R. d – Remove attributes and eliminate duplicates • Algorithms – Sorting: Sort on all the projection attributes • Pass 0: eliminate unwanted fields. Tuples in the sorted-runs may be smaller • Eliminate duplicates in the merge pass & in-memory sort – Hashing: Two phases • Partitioning • Duplicate elimination 11/26/2020 Slides Credit: Jignesh Patel – University of Wisconsin 29
Join • Joins are both common and expensive, they have been widely studied • DBMS typically supports several algorithms to compute joins – Index Nested Loops Join – Hash Join – Sort-Merge Join JOIN sid = sid SELECT age < 50 Students SELECT cid in (487, 587) Is. Taking 30
Other Operators • In addition to the basic relational operators (σ, π, and ⋈), SQL queries can also contain – group by / aggregation – set operations: union, difference, and intersection • Set operations – Sorting or hashing, duplicate elimination is expensive (Union) • Group by / Aggregate – typically implemented through hashing (possibly sorting) Slides Credit: Michael Grossniklaus – Uni-Konstanz 31
Set Operations • ∩ and Ⅹ special cases of join • ∪ and − similar; we’ll do ∪. – Duplicate elimination • Sorting: – – – Sort both relations (on all attributes). Merge sorted relations eliminating duplicates. Alternative: Merge sorted runs from both relations. • Hashing: – – – Partition R and S Build hash table for Ri. Probe with tuples in Si, add to table if not a duplicate 11/26/2020 Slides Credit: Jignesh Patel – University of Wisconsin 32
Aggregates - Example Query #1 SELECT cid, count(*) FROM Is. Taking I GROUP BY cid WHERE cid > 500 COUNT GB cid SELECT cid > 500 Is. Taking 33
Aggregates • Sorting – Sort on group by attributes (if any) – Scan sorted tuples, computing running aggregate • Max: Max • Average: Sum, Count – If the group by attribute changes, output aggregate result • Cost: sorting cost COUNT GB cid SELECT cid > 500 Is. Taking 11/26/2020 Slides Credit: Jignesh Patel – University of Wisconsin 34
Aggregates • Hashing – Hash on group by attributes (if any) • Hash entry: group attributes + running aggregate – Scan tuples, probe hash table, update hash entry – Scan hash table, and output each hash entry • Cost: Scan relation! • What if we have a large # groups? COUNT GB cid SELECT cid > 500 Is. Taking 11/26/2020 Slides Credit: Jignesh Patel – University of Wisconsin 35
Aggregates • Index – Without Grouping • Can use B+tree on aggregate attribute(s) • Where clause? – With grouping • B+tree on all attributes in SELECT, WHERE and GROUP BY clauses – Index-only scan – If group-by attributes prefix of search key => data entries/tuples retrieved in group-by order – Else => get data entries and then use a sort or hash aggregate algorithm 11/26/2020 Slides Credit: Jignesh Patel – University of Wisconsin 36
- Slides: 36