Query Processing and Optimization Block Accesses John A
Query Processing and Optimization (Block Accesses) John A. Miller
Creating the SQL Statement • Partial Bank Schema: • customer (cname, street, ccity) • deposit (accno, balance, cname, bname) • English: List the customers (names and cities) who have a deposit account at the Alps branch • select c. cname, c. ccity • from customer c, deposit d • where d. bname = ‘Alps’ and c. cname = d. cname
Steps in Query Processing 1. Write as a Relational Algebra Expression Tree • Direct Translation 2. Convert Cartesian Product to Join 3. Move Select Operations Down the Tree • Selectivity = 1 / V(A, r) where V(A, r) is number of distinct values 4. Apply Indexes to Speed-Up Joins and Selects
Performance Analysis • Cost Metrics: • Sum of all output sizes (number of tuples) • • • Tuple oriented, rough estimate Sum of all block accesses required to process query • Block oriented, better estimate for secondary storage • Number of block accesses (nba) Run Time • Empirical, e. g. , Project 3
Steps in Query Processing Compare Number of Block Accesses | n c = 1000 tuples in customer (c) table ∏ c. cname, c. ccity | nd = 2000 tuples in deposit (d) table, V(bname, d) = 10 distinct values of bname in d σ bname = ‘Alps’ and c. cname = d. cname sc = 400 => bfc = 10 => nbc = 100 | s d = 400 => bfd = 10 => nbd = 200 x s x = 800 => bfx = 5 / customer deposit (nc = 1000) (nd = 2000) nba = sum of block access for x, σ and ∏
Steps in Query Processing Compare Number of Block Accesses | ∏ c. cname, c. ccity skip the project calculation | σ bname = ‘Alps’ and c. cname = d. cname writes = nc * nd / (V(bname, d) * nc) / bfx = 200 / 5 = 40 | reads = 400, 000 x writes = (n c * nd) / bfx = (1000 * 2000) / 5 = 400, 000 / reads = nb c + nbc * nbd = 100 + 100 * 200 = 20100 customer deposit Direct Translation: (nc = 1000) (nd = 2000) nba = 20, 100 + 400, 000 + 40 = 820, 140
Steps in Query Processing Compare Number of Block Accesses | ∏ c. cname, c. ccity skip | σ bname = ‘Alps’ writes = nd / (V(bname, d) / bfx = 200 / 5 = 40 | reads = 400 * writes = nd / bfx = 2000 / 5 = 400 / customer deposit (nc = 1000) (nd = 2000) reads = nb c + nbc * nbd = 100 + 100 * 200 = 20100 Convert product to join nba = 20100 + 400 + 40 = 20940
Steps in Query Processing Compare Number of Block Accesses | ∏ c. cname, c. ccity skip | * writes = nd / V(bname, d) / bfx = 200 / 5 = 40 / / reads = 20 + 20 * 100 = 2020 σ bname = ‘Alps’ / customer deposit (nc = 1000) (nd = 2000) writes = nd / V(bname, d) / bfd = 200 / 10 = 20 reads = nb d = 200 Move selects down the tree nba = 200 + 2020 + 40 = 2280
Steps in Query Processing Compare Number of Block Accesses Select: Index on cname in customer - Unique Index nba = h(nc) + 2 = h(1000) + 2 = 3 + 2 = 5 Select: Index on bname in deposit - Non-Unique Index nba = h(V(bname, d)) + 2 + writes = h(10) + 20 = 1 + 20 = 23 nba = Internal nodes + leaf node + pointer bucket + data blocks
Steps in Query Processing Compare Number of Block Accesses | ∏ c. cname, c. ccity skip | Non-Unique Index * writes = 40 / / reads = 20 + 20 * 100 = 2020 σ bname = ‘Alps’ / customer deposit (nc = 1000) (nd = 2000) writes = 20 reads = h(V(bname, d)) + 2 + writes = 23 Apply Indexes - bname on deposit nba = 23 + 2020 + 40 = 2103
Steps in Query Processing Compare Number of Block Accesses | ∏ c. cname, c. ccity skip | Unique Index * writes = 40 / / reads = 20 + 200 (h(n c) + 2) = 20 + 200 (5) = 1020 σ bname = ‘Alps’ / customer deposit (nc = 1000) (nd = 2000) writes = 20 reads = h(V(bname, d)) + 2 + writes = 23 Apply Indexes - cname on customer nba = 23 + 20 + 1020 + 40 = 1103
- Slides: 11