SQL Structured Query Language Sequel Chapter 5 Part

  • Slides: 36
Download presentation
SQL: Structured Query Language (‘Sequel’) Chapter 5 Part 2. 1

SQL: Structured Query Language (‘Sequel’) Chapter 5 Part 2. 1

Running Example v Instances of the Sailors and Reserves relations in our examples. R

Running Example v Instances of the Sailors and Reserves relations in our examples. R 1 S 2 2

Aggregation and Having Clauses 3

Aggregation and Having Clauses 3

Aggregate Operators v Significant extension of relational algebra. COUNT (*) COUNT ( [DISTINCT] A)

Aggregate Operators v Significant extension of relational algebra. COUNT (*) COUNT ( [DISTINCT] A) SUM ( [DISTINCT] A) AVG ( [DISTINCT] A) MAX (A) MIN (A) Why no Distinct? 4

Aggregate Operators COUNT (*) COUNT ( [DISTINCT] A) SUM ( [DISTINCT] A) AVG (

Aggregate Operators COUNT (*) COUNT ( [DISTINCT] A) SUM ( [DISTINCT] A) AVG ( [DISTINCT] A) MAX (A) MIN (A) SELECT COUNT (*) FROM Sailors S SELECT AVG (S. age) FROM Sailors S WHERE S. rating=10 SELECT COUNT (DISTINCT FROM Sailors S WHERE S. sname=‘Bob’ S. rating) 5

Find name and age of the oldest sailor(s) SELECT S. sname, MAX FROM Sailors

Find name and age of the oldest sailor(s) SELECT S. sname, MAX FROM Sailors S (S. age) • What does this query do ? Is this query legal? • No! Why not ? SELECT S. sname, S. age FROM Sailors S WHERE S. age = (SELECT MAX (S 2. age) FROM Sailors S 2) What does this query do ? Is this query legal? 7

Find name and age of the oldest sailor(s) SELECT S. sname, S. age FROM

Find name and age of the oldest sailor(s) SELECT S. sname, S. age FROM Sailors S WHERE S. age = (SELECT MAX (S 2. age) FROM Sailors S 2) SELECT S. sname, S. age FROM Sailors S WHERE (SELECT MAX (S 2. age) FROM Sailors S 2) = S. age v example queries equivalent in SQL/92 v but third NOT supported in some systems 8

Motivation for Grouping v v So far, aggregate operators to all (qualifying) tuples. Question:

Motivation for Grouping v v So far, aggregate operators to all (qualifying) tuples. Question: § What if want to apply aggregate to each group of tuples ? v Example : § Find the age of the youngest sailor for each rating level. v Example Procedure : § Suppose rating values {1, 2, …, 10}, 10 queries: For i = 1, 2, . . . , 10: SELECT MIN (S. age) FROM Sailors S WHERE S. rating = i 9

Motivation for Grouping For i = 1, 2, . . . , 10: §

Motivation for Grouping For i = 1, 2, . . . , 10: § SELECT MIN (S. age) FROM Sailors S WHERE S. rating = i What are the problems with above ? • We may not know how many rating levels exist. • Nor what the rating values for these levels are. • Performance issue (why ? ) 10

Add Group By to SQL 11

Add Group By to SQL 11

Queries With GROUP BY and HAVING SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP

Queries With GROUP BY and HAVING SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP BY grouping-list HAVING group-qualification v v A group is a set of tuples that each have the same value for all attributes in grouping-list. HAVING clause is a restriction on each group. 12

Are Group By queries valid or not ? SELECT avg ( S. salary) FROM

Are Group By queries valid or not ? SELECT avg ( S. salary) FROM Sailors S GROUP BY S. rating SELECT S. name FROM Sailors S GROUP BY S. rating 13

Queries With GROUP BY and HAVING SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP

Queries With GROUP BY and HAVING SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP BY grouping-list HAVING group-qualification v A group is a set of tuples that each have the same value for all attributes in grouping-list. v target-list contains : (i) attribute names (ii) aggregate-op (column-name) v (e. g. , MIN (S. age)). REQUIREMENT: - target-list (i) grouping-list. - Why? - Each answer tuple of a group must have single value. 14

Is Query Valid ? SELECT S. rating, MIN (S. age) AS FROM Sailors S

Is Query Valid ? SELECT S. rating, MIN (S. age) AS FROM Sailors S WHERE S. age >= 18 GROUP BY S. rating HAVING COUNT (*) > 1 min-age What does query below mean ? Find age of the youngest sailor with age 18, for each rating with at least 2 such sailors that are over 18. 15

Group. By --- Conceptual Evaluation 1. 2. 3. 4. 5. 6. Compute the cross-product

Group. By --- Conceptual Evaluation 1. 2. 3. 4. 5. 6. Compute the cross-product of relation-list (From) Discard tuples that fail qualification (Where) Delete `unnecessary’ fields Partition the remaining tuples into groups by the value of attributes in grouping-list. (Group. By) Eliminate groups using the group-qualification (Having) Apply selection to each group to produce output tuple (Select) We want to have a single value per group, that is, one answer tuple is generated per qualifying group. 16

Find age of the youngest sailor with age 18, for each rating with at

Find age of the youngest sailor with age 18, for each rating with at least 2 such sailors S. rating, MIN (S. age) AS min-age FROM Sailors S WHERE S. age >= 18 GROUP BY S. rating HAVING COUNT (*) > 1 SELECT Sailors instance: 17

Find age of the youngest sailor with age 18, for each rating with at

Find age of the youngest sailor with age 18, for each rating with at least 2 such sailors. 18

Again: Find age of youngest sailor with age 18, for each rating with at

Again: Find age of youngest sailor with age 18, for each rating with at least 2 such sailors S. rating, MIN (S. age) AS min-age FROM Sailors S WHERE S. age >= 18 GROUP BY S. rating HAVING COUNT (*) > 1 SELECT Now: Find age of the youngest sailor with age 18, for each rating with at least 2 such sailors and with every sailor under 60. Options: 1. Put 60 age condition into WHERE clause ? 2. Put 60 age condition into HAVING clause ? 19

Find age of the youngest sailor with age 18, for each rating with at

Find age of the youngest sailor with age 18, for each rating with at least 2 such sailors and with every sailor under 60. HAVING COUNT (*) > 1 AND EVERY (S. age <=60) EVERY : Must hold for all tuples in the group. 21

Find age of the youngest sailor with age 18, for each rating with at

Find age of the youngest sailor with age 18, for each rating with at least 2 sailors between 18 and 60. S. rating, MIN (S. age) AS min-age FROM Sailors S WHERE S. age >= 18 ? ? ? GROUP BY S. rating HAVING COUNT (*) > 1 ? ? ? SELECT Sailors instance: Now check age<=60 before making groups. 22

Find age of the youngest sailor with age 18, for each rating with at

Find age of the youngest sailor with age 18, for each rating with at least 2 sailors between 18 and 60. S. rating, MIN (S. age) AS min-age FROM Sailors S WHERE S. age >= 18 AND S. age <= 60 GROUP BY S. rating HAVING COUNT (*) > 1 SELECT Sailors instance: Answer relation: Check age<=60 before making groups. 23

Join, Group. By and Nesting? 24

Join, Group. By and Nesting? 24

For each red boat, find the number of reservations for this boat Sailors :

For each red boat, find the number of reservations for this boat Sailors : sid, name, … Boats : bid, color, … Reserves: sid, bid, day 25

For each red boat, find the number of reservations for this boat SELECT B.

For each red boat, find the number of reservations for this boat SELECT B. bid, COUNT (*) AS s-count FROM Sailors S, Boats B, Reserves R WHERE S. sid=R. sid AND R. bid=B. bid AND GROUP BY B. bid v B. color=‘red’ Grouping over Join of three relations. 26

For each red boat, find the number of reservations for this boat Q: What

For each red boat, find the number of reservations for this boat Q: What if we move B. color=‘red’ from WHERE to HAVING? SELECT B. bid, COUNT (*) AS s-count FROM Sailors S, Boats B, Reserves R WHERE S. sid=R. sid AND R. bid=B. bid GROUP BY B. bid HAVING (B. color=‘red’) • Illegal. • Only column in Group. By can appear in Having clause, unless in aggregate operator of Having over group; • E. g. , HAVING count (B. color = ‘red’ ) > 1; 27

Find age of the youngest sailor with age > 18, for each rating with

Find age of the youngest sailor with age > 18, for each rating with at least 2 sailors (of any age) v Hint : HAVING clause can also contain a subquery. SELECT S. rating, MIN (S. age) FROM Sailors S WHERE S. age > 18 GROUP BY S. rating HAVING 1 < (SELECT COUNT (*) FROM Sailors S 2 WHERE S. rating=S 2. rating) 28

Find age of the youngest sailor, for each rating with at least 2 sailors

Find age of the youngest sailor, for each rating with at least 2 sailors of age 18; SELECT S. rating, MIN (S. age) FROM Sailors S GROUP BY S. rating HAVING 1 < (SELECT COUNT (*) FROM Sailors S 2 WHERE S 2. age > 18 and S. rating=S 2. rating) 29

Find those ratings for which the average is the minimum over all ratings SELECT

Find those ratings for which the average is the minimum over all ratings SELECT S. rating FROM Sailors S WHERE S. age = (SELECT MIN (AVG (S 2. age)) FROM Sailors S 2) v Above query has a problem ! What ? § Aggregate operations cannot be nested! 30

Find those ratings for which the average is the minimum over all ratings v

Find those ratings for which the average is the minimum over all ratings v Correct solution (in SQL/92): SELECT Temp. rating, Temp. avg-age FROM (SELECT S. rating, AVG (S. age) AS avg-age FROM Sailors S GROUP BY S. rating) AS Temp WHERE Temp. avg-age = (SELECT MIN (Temp. avg-age) FROM Temp) 31

Outer Joins : Special Operators v v v Left Outer Join Right Outer Join

Outer Joins : Special Operators v v v Left Outer Join Right Outer Join Full Outer Join SELECT S. sid, R. bid FROM Sailors S LEFT OUTER JOIN Reserves R WHERE S. sid = R. sid Sailors rows (left) without a matching Reserves row (right) appear in result, but not vice versa. SELECT S. sid, R. bid FROM Sailors S NATURAL LEFT OUTER JOIN Reserves R 32

Null Values 33

Null Values 33

Null Values v Field values in a tuple are sometimes : § Unknown (e.

Null Values v Field values in a tuple are sometimes : § Unknown (e. g. , a rating has not been assigned) or § Inapplicable (e. g. , no maiden-name when a male person) v SQL provides a special value null for such situations. v The presence of null complicates many issues. 34

Allowing Null Values § SQL special operators IS NULL or IS NOT NULL to

Allowing Null Values § SQL special operators IS NULL or IS NOT NULL to check if value is/is not null. § Disallow NULL value : rating INTEGER NOT NULL § We need a 3 -valued logic : condition can be true, false or unknown. 35

Working with NULL values v Question : § § Given predicate (S. rating =

Working with NULL values v Question : § § Given predicate (S. rating = 8). Is it TRUE or FALSE ? ? What if rating has null value in tuple ? v Comparison operators on NULL return UNKNOWN v Question : § § v Given arithmetic expression (S. rating + 8). What is its value? What if that tuple’s rating has a null value? Arithmetic operations on NULL return NULL. 36

Truth table with UNKNOWN In general, a WHERE clause is satisfied only when it

Truth table with UNKNOWN In general, a WHERE clause is satisfied only when it evaluates to TRUE. UNKNOWN AND TRUE = UNKNOWN OR TRUE = TRUE UNKNOWN AND FALSE = FALSE UNKNOWN OR FALSE = UNKNOWN AND UNKNOWN = UNKNOWN OR UNKNOWN = UNKNOWN NOT UNKNOWN = UNKNOWN 37

Summary v SQL was important factor in early acceptance of relational model : easy-to-understand

Summary v SQL was important factor in early acceptance of relational model : easy-to-understand ! v Relationally complete: even more expressive power than relational algebra. v Many alternative ways to write a query. So optimizer must look for most efficient evaluation plan. v In practice, users may (still) want to be aware of how queries are optimized and evaluated for best results. 38