Schema for Example Tables sid sname rating age
Schema for Example Tables ( sid, sname, rating, age ) � Boats: ( bid, color, bname ) � Reserves: ( sid, bid, date ) � Sailors:
ﻣﺜﺎﻝ ﻧﻤﻮﻧﻪ ﺭﺍﺑﻄﻪ and “Reserves” S 1 relations for our examples. � We’ll use positional or named field notation, assume that names of fields in query results are S 2 `inherited’ from names of fields in query input relations. � “Sailors” R 1
Joins � Condition � Result � Fewer Join: schema same as that of cross-product. tuples than cross-product, might be able to compute more efficiently � Sometimes called a theta-join.
Joins � Equi-Join: A special case of condition join where the condition c contains only equalities. � Result schema similar to cross-product, but only one copy of fields for which equality is specified. � Natural Join: Equijoin on all common fields.
Division � Not supported as a primitive operator, but useful for expressing queries like: Find sailors who have reserved all boats. � Let A have 2 fields, x and y; B have only field y: ◦ ◦ ◦ A/B = i. e. , A/B contains all x tuples (sailors) such that for every y tuple (boat) in B, there is an xy tuple in A. Or: If the set of y values (boats) associated with an x value (sailor) in A contains all y values in B, the x value is in A/B. general, x and y can be any lists of fields; y is the list of fields in B, and x y is the list of fields of A. ` � In
Examples of Division A/B B 1 B 2 B 3 A A/B 1 A/B 2 A/B 3
Expressing A/B Using Basic Operators � Division is not essential op; just a useful shorthand. ◦ (Also true of joins, but joins are so common that systems implement joins specially. ) � Idea: For A/B, compute all x values that are not `disqualified’ by some y value in B. ◦ x value is disqualified if by attaching y value from B, we obtain an xy tuple that is not in A. Disqualified x values: A/B: all disqualified tuples
Schema for Example Tables ( sid, sname, rating, age ) � Boats: ( bid, color, bname ) � Reserves: ( sid, bid, date ) � Sailors:
Find names of sailors who’ve reserved boat #103 � Solution 1: v Same: v Solution 2:
Find names of sailors who’ve reserved a red boat � Information about boat color only available in Boats; so need an extra join: v A more efficient solution: A query optimizer can find this, given the first solution!
� Can Find sailors who’ve reserved a red or a green boat identify all red or green boats, then find sailors who’ve reserved one of these boats: v Can also define Tempboats using union! (How? ) v What happens if is replaced by in this query?
Find sailors who’ve reserved a red and a green boat � Previous approach won’t work! Must identify sailors who’ve reserved red boats, sailors who’ve reserved green boats, then find the intersection (note that sid is a key for Sailors):
Find the names of sailors who’ve reserved all boats � Uses division; schemas of the input relations to / must be carefully chosen: v To find sailors who’ve reserved all ‘Interlake’ boats: . . . v What’s wrong if Tempsids=
Summary � The relational model has rigorously defined query languages that are simple and powerful. � Relational algebra is more operational; useful as internal representation for query evaluation plans. � Several ways of expressing a given query; a query optimizer should choose the most efficient version.
Select Operation – Example n Relation r ¡ A=B ^ D > 5 (r) A B C D 1 7 5 7 12 3 23 10 A B C D 1 7 23 10
Project Operation – Example Relation r: A, C (r) A B C 10 1 20 1 30 1 40 2 A C 1 1 1 2 2 =
Union Operation – Example Relations r, s: A B 1 2 2 3 1 s r n r s: A B 1 2 1 3
Set Difference Operation – Example Relations r, s: A B 1 2 2 3 1 s r n r – s: A B 1 1
Cartesian-Product Operation – Example Relations r, s: A B C D E 1 2 10 10 20 10 a a b b r s r x s: A B C D E 1 1 2 2 10 10 20 10 a a b b
Composition of Operations � � Can build expressions using multiple operations Example: A=C(r x s) rxs A=C(r x s) A B C D E 1 1 2 2 10 10 20 10 a a b b A B C D E 1 2 2 10 20 a a b
Banking Example branch (branch_name, branch_city, assets) customer (customer_name, customer_street, customer_city) account (account_number, branch_name, balance) loan (loan_number, branch_name, amount) depositor (customer_name, account_number) borrower (customer_name, loan_number)
Example Queries � Find all loans of over $1200 amount > 1200 (loan) n Find the loan number for each loan of an amount greater than $1200 loan_number ( amount > 1200 (loan))
� Find Example Queries the names of all customers who have a loan, an account, or both, from the bank customer_name (borrower) customer_name (depositor) n Find the names of all customers who have a loan and an account at bank. customer_name (borrower) customer_name (depositor)
� Find Example Queries the names of all customers who have a loan at the Perryridge branch. customer_name ( branch_name=“Perryridge” ( borrower. loan_number = loan_number(borrower x loan))) n Find the names of all customers who have a loan at the Perryridge branch but do not have an account at any branch of the bank. customer_name ( branch_name = “Perryridge” ( borrower. loan_number = loan_number(borrower x loan))) – customer_name(depositor)
� Find Example Queries the names of all customers who have a loan at the Perryridge branch. l Query 1 customer_name ( branch_name = “Perryridge” ( borrower. loan_number = loan_number (borrower x loan))) l Query 2 customer_name( loan_number = borrower. loan_number ( ( branch_name = “Perryridge” (loan)) x borrower))
Example Queries � Find the largest account balance Strategy: ◦ Find those balances that are not the largest � Rename account relation as d so that we can compare each � account balance with all others Use set difference to find those account balances that were not found in the earlier step. � The query is: ◦ balance(account) - account. balance ( account. balance < d. balance (account x d (account)))
Set-Intersection Operation – Example Relation r, s: A B 1 2 1 r A B 2 3 s r s A B 2
Natural Join Operation – Example � Relations r, s: A B C D B D E 1 2 4 1 2 a a b 1 3 1 2 3 a a a b b r n r s s A B C D E 1 1 2 a a b
Division Operation – Example n Relations r, s: n r s: A A B B 1 2 3 1 1 1 3 4 6 1 2 1 r 2 s
Another Division Example n Relations r, s: A B C D E a a a a a a b a b b 1 1 3 1 1 1 a b 1 1 r n r s: A B C a a s
n Bank Example Queries Find the names of all customers who have a loan and an account at bank. customer_name (borrower) customer_name (depositor) � Find the name of all customers who have a loan at the bank and the loan amount
Bank Example Queries � Find all customers who have an account from at least the “Downtown” and the Uptown” l Query 1 branches. customer_name ( branch_name = “Downtown” (depositor customer_name ( branch_name = “Uptown” (depositor l account )) account)) Query 2 customer_name, branch_name (depositor account) temp(branch_name) ({(“Downtown” ), (“Uptown” )}) Note that Query 2 uses a constant relation.
Example Queries � Find all customers who have an account at all branches located in Brooklyn city. customer_name, branch_name (depositor account) branch_name ( branch_city = “Brooklyn” (branch))
Aggregate Operation – Example � Relation r: n g sum(c) (r) A B C 7 sum(c ) 27 7 3 10
Aggregate Operation – Example account grouped by branchname: branch_name account_number balance � Relation Perryridge Brighton Redwood branch_name A-102 A-201 A-217 A-215 A-222 400 900 750 700 g sum(balance) (account) branch_name Perryridge Brighton Redwood sum(balance) 1300 1500 700
Outer Join – Example � Relation loan_number branch_name amount L-170 L-230 L-260 Downtown Redwood Perryridge 3000 4000 1700 n Relation borrower customer_name loan_number Jones Smith Hayes L-170 L-230 L-155
Outer Join – Example � Inner Join loan Borrower loan_number branch_name amount customer_name L-170 L-230 Downtown Redwood 3000 4000 Jones Smith n Left Outer Join loan Borrower loan_number branch_name amount customer_name L-170 L-230 L-260 Downtown Redwood Perryridge 3000 4000 1700 Jones Smith null
Outer Join – Example n Right Outer Join loan borrower loan_number branch_name amount customer_name L-170 L-230 L-155 Downtown Redwood null Jones Smith Hayes 3000 4000 null n Full Outer Join loan borrower loan_number branch_name amount customer_name L-170 L-230 L-260 L-155 Downtown Redwood Perryridge null Jones Smith null Hayes 3000 4000 1700 null
Null Values � � � Comparisons with null values return the special truth value: unknown ◦ If false was used instead of unknown, then not (A < 5) would not be equivalent to A >= 5 Three-valued logic using the truth value unknown: ◦ OR: (unknown or true) = true, (unknown or false) = unknown (unknown or unknown) = unknown ◦ AND: (true and unknown) = unknown, (false and unknown) = false, (unknown and unknown) = unknown ◦ NOT: (not unknown) = unknown ◦ In SQL “P is unknown” evaluates to true if predicate P evaluates to unknown Result of select predicate is treated as false if it evaluates to unknown
- Slides: 51