- Slides: 54
Relational Algebra – Basis for Relational Query Languages Based on presentation by Juliana Freire
Formal relational query languages
Is this the Algebra you know? Algebra -> operators and atomic operands Expressions -> applying operators to atomic operands and/or other expressions Algebra of arithmetic: operands are variables and constants, and operators are the usual arithmetic operators E. g. , (x+y)*2 or ((x+7)/(y-3)) + x Relational algebra: operands are variables that stand for relations and relations (sets of tuples), and operations include union, intersection, selection, projection, Cartesian product, etc – E. g. , (π c-owner. Checking-account) ∩ (π s-owner. Savings-account)
What is a query? A query is applied to relation instances, and the result of a query is also a relation instance. (view, query) – Schemas of input and output fixed, but instances not. • Operators refer to relation attributes by position or name: – E. g. , Account(number, owner, balance, type) – Positional notation easier formal definitions, named-field notation more readable. – Both used in SQL
Relational Algebra Operations The usual set operations: union, intersection, difference • Operations that remove parts of relations: selection, projection • Operations that combine tuples from two relations: Cartesian product, join • Since each operation returns a relation, operations can be composed!
Removing Parts of Relations • Selection – rows • Projection - columns
Example of Projection
Projection removes duplicates
Set Operations • Union • Intersection • Difference
What happens when sets unite?
Union Operation – Example • Relations r, s: A B 1 2 2 3 1 s r n r s: A B 1 2 1 3
Set Difference Operation – Example • Relations r, s: A B 1 2 2 3 1 s r n r – s: A B 1 1
Another way to show intersection?
Summary so far: • • • E 1 U E 2 : union E 1 - E 2 : difference E 1 x E 2 : cartesian product c(E 1) : select rows, c = condition (book has p for predicate) IIs(E 1) : project columns : s =selected columns x(c 1, c 2) (E 1) : rename, x is new name of E 1, c 1 is new name of column
Combining Tuples of Two Relations • Cross product (Cartesian product) • Joins
Cartesian-Product Operation – Example n Relations r, s: A B C D E 1 2 10 10 20 10 a a b b r s n r x s: A B C D E 1 1 2 2 10 10 20 10 a a b b
Cross Product Example
Cross Product • How to resolve? ? Renaming operator: Rename whole relation: Teacher X secondteacher(Teacher) Teacher. t-num, Teacher. t-name, secondteacher. t-num, secondteacher. t-name OR rename attribute before combining: Teacher X secondteacher(t-num 2, t-name 2)(Teacher) t-num, t-name, t-num 2, t-name 2 OR rename after combining c(t-num 1, t-name 1, t-num 2, t-name 2)(Teacher X Teacher) t-num 1, t-name 1, t-num 2, t-name 2
Join : Example
Equi and Natural Join
When to divide?
Dividing without division sign
Working out an example
Why would we use Relational Algebra? ? ?
ER vs RA • Both ER and the Relational Model can be used to model the structure of a database. • Why is it the case that there are only Relational Databases and no ER databases?
RA vs Full Programming Language • Relational Algebra is not Turing complete. There are operations that cannot be expressed in relational algebra. • What is the advantage of using this language to query a database?
Summary of Operators updated • • Summary so far: E 1 U E 2 : union E 1 - E 2 : difference E 1 x E 2 : cartesian product c(E 1) : select rows, c = condition (book has p for predicate) IIs(E 1) : project columns : s =selected columns x(c 1, c 2) (E 1) : rename, x is new name of E 1, c 1 is new name of column • E 1 E 2 : division • E 1 E 2 : join, c = match condition
Practice • Find names of stars who’ve appeared in a 1994 movie • Information about movie year available in Movies; so need an extra join: σyear=1994(πname(Stars ⋈ Appear. In ⋈ Movies)) • A more efficient solution: πname(Stars ⋈ Appear. In ⋈ (σyear=1994( Movies)) • An even more efficient solution: πname(Stars ⋈ πname(Appear. In ⋈ (πmovie. Idσyear=1994(Movies))) A query optimizer can find this, given the first solution!
Extended Relational Algebra Operations • Generalized projection • Outer join • Aggregate functions
Generalized projection – calculate fields
Aggregate Operation – Example • Relatio n r: n g sum(c) (r) A B C 7 sum(c ) 27 7 3 10
Aggregate • Functions on more than one tuple • Samples: – – – Sum Count-distinct Max Min Count Avg • Use “as” to rename branchname g sum(balance) as totalbalance (account)
Aggregate Operation – Example • Relation account by branch-name: branch_namegrouped account_number balance Perryridge Brighton Redwood branch_name A-102 A-201 A-217 A-215 A-222 400 900 750 700 g sum(balance) (account) branch_name sum(balance) Perryridge Brighton Redwood 1300 1500 700
Outer Join • Keep the outer side even if no join • Fill in missing fields with nulls
Outer Join – Example • Relation loan_number branch_name L-170 L-230 L-260 Downtown Redwood Perryridge amount 3000 4000 1700 n Relation borrower customer_name loan_number Jones Smith Hayes L-170 L-230 L-155
Outer Join – Example • Inner Join loan Borrower loan_number branch_name L-170 L-230 Downtown Redwood amount customer_name 3000 4000 Jones Smith n Left Outer Join loan Borrower loan_number branch_name L-170 L-230 L-260 Downtown Redwood Perryridge amount customer_name 3000 4000 1700 Jones Smith null
Outer Join – Example n Right Outer Join loan borrower loan_number branch_name L-170 L-230 L-155 Downtown Redwood null amount customer_name 3000 4000 null Jones Smith Hayes n Full Outer Join loan borrower loan_number branch_name L-170 L-230 L-260 L-155 Downtown Redwood Perryridge null amount customer_name 3000 4000 1700 null Jones Smith null Hayes
Summary of Operators - Full • • • E 1 U E 2 : union E 1 - E 2 : difference E 1 x E 2 : cartesian product c(E 1) : select rows, c = condition (book has p for predicate) IIs(E 1) : project columns : s =selected columns separated by commas, can have calculations included x(c 1, c 2) (E 1) : rename, x is new name of E 1, c 1 is new name of column • E 1 E 2 : division • E 1 • • • E 1 E 2 : outer join, c = match condition, keep the side with the arrows : assignment – give a new name to an expression to make it easy to read as : rename a calculated column • attribute 1 g function (attribute 2) (E 1) : perform function on attribute 2 whenever attribute 1 changes E 2 : join, c = match condition