Relational Algebra
Instructor: Mohamed Eltabakh
meltabakh@cs.wpi.edu
Announcements
- Project Phase 1 is due NOW!
- Project Phase 2 is out today (Nov. 4) and due on Nov. 11
- Submission guidelines (to make grading easier)
  - Submit a single file (Word or PDF)
  - Make sure your name (or username) is specified
  - Groups submit a single copy of the project phases
- Pick up your graded hardcopy submission from the TAs during their office hours
Relational Model (Recap)
- Relations (tables) + attributes (columns)
- Integrity constraints
- Create “Students” relation
  CREATE TABLE Students (sid: CHAR(20) Primary Key, name: CHAR(20), login: CHAR(10), age: INTEGER, gpa: REAL);
- Create “Courses” relation
  CREATE TABLE Courses (cid: Varchar(20) Primary Key, name: string, maxCredits: integer, graduateFlag: boolean);
- Create “Enrolled” relation
  CREATE TABLE Enrolled (sid: CHAR(20) Foreign Key References (Students.sid), cid: Varchar(20), enrollDate: date, grade: CHAR(2), Constraints fk_cid Foreign Key cid References (Courses.cid));
What about converting this ERD to the relational model?
[Figure: ER diagram; only the labels "status" and "date" survive in the extracted text]
Query Language
- Defines data retrieval operations for the relational model
- Expresses access to large data sets in a high-level language rather than in complex application programs
- Categories of languages
  - Procedural: what you want and how to get it
  - Non-procedural, or declarative: what you want (without how)
- SQL: high-level language for relational algebra
- Relational algebra: operator semantics based on set or bag theory
- Relational algebra forms the underlying basis (and optimization rules) for SQL
Relational Algebra
- Basic operators
  - Set operations (union: ∪, intersection: ∩, difference: –)
  - Select: σ
  - Project: π
  - Cartesian product: ×
  - Rename: ρ
- More advanced operators, e.g., grouping and joins
- The operators take one or two relations as inputs and produce a new relation as output
- One input: unary operator; two inputs: binary operator
Relational Algebra
- Allows us to build expressions by composing the available operators
- For example, arithmetic expressions are compositions of arithmetic operators
  - (w + t) / ((x + y) * 3)
- In relational algebra, instead of variables we have relations
Set Operators
- Union, intersection, difference
- Defined only for union-compatible relations
- Relations are union compatible iff
  - they have the same sets of attributes (schema), and
  - the same types (domains) of attributes
- Example: union compatible or not?
  - Student (sNumber, sName)
  - Course (cNumber, cName)
  - Not compatible: the attribute names differ
Union: ∪
- Consider two relations R and S that are union-compatible
[Figure: example tables for R, S, and R ∪ S over attributes (A, B); values surviving extraction: 1 2, 1 2, 3 4, 3 4, 5 6]
Union: ∪
- Notation: R ∪ S
- Defined as:
  - R ∪ S = {t | t ∈ R or t ∈ S}
- For R ∪ S to be valid, R and S have to be union-compatible
Difference: –
- R – S contains the tuples that appear in R and not in S
- Defined as:
  - R – S = {t | t ∈ R and t ∉ S}
[Figure: example tables for R, S, and R – S over attributes (A, B); values surviving extraction: 1 2, 3 4, 5 6]
Intersection: ∩
- Consider two relations R and S that are union-compatible
[Figure: example tables for R, S, and R ∩ S over attributes (A, B); values surviving extraction: 1 2, 1 2, 3 4, 3 4, 5 6]
Intersection: ∩
- Notation: R ∩ S
- Defined as:
  - R ∩ S = {t | t ∈ R and t ∈ S}
- Note: R ∩ S = R – (R – S)
[Figure: diagram of R and S]
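The three set operators map directly onto Python's built-in set type, which gives a quick way to check the definitions. The sketch below models each relation as a set of tuples over a shared schema (A, B); the data is illustrative and not taken from the slides.

```python
# Relations modeled as Python sets of tuples over the shared schema (A, B).
# The rows are made up for illustration.
R = {(1, 2), (3, 4)}
S = {(3, 4), (5, 6)}

union = R | S          # {t | t in R or t in S}
difference = R - S     # {t | t in R and t not in S}
intersection = R & S   # {t | t in R and t in S}

# The identity from the slide: R ∩ S = R – (R – S)
identity_holds = intersection == R - (R - S)
```

Because Python sets discard duplicates, this corresponds to the set (not bag) semantics of the operators.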
Selection: σ
- Select: σc(R)
  - c is a condition on R's attributes
  - Selects the subset of tuples from R that satisfy the selection condition c
- Example: σ(C ≥ 6)(R), with R:

  A  B  C
  1  2  5
  3  4  6
  1  2  7
Selection: σ
- Notation: σc(R)
- c is called the selection predicate
- Defined as:
  - σc(R) = {t | t ∈ R and c(t) is true}
- c is a formula in propositional calculus consisting of terms connected by:
  - ∧ (and), ∨ (or), ¬ (not)
- Each term is one of:
  - <attribute> op <attribute> | <attribute> op <constant>
  - op is one of: =, ≠, >, ≥, <, ≤
- Example of selection:
  - σ branch_name = “Perryridge” ∧ balance > 1000 (account)
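Selection can be sketched as a filter over tuples. In this minimal sketch a relation is a list of dictionaries and the predicate c is an ordinary Python function; the helper name `select` is hypothetical, and the rows reuse the three-row relation R(A, B, C) from the selection example.

```python
def select(relation, predicate):
    """Return the tuples of `relation` for which `predicate` is true."""
    return [t for t in relation if predicate(t)]

# R(A, B, C) with the rows shown in the selection example.
R = [
    {"A": 1, "B": 2, "C": 5},
    {"A": 3, "B": 4, "C": 6},
    {"A": 1, "B": 2, "C": 7},
]

# σ(C ≥ 6)(R)
result = select(R, lambda t: t["C"] >= 6)

# Terms can be combined with and / or / not, mirroring ∧, ∨, ¬.
combined = select(R, lambda t: t["A"] == 1 and not t["C"] > 6)
```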
Selection: Example
[Figure: table R and the result of σ((A=B) ∧ (D>5))(R)]
Project: π
- πA1, A2, …, An(R), where A1, A2, …, An are attributes of R (a subset of AR)
- Returns all tuples in R, but only the columns A1, A2, …, An
- A1, A2, …, An are called the projection list
- Example: πA,C(R)

  R:
  A  B  C
  1  2  5
  3  4  6
  1  2  7
  1  2  8

  πA,C(R):
  A  C
  1  5
  3  6
  1  7
  1  8
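A projection can be sketched as keeping only the listed columns of every tuple. The helper below (the name `project` is hypothetical) assumes set semantics, so tuples that become identical after dropping columns are kept only once; under bag semantics the duplicate check would be omitted.

```python
def project(relation, attrs):
    """Keep only the columns in `attrs`, dropping duplicate result tuples."""
    seen = set()
    out = []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:          # set semantics: emit each row once
            seen.add(row)
            out.append(dict(zip(attrs, row)))
    return out

# R(A, B, C) from the projection example.
R = [
    {"A": 1, "B": 2, "C": 5},
    {"A": 3, "B": 4, "C": 6},
    {"A": 1, "B": 2, "C": 7},
    {"A": 1, "B": 2, "C": 8},
]

proj_ac = project(R, ["A", "C"])   # π A,C (R): four distinct rows
proj_ab = project(R, ["A", "B"])   # π A,B (R): duplicates collapse to two rows
```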
Cross Product (Cartesian Product): ×
[Figure: example tables R, S, and R × S]
Cross Product (Cartesian Product): ×
- Notation: R × S
- Defined as:
  - R × S = {t q | t ∈ R and q ∈ S}
- Assumes that attribute names are all unique; otherwise renaming must be used
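As a rough sketch of the definition, the cross product concatenates every tuple of R with every tuple of S. To sidestep the uniqueness requirement, the hypothetical helper `cross` below prefixes each attribute with its relation name, which is one possible renaming scheme rather than the only one. The data is illustrative.

```python
from itertools import product

def cross(r, s, r_name="R", s_name="S"):
    """Pair every tuple of r with every tuple of s, qualifying attribute names."""
    return [
        {**{f"{r_name}.{k}": v for k, v in t.items()},
         **{f"{s_name}.{k}": v for k, v in q.items()}}
        for t, q in product(r, s)
    ]

# Both relations have an attribute named A, so qualification avoids a clash.
R = [{"A": 1, "B": 2}]
S = [{"A": 1, "C": 3}, {"A": 4, "C": 5}]

rxs = cross(R, S)
```

The result always has |R| · |S| tuples.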
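Renaming: ρ
- ρS(R) changes the relation name from R to S
- ρS(A1, A2, …, An)(R) also renames the attributes of R to A1, A2, …, An
- Example: R has attributes (B, C, D) and rows (2, 3, 10), (2, 3, 11), (6, 7, 12)
  - ρS(R): the same rows, as relation S with attributes (B, C, D)
  - ρS(X, C, D)(R): the same rows, as relation S with attributes (X, C, D)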
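Attribute renaming can be sketched as relabeling columns by position. The helper name `rename` is hypothetical, and the sketch assumes each tuple is a dictionary whose keys are in schema order (Python dictionaries preserve insertion order).

```python
def rename(relation, new_attrs):
    """Relabel the columns of every tuple, matching old and new names by position."""
    return [dict(zip(new_attrs, t.values())) for t in relation]

# R(B, C, D) with the rows from the renaming example.
R = [
    {"B": 2, "C": 3, "D": 10},
    {"B": 2, "C": 3, "D": 11},
    {"B": 6, "C": 7, "D": 12},
]

# ρ S(X, C, D) (R): the data is unchanged, only the attribute names differ.
S = rename(R, ["X", "C", "D"])
```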
Composition of Operations
- Can build expressions using multiple operations
- Example: σA=C(R × S)
[Figure: tables R, S, R × S, and σA=C(R × S)]
Banking Example
- branch (branch_name, branch_city, assets)
- customer (customer_name, customer_street, customer_city)
- account (account_number, branch_name, balance)
- loan (loan_number, branch_name, amount)
- depositor (customer_name, account_number)
- borrower (customer_name, loan_number)
Example Queries
[Five slides of worked example queries; their content did not survive extraction]
More Operators
Natural Join: R ⋈ S
- Consider relations
  - R with attributes AR, and
  - S with attributes AS
- Let A = AR ∩ AS = {A1, A2, …, An} be the common attributes
- In English
  - Natural join R ⋈ S is a Cartesian product R × S with equality predicates on the common attributes (set A)
Natural Join: R ⋈ S
- R ⋈ S can be defined as:
  R ⋈ S = π_{AR – A, A, AS – A}(σ_{R.A1 = S.A1 AND R.A2 = S.A2 AND … AND R.An = S.An}(R × S))
  - π_{AR – A, A, AS – A}: project the union of all attributes
  - σ_{…}: equality on the common attributes
  - R × S: Cartesian product
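The definition can be sketched in a few lines: pair every tuple of R with every tuple of S, keep the pairs that agree on all common attributes, and keep a single copy of each common attribute. The helper name `natural_join` and the sample rows are assumptions for illustration; the attribute names follow the depositor and account relations of the banking example.

```python
def natural_join(r, s):
    """Join r and s on equality of all attributes they have in common."""
    out = []
    for t in r:
        for q in s:                                  # Cartesian product
            common = t.keys() & q.keys()             # A = AR ∩ AS
            if all(t[a] == q[a] for a in common):    # equality on common attributes
                out.append({**t, **q})               # one copy of each common attribute
    return out

# Illustrative rows, not taken from the slides.
depositor = [
    {"customer_name": "Ann", "account_number": "A-1"},
    {"customer_name": "Bob", "account_number": "A-2"},
]
account = [
    {"account_number": "A-1", "branch_name": "Downtown", "balance": 500},
    {"account_number": "A-3", "branch_name": "Uptown", "balance": 900},
]

joined = natural_join(depositor, account)
```

When the two relations share no attribute, the condition is vacuously true and the result degenerates to the Cartesian product.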
Natural Join: R ⋈ S: Example
[Figure: example tables R, S, and R ⋈ S]
Theta Join: R ⋈C S
- Theta join is a cross product with a condition C
- It is defined as: R ⋈C S = σC(R × S)
- Theta join can express both Cartesian product and natural join
- Example: R ⋈_{R.A >= S.C} S

  R:
  A  B
  1  2
  3  2

  S:
  D  C
  2  3
  4  5

  R ⋈_{R.A >= S.C} S:
  A  B  D  C
  3  2  2  3
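The theta join example can be reproduced with a short sketch that applies the condition to every pair of tuples. The helper name `theta_join` is hypothetical, and the sketch assumes the two relations have disjoint attribute names, as R(A, B) and S(D, C) do here.

```python
def theta_join(r, s, condition):
    """σ_C(R × S): keep the concatenated pairs that satisfy `condition`."""
    out = []
    for t in r:
        for q in s:
            pair = {**t, **q}        # safe only because attribute names are disjoint
            if condition(pair):
                out.append(pair)
    return out

R = [{"A": 1, "B": 2}, {"A": 3, "B": 2}]
S = [{"D": 2, "C": 3}, {"D": 4, "C": 5}]

# R ⋈ (R.A >= S.C) S
result = theta_join(R, S, lambda p: p["A"] >= p["C"])

# A condition that is always true yields the full Cartesian product.
everything = theta_join(R, S, lambda p: True)
```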
Outer Join
- An extension of the join operation that avoids loss of information
- Computes the join and then adds the tuples from one relation that do not match any tuple in the other relation
- Uses null values to fill in the attributes that have no match
- Types of outer join between R and S
  - Left outer (R ⟕ S): preserves all tuples from the left relation R
  - Right outer (R ⟖ S): preserves all tuples from the right relation S
  - Full outer (R ⟗ S): preserves all tuples from both relations
Left Outer Join (R ⟕ S): Example
[Figure: example tables R, S, R ⋈ S, and R ⟕ S]
Right Outer Join (R ⟖ S): Example
[Figure: example tables R, S, R ⋈ S, and R ⟖ S]
Full Outer Join (R ⟗ S): Example
[Figure: example tables R, S, R ⋈ S, and R ⟗ S]
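A left outer join can be sketched as a natural join that also keeps unmatched left tuples, padding the missing attributes with `None` in place of null. The helper name `left_outer_join`, its `s_attrs` parameter (the schema of S, needed when S contributes no matching tuple), and the data are assumptions made for this illustration.

```python
def left_outer_join(r, s, s_attrs):
    """Natural join of r and s that preserves every tuple of r."""
    out = []
    for t in r:
        matches = [q for q in s
                   if all(t[a] == q[a] for a in t.keys() & q.keys())]
        if matches:
            out.extend({**t, **q} for q in matches)
        else:
            # No partner in s: pad the attributes that only s has with None (null).
            out.append({**t, **{a: None for a in s_attrs if a not in t}})
    return out

# Illustrative data: R(A, B) and S(B, C) share the attribute B.
R = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]
S = [{"B": 2, "C": 9}]

result = left_outer_join(R, S, s_attrs=["B", "C"])
```

A right outer join follows by swapping the roles of the two relations, and a full outer join by combining both results without repeating the matched tuples.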
Assignment Operator: ←
- The assignment operation (←) provides a convenient way to express complex queries over multiple lines
- Write a query as a sequence of lines consisting of:
  - A series of assignments
  - A result expression containing the final answer
- Assignment must always be made to a temporary relation variable
- A variable may be used multiple times in subsequent expressions
- Example:
  - R1 ← (σ_{(A=B) ∧ (D>5)}(R – S)) ∩ W
  - R2 ← R1 ⋈_{(R.A = T.C)} T
  - Result ← R1 ∪ R2
Duplicate Elimination: δ(R)
- Deletes all duplicate records
- Converts a bag to a set
[Figure: example tables R and δ(R) over attributes (A, B); values surviving extraction: 1 2, 3 4, 1 2]
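Duplicate elimination is easy to sketch when a bag is modeled as a list of tuples. The helper name `distinct` and the data are illustrative; `dict.fromkeys` is used only because it drops repeats while keeping the first occurrence in order.

```python
def distinct(bag):
    """Convert a bag (list with repeats) into a set-like list without duplicates."""
    return list(dict.fromkeys(bag))

# A bag over (A, B) in which the tuple (1, 2) appears twice.
R = [(1, 2), (3, 4), (1, 2)]

deduplicated = distinct(R)
```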
Extended Projection: πL(R)
- Standard projection
  - L contains only column names of R
- Extended projection
  - L may contain expressions and assignment operators
- Example: π_{C, V←A, X←C*3+D}(R)
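The extended projection example can be sketched by mapping each output column to a function of the input tuple. The helper name `extended_project` and the rows of R are assumptions; the projection list mirrors the slide's example, which keeps C, renames A to V, and computes X as C*3+D.

```python
def extended_project(relation, exprs):
    """Build each output tuple from named expressions over the input tuple."""
    return [{name: fn(t) for name, fn in exprs.items()} for t in relation]

# Illustrative rows for R(A, C, D).
R = [{"A": 1, "C": 2, "D": 3}, {"A": 4, "C": 5, "D": 6}]

# π C, V←A, X←C*3+D (R)
result = extended_project(R, {
    "C": lambda t: t["C"],                 # plain column
    "V": lambda t: t["A"],                 # renamed column
    "X": lambda t: t["C"] * 3 + t["D"],    # computed column
})
```

This sketch follows bag semantics: duplicates in the output are not removed.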
Grouping & Aggregation Operator: γ
- An aggregation function takes a collection of values and returns a single value as a result
  - avg: average value
  - min: minimum value
  - max: maximum value
  - sum: sum of values
  - count: number of values
- Grouping & aggregate operation in relational algebra
  - γ_{g1, g2, …, gm, F1(A1), F2(A2), …, Fn(An)}(R)
  - R is a relation or any relational-algebra expression
  - g1, g2, …, gm is a list of attributes on which to group (can be empty)
  - Each Fi is an aggregate function applied on attribute Ai within each group
Grouping & Aggregation Operator: Example
[Figure: table R with the result of γ_{sum(c)}(R), and table S with the result of γ_{branch_name, sum(balance)}(S)]
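Grouping with a single aggregate can be sketched by bucketing tuples on the grouping attributes and applying the aggregate function to each bucket. The helper name `group_aggregate`, its parameters, and the rows of S are assumptions for illustration; only one aggregate per call is supported to keep the sketch short.

```python
from collections import defaultdict

def group_aggregate(relation, group_by, out_name, agg_fn, attr):
    """Group `relation` on `group_by` and apply `agg_fn` to `attr` in each group."""
    groups = defaultdict(list)
    for t in relation:
        groups[tuple(t[g] for g in group_by)].append(t[attr])
    return [{**dict(zip(group_by, key)), out_name: agg_fn(values)}
            for key, values in groups.items()]

# Illustrative rows for S(branch_name, balance).
S = [
    {"branch_name": "Downtown", "balance": 100},
    {"branch_name": "Downtown", "balance": 250},
    {"branch_name": "Uptown", "balance": 40},
]

# Group by branch_name and sum the balances in each group.
per_branch = group_aggregate(S, ["branch_name"], "sum", sum, "balance")

# An empty grouping list treats the whole relation as one group.
total = group_aggregate(S, [], "sum", sum, "balance")
```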
Summary of Relational-Algebra Operators
- Set operators
  - Union, intersection, difference
- Selection, projection, and extended projection
- Joins
  - Natural, theta, outer join
- Rename and assignment
- Duplicate elimination
- Grouping and aggregation
Example Queries
Find customer names having loans with sum > 20,000
π_{customer_name}(σ_{sum > 20,000}(γ_{customer_name, sum(amount)}(loan ⋈ borrower)))
Example Queries
Find the branch name with the largest number of accounts
R1 ← γ_{branch_name, countAccounts←count(account_number)}(account)
R2 ← γ_{Max←max(countAccounts)}(R1)
Result ← π_{branch_name}(R1 ⋈_{countAccounts = Max} R2)
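The three steps of this query (count per branch, take the maximum, join back) can be traced in a short sketch. The account rows are made up for illustration, and `Counter` stands in for the grouping step; the variable names R1, R2, and result follow the assignments in the query.

```python
from collections import Counter

# Illustrative rows for account(account_number, branch_name, balance).
account = [
    {"account_number": "A-1", "branch_name": "Downtown", "balance": 500},
    {"account_number": "A-2", "branch_name": "Downtown", "balance": 700},
    {"account_number": "A-3", "branch_name": "Uptown", "balance": 900},
]

# R1: number of accounts per branch (group by branch_name, count account_number).
R1 = Counter(t["branch_name"] for t in account)

# R2: the maximum of those counts.
R2 = max(R1.values())

# Result: branches whose count equals the maximum (the join on countAccounts = Max).
result = sorted(branch for branch, n in R1.items() if n == R2)
```

If several branches tie for the maximum, all of them are returned, which matches the behavior of the join in the algebra expression.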
Example Queries
Find customers having account balance below 100 or above 10,000
π_{customer_name}(depositor ⋈ π_{account_number}(σ_{balance < 100 OR balance > 10,000}(account)))
Example Queries
Find customers having account balance below 100 and loans above 10,000
R1 ← π_{customer_name}(depositor ⋈ π_{account_number}(σ_{balance < 100}(account)))
R2 ← π_{customer_name}(borrower ⋈ π_{loan_number}(σ_{amount > 10,000}(loan)))
Result ← R1 ∩ R2
Example Queries
Find account numbers and balances for customers having loans > 10,000
π_{account_number, balance}((depositor ⋈ account) ⋈ (π_{customer_name}(borrower ⋈ (σ_{amount > 10,000}(loan)))))
Reversed Queries (what does it do?)
π_{customer_name}(customer) – (π_{customer_name}(borrower) ∪ π_{customer_name}(depositor))
Find customers who have neither accounts nor loans
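This query is a difference applied to a union, which can be checked with Python sets. The three sets below stand for the projections on customer_name of customer, borrower, and depositor; the names in them are made up for illustration.

```python
# Projections on customer_name, modeled as sets (illustrative data).
customer_names = {"Ann", "Bob", "Cy"}
borrower_names = {"Ann"}     # customers with a loan
depositor_names = {"Bob"}    # customers with an account

# All customers minus those who have a loan or an account.
result = customer_names - (borrower_names | depositor_names)
```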
Reversed Queries (what does it do?)
R1 ← γ_{MaxLoan←max(amount)}(σ_{branch_name = “ABC”}(loan))
Result ← π_{customer_name}(borrower ⋈ (R1 ⋈_{MaxLoan = amount ∧ branch_name = “ABC”} loan))
Find the customer name with the largest loan from branch “ABC”