Relational Algebra Chapter 4, Part A Database Management Systems 3 ed, R. Ramakrishnan and J. Gehrke 1

Relational Query Languages
- Query languages: allow manipulation and retrieval of data from a database.
- The relational model supports simple, powerful QLs:
  - Strong formal foundation based on logic.
  - Allows for much optimization.
- Query languages != programming languages!
  - QLs are not intended to be used for complex calculations.
  - QLs support easy, efficient access to large data sets.

Formal Relational Query Languages
- Two mathematical query languages form the basis for “real” languages (e.g. SQL), and for implementation:
  - Relational Algebra: more operational, very useful for representing execution plans.
  - Relational Calculus: lets users describe what they want, rather than how to compute it. (Non-operational, declarative.)

Preliminaries
- A relation is a set of tuples, so we assume a relation contains no duplicate tuples.
- A query is applied to relation instances, and the result of a query is also a relation instance.
  - Schemas of the input relations for a query are fixed.
  - The schema for the result of a given query is also fixed! It is determined by the definitions of the query language constructs.
- Positional vs. named-field notation:
  - Positional notation is easier for formal definitions; named-field notation is more readable.
  - Both are used in SQL.

Example Instances
- “Sailors” and “Reserves” relations for our examples.
- We’ll use positional or named-field notation, and assume that names of fields in query results are “inherited” from names of fields in query input relations.
[Figure: example instances R1, S1, and S2]

Relational Algebra
- Basic operations:
  - Selection (σ): selects a subset of rows from a relation.
  - Projection (π): deletes unwanted columns from a relation.
  - Cross-product (×): allows us to combine two relations.
  - Set-difference (−): tuples in relation 1, but not in relation 2.
  - Union (∪): tuples in relation 1 or in relation 2.
- Additional operations:
  - Intersection, join, division, renaming.
- Since each operation returns a relation, operations can be composed!

Projection (1/2)
- Keeps the attributes in the projection list and deletes all others.
- Example: [Figure: example projection expression] What does it produce?

Projection (2/2)
- What does it produce? [Figure: second example projection expression]
- Schema of result contains exactly the fields in the projection list, with the same names that they had in the (only) input relation.
- The projection operator has to eliminate duplicates! (Why??)
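
The duplicate-elimination point can be made concrete with a minimal Python sketch. It assumes a relation is modeled as a Python set of tuples plus a tuple of field names; the `project` helper and the sample rows are hypothetical, not taken from the slides.

```python
# Hypothetical Sailors instance with schema (sid, sname, rating, age).
SCHEMA = ("sid", "sname", "rating", "age")
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}


def project(schema, rows, fields):
    """Keep only `fields`; return the result schema and the result rows."""
    positions = [schema.index(f) for f in fields]
    # A set comprehension merges rows that become equal once columns are dropped.
    return tuple(fields), {tuple(row[p] for p in positions) for row in rows}


age_schema, ages = project(SCHEMA, S2, ["age"])
print(age_schema, sorted(ages))
```

Three of the four sample rows share the same age, so projecting on age alone leaves two rows. Without duplicate elimination the result would no longer be a set of tuples, that is, not a relation.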

Selection (1/2)
- [Figure: example selection expression] What does it produce?

Selection (2/2)
- Selects rows that satisfy the selection condition.
- No duplicates in result! (Why?)
- Schema of result is identical to schema of (only) input relation.
- Result relation can be the input for another relational algebra operation! (Operator composition.)
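
Selection and operator composition can be sketched in a few lines of Python. The `select` and `project` helpers, the condition rating > 8, and the sample rows are all assumptions made for illustration.

```python
# Hypothetical Sailors instance with schema (sid, sname, rating, age).
SCHEMA = ("sid", "sname", "rating", "age")
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}


def select(rows, condition):
    """Keep the rows for which `condition` holds; the schema is unchanged."""
    return {row for row in rows if condition(row)}


def project(schema, rows, fields):
    """Keep only the listed fields, eliminating duplicates."""
    positions = [schema.index(f) for f in fields]
    return {tuple(row[p] for p in positions) for row in rows}


rating = SCHEMA.index("rating")
high_rated = select(S2, lambda row: row[rating] > 8)
# Composition: the output of the selection is the input of the projection.
names_and_ratings = project(SCHEMA, high_rated, ["sname", "rating"])
print(sorted(names_and_ratings))
```

Selection cannot introduce duplicates because it only removes rows from an input that had none.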

Union, Intersection, Set-Difference
- All of these operations take two input relations, which must be union-compatible:
  - Same number of fields.
  - “Corresponding” fields have the same type.
- What is the schema of the result?
[Figure: instances S1 and S2]
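
The union-compatibility check can be sketched as follows. The schema encoding as (name, type) pairs, the `set_op` helper, and the sample rows are hypothetical. The sketch also assumes one common convention for the result schema: field names are inherited from the first input.

```python
# Hypothetical schema as (field name, Python type) pairs, plus two instances.
SCHEMA = (("sid", int), ("sname", str), ("rating", int), ("age", float))
S1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
S2 = {(28, "yuppy", 9, 35.0), (31, "lubber", 8, 55.5),
      (44, "guppy", 5, 35.0), (58, "rusty", 10, 35.0)}


def union_compatible(schema_a, schema_b):
    """Same number of fields, and corresponding fields have the same type."""
    same_arity = len(schema_a) == len(schema_b)
    return same_arity and all(
        ta is tb for (_, ta), (_, tb) in zip(schema_a, schema_b)
    )


def set_op(op, schema_a, rows_a, schema_b, rows_b):
    if not union_compatible(schema_a, schema_b):
        raise ValueError("inputs are not union-compatible")
    # Assumed convention: the result inherits field names from the first input.
    return schema_a, op(rows_a, rows_b)


_, union = set_op(set.union, SCHEMA, S1, SCHEMA, S2)
_, common = set_op(set.intersection, SCHEMA, S1, SCHEMA, S2)
_, only_s1 = set_op(set.difference, SCHEMA, S1, SCHEMA, S2)
print(len(union), len(common), len(only_s1))
```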

Cross-Product
- Each row of S1 is paired with each row of R1.
- Result schema has one field per field of S1 and R1, with field names “inherited” if possible.
  - Conflict: both S1 and R1 have a field called sid.
  - Renaming operator: [Figure: renaming expression]
[Figure: S1, R1, and S1 × R1]
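
A small Python sketch of cross-product with a name conflict. The way the clash on sid is resolved here (qualifying the clashing name with a relation label) is only one hypothetical convention, not the renaming operator from the slide; the sample rows are also made up.

```python
from itertools import product

# Hypothetical instances: S1 is (sid, sname, rating, age), R1 is (sid, bid, day).
S1_SCHEMA = ("sid", "sname", "rating", "age")
S1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
R1_SCHEMA = ("sid", "bid", "day")
R1 = {(22, 101, "1996-10-10"), (58, 103, "1996-11-12")}


def cross_product(name_a, schema_a, rows_a, name_b, schema_b, rows_b):
    """Pair every row of A with every row of B; qualify clashing field names."""
    clashes = set(schema_a) & set(schema_b)
    schema = tuple(
        [f"{name_a}.{f}" if f in clashes else f for f in schema_a]
        + [f"{name_b}.{f}" if f in clashes else f for f in schema_b]
    )
    rows = {a + b for a, b in product(rows_a, rows_b)}
    return schema, rows


schema, rows = cross_product("S1", S1_SCHEMA, S1, "R1", R1_SCHEMA, R1)
print(schema)
print(len(rows))
```

With 3 rows in S1 and 2 in R1, the result has 3 × 2 = 6 rows and 4 + 3 = 7 fields.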

Joins
- Condition join: $R \bowtie_c S = \sigma_c(R \times S)$
- Result schema same as that of cross-product.
- Fewer tuples than cross-product, so it might be possible to compute it more efficiently.
[Figure: condition join example on S1 and R1]

Joins
- Equi-join: a special case of condition join where the condition c contains only equalities.
- Result schema similar to cross-product, but only one copy of fields for which equality is specified.
- Natural join: equi-join on all common fields.
[Figure: equi-join example on S1 and R1]
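
The join variants can be sketched in Python as a filtered cross-product. This is a naive nested-loop illustration, not how a real system evaluates joins; the helper names, the example condition S1.sid < R1.sid, and the sample rows are assumptions.

```python
from itertools import product

# Hypothetical instances: S1 is (sid, sname, rating, age), R1 is (sid, bid, day).
S1_SCHEMA = ("sid", "sname", "rating", "age")
S1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
R1_SCHEMA = ("sid", "bid", "day")
R1 = {(22, 101, "1996-10-10"), (58, 103, "1996-11-12")}


def condition_join(rows_a, rows_b, condition):
    """Cross-product followed by a selection on the paired rows."""
    return {a + b for a, b in product(rows_a, rows_b) if condition(a, b)}


def natural_join(schema_a, rows_a, schema_b, rows_b):
    """Equi-join on all common fields, keeping one copy of each common field."""
    common = [f for f in schema_a if f in schema_b]
    extra_b = [i for i, f in enumerate(schema_b) if f not in common]
    schema = tuple(schema_a) + tuple(schema_b[i] for i in extra_b)
    rows = set()
    for a, b in product(rows_a, rows_b):
        if all(a[schema_a.index(f)] == b[schema_b.index(f)] for f in common):
            rows.add(a + tuple(b[i] for i in extra_b))
    return schema, rows


# Condition join with the (assumed) condition S1.sid < R1.sid.
less_than = condition_join(S1, R1, lambda a, b: a[0] < b[0])
nj_schema, nj_rows = natural_join(S1_SCHEMA, S1, R1_SCHEMA, R1)
print(len(less_than), nj_schema, len(nj_rows))
```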

Division
- Not supported as a primitive operator, but useful for expressing queries like: Find sailors who have reserved all boats.
- Let A have 2 fields, x and y; B have only field y:
  - $A/B = \{ \langle x \rangle \mid \forall \langle y \rangle \in B,\ \langle x, y \rangle \in A \}$
  - i.e., A/B contains all x tuples (sailors) such that for every y tuple (boat) in B, there is an xy tuple in A.
  - Or: if the set of y values (boats) associated with an x value (sailor) in A contains all y values in B, the x value is in A/B.
- In general, x and y can be any lists of fields; y is the list of fields in B, and x ∪ y is the list of fields of A.
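
Since division is not a primitive operator, it can be rewritten with projection, cross-product, and set-difference. The standard identity is given here in the notation above, where x is the list of fields of A that are not in B:

```latex
A/B \;=\; \pi_x(A) \;-\; \pi_x\bigl( (\pi_x(A) \times B) - A \bigr)
```

The inner difference collects every pairing of a candidate x with a y from B that is missing from A; projecting it on x gives the disqualified x values, which are then removed from the candidates.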

Examples of Division A/B
[Figure: instance A, instances B1, B2, B3, and the results A/B1, A/B2, A/B3]
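
A worked division example in Python, as a sketch. The `divide` helper and the instance values below are illustrative assumptions and are not claimed to match the figure; the helper removes every x that is missing at least one required pairing.

```python
# Hypothetical instance A with fields (x, y), and three choices of B over y.
A = {
    ("s1", "p1"), ("s1", "p2"), ("s1", "p3"), ("s1", "p4"),
    ("s2", "p1"), ("s2", "p2"),
    ("s3", "p2"),
    ("s4", "p2"), ("s4", "p4"),
}
B1 = {"p2"}
B2 = {"p2", "p4"}
B3 = {"p1", "p2", "p4"}


def divide(a, b):
    """Return the x values paired in `a` with every y value in `b`."""
    candidates = {x for x, _ in a}
    # Pairings that would have to exist, minus those that actually do.
    missing = {(x, y) for x in candidates for y in b} - a
    disqualified = {x for x, _ in missing}
    return candidates - disqualified


print(sorted(divide(A, B1)), sorted(divide(A, B2)), sorted(divide(A, B3)))
```

Adding values to B can only shrink the result, since every x must then satisfy more pairings.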

EXAMPLES

Assume the following relations/schemas
[Figure: schemas and instances, including Sailors and Boats]

Find names of sailors who’ve reserved boat #103
- Solutions 1, 2, and 3: [Figure: three algebra expressions for this query]
- Which one is more efficient? Why?
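
To see why the order of operations matters, here is a Python sketch of two plans for this query. It assumes the schemas Sailors(sid, sname, rating, age) and Reserves(sid, bid, day) with made-up rows; the two plans are illustrations and are not claimed to be the slide's Solutions 1 to 3.

```python
# Hypothetical instances: Sailors(sid, sname, rating, age), Reserves(sid, bid, day).
sailors = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
reserves = {(22, 101, "1996-10-10"), (31, 103, "1996-11-10"), (58, 103, "1996-11-12")}

# Plan A: select bid = 103 first, then join with Sailors on sid, then project sname.
reserves_103 = {r for r in reserves if r[1] == 103}
plan_a = {s[1] for s in sailors for r in reserves_103 if s[0] == r[0]}

# Plan B: join on sid first, then select bid = 103, then project sname.
joined = {s + r[1:] for s in sailors for r in reserves if s[0] == r[0]}
plan_b = {row[1] for row in joined if row[4] == 103}  # index 4 is bid

print(sorted(plan_a), len(reserves_103), len(joined))
```

Both plans return the same names, but Plan A joins against a smaller intermediate relation (2 rows instead of 3 here), and the gap grows with the size of Reserves.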

Find names of sailors who’ve reserved a red boat
- Information about boat color is only available in Boats, so we need an extra join: [Figure: algebra expression]
- A more efficient solution: [Figure: algebra expression]
- A query optimizer can find this, given the first solution!
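
The two formulations can be compared with a Python sketch. It assumes Boats(bid, bname, color) in addition to Sailors and Reserves, with made-up rows; the "pushed down" version below is one plausible reading of the more efficient solution, not a transcription of it.

```python
# Hypothetical instances.
sailors = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
boats = {(101, "Interlake", "blue"), (102, "Interlake", "red"),
         (103, "Clipper", "green"), (104, "Marine", "red")}
reserves = {(22, 101, "1998-10-10"), (22, 102, "1998-10-10"),
            (31, 103, "1998-11-10"), (58, 104, "1998-11-12")}

# Direct version: combine all three relations, then filter on color, then project.
direct = {
    s[1]
    for b in boats for r in reserves for s in sailors
    if b[0] == r[1] and r[0] == s[0] and b[2] == "red"
}

# Pushed-down version: shrink each intermediate result to the columns and rows needed.
red_bids = {b[0] for b in boats if b[2] == "red"}       # select, then project bid
red_sids = {r[0] for r in reserves if r[1] in red_bids}  # join, then project sid
pushed = {s[1] for s in sailors if s[0] in red_sids}     # join, then project sname

print(sorted(direct), sorted(pushed))
```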

Find sailors who’ve reserved a red or a green boat
- Can identify all red or green boats, then find sailors who’ve reserved one of these boats: [Figure: algebra expressions defining Tempboats]
- Can also define Tempboats using union! (How?)
- What happens if ∨ is replaced by ∧ in this query?

Find sailors who’ve reserved a red and a green boat
- Previous approach won’t work! Must identify sailors who’ve reserved red boats, sailors who’ve reserved green boats, then find the intersection (note that sid is a key for Sailors): [Figure: algebra expressions]
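
A short Python sketch shows why the "or" approach does not carry over to "and". The Boats and Reserves rows are hypothetical.

```python
# Hypothetical instances: Boats(bid, bname, color), Reserves(sid, bid, day).
boats = {(101, "Interlake", "blue"), (102, "Interlake", "red"),
         (103, "Clipper", "green"), (104, "Marine", "red")}
reserves = {(22, 102, "1998-10-10"), (22, 103, "1998-10-08"),
            (31, 103, "1998-11-10"), (58, 104, "1998-11-12")}

# "Or": one selection on Boats is enough.
red_or_green = {b[0] for b in boats if b[2] == "red" or b[2] == "green"}
either = {r[0] for r in reserves if r[1] in red_or_green}

# Swapping "or" for "and" in that selection asks for a single boat with two colors.
red_and_green_boats = {b[0] for b in boats if b[2] == "red" and b[2] == "green"}

# "And": compute the two sets of sailors separately, then intersect on sid.
red_bids = {b[0] for b in boats if b[2] == "red"}
green_bids = {b[0] for b in boats if b[2] == "green"}
red_sids = {r[0] for r in reserves if r[1] in red_bids}
green_sids = {r[0] for r in reserves if r[1] in green_bids}
both = red_sids & green_sids

print(sorted(either), sorted(red_and_green_boats), sorted(both))
```

Intersecting on sid rather than on sname is what makes this correct: two different sailors could share a name, but sid is a key.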

Find the names of sailors who’ve reserved all boats
- Uses division; schemas of the input relations to / must be carefully chosen: [Figure: algebra expressions]
- To find sailors who’ve reserved all ‘Interlake’ boats: . . .
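
The remark about choosing schemas carefully can be illustrated in Python. The sketch assumes Reserves(sid, bid, day) and Boats(bid, bname, color) with made-up rows, and a hypothetical `divide` helper; Reserves is first projected onto (sid, bid) and Boats onto bid so that the division is over exactly the boat field.

```python
# Hypothetical instances.
sailors = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
boats = {(101, "Interlake", "blue"), (102, "Interlake", "red"), (103, "Clipper", "green")}
reserves = {(22, 101, "1998-10-10"), (22, 102, "1998-10-11"), (22, 103, "1998-10-12"),
            (31, 101, "1998-11-10"), (31, 102, "1998-11-11"),
            (58, 103, "1998-11-12")}


def divide(pairs, required):
    """x values that appear in `pairs` together with every y in `required`."""
    return {x for x, _ in pairs if all((x, y) in pairs for y in required)}


# Project away `day` first; otherwise x would be (sid, day), not sid.
sid_bid = {(r[0], r[1]) for r in reserves}
all_bids = {b[0] for b in boats}
interlake_bids = {b[0] for b in boats if b[1] == "Interlake"}

reserved_all = divide(sid_bid, all_bids)
reserved_all_interlake = divide(sid_bid, interlake_bids)

names_all = {s[1] for s in sailors if s[0] in reserved_all}
names_interlake = {s[1] for s in sailors if s[0] in reserved_all_interlake}
print(sorted(names_all), sorted(names_interlake))
```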

Find the names of sailors who have reserved at least two boats.
- Hint: create res × res, then pick.
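
One way to follow the hint, sketched in Python under assumed schemas and made-up rows: pair Reserves with itself, then keep the pairs that have the same sid but different bid values.

```python
from itertools import product

# Hypothetical instances: Sailors(sid, sname, rating, age), Reserves(sid, bid, day).
sailors = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
reserves = {(22, 101, "1998-10-10"), (22, 102, "1998-10-10"),
            (31, 103, "1998-11-10"),
            (58, 103, "1998-11-12"), (58, 103, "1998-12-01")}

# res x res: every reservation paired with every reservation.
pairs = set(product(reserves, reserves))

# Same sailor, two different boats.
two_boat_sids = {r1[0] for r1, r2 in pairs if r1[0] == r2[0] and r1[1] != r2[1]}
names = {s[1] for s in sailors if s[0] in two_boat_sids}
print(sorted(names))
```

Sailor 58 in the sample data reserved the same boat on two days, which does not count as two boats because the comparison is on bid, not on day.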

Find the sids of sailors with age over 20 who have not reserved a red boat.
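
A possible approach, sketched in Python with assumed schemas and made-up rows: compute the sids of sailors over 20, compute the sids of sailors who have reserved a red boat, and take the set-difference.

```python
# Hypothetical instances.
sailors = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5),
           (58, "rusty", 10, 35.0), (71, "zorba", 10, 16.0)}
boats = {(101, "Interlake", "blue"), (102, "Interlake", "red")}
reserves = {(22, 102, "1998-10-10"), (31, 101, "1998-11-10"), (71, 102, "1998-11-12")}

over_20 = {s[0] for s in sailors if s[3] > 20}
red_bids = {b[0] for b in boats if b[2] == "red"}
red_sids = {r[0] for r in reserves if r[1] in red_bids}

# Set-difference: over 20, minus anyone with at least one red reservation.
answer = over_20 - red_sids
print(sorted(answer))
```

Sailor 58 has no reservations at all in the sample data and still qualifies, which is why the query is phrased as a difference rather than as a selection over a join with Reserves.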

Summary
- The relational model has rigorously defined query languages that are simple and powerful.
- Relational algebra is more operational; useful as internal representation for query evaluation plans.
- Several ways of expressing a given query; a query optimizer should choose the most efficient version.