# Chapter 3: Relational Model and Relational Algebra & Calculus

• Slides: 111

Chapter 3: Relational Model and Relational Algebra & Calculus ([S] Chp. 2 and 5)
• Structure of Relational Databases
• Relational Algebra
• Tuple Relational Calculus
• Domain Relational Calculus
• Extended Relational-Algebra Operations
• Modification of the Database
• Views
Database System Concepts 3.1 ©Silberschatz, Korth and Sudarshan

Why Study the Relational Model?
• Most widely used model.
  • Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc.
• "Legacy systems" in older models
  • E.g., IBM's IMS
• Recent competitor: object-oriented model
  • ObjectStore, Versant, Ontos
  • A synthesis emerging: object-relational model
    • Informix Universal Server, UniSQL, O2, Oracle, DB2

Basic Structure
• Formally, given sets $D_1, D_2, \ldots, D_n$, a relation $r$ is a subset of $D_1 \times D_2 \times \cdots \times D_n$. Each $D_i$ is a domain. Thus a relation is a set of $n$-tuples $(a_1, a_2, \ldots, a_n)$ where each $a_i \in D_i$.
• Example: if
  customer-name = {Jones, Smith, Curry, Lindsay}
  customer-street = {Main, North, Park}
  customer-city = {Harrison, Rye, Pittsfield}
  then r = {(Jones, Main, Harrison), (Smith, North, Rye), (Curry, North, Rye), (Lindsay, Park, Pittsfield)} is a relation over customer-name × customer-street × customer-city.
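The definition above can be checked mechanically. The following is a minimal Python sketch, using the domains and the relation from the slide; the variable names `all_tuples` and `is_relation` are illustrative choices, not part of the original material.

```python
from itertools import product

# Domains from the slide's example.
customer_name = {"Jones", "Smith", "Curry", "Lindsay"}
customer_street = {"Main", "North", "Park"}
customer_city = {"Harrison", "Rye", "Pittsfield"}

# The relation r from the slide, as a set of 3-tuples.
r = {
    ("Jones", "Main", "Harrison"),
    ("Smith", "North", "Rye"),
    ("Curry", "North", "Rye"),
    ("Lindsay", "Park", "Pittsfield"),
}

# The full Cartesian product D1 x D2 x D3 (4 * 3 * 3 = 36 tuples).
all_tuples = set(product(customer_name, customer_street, customer_city))

# r is a relation over these domains exactly when it is a subset of the product.
is_relation = r <= all_tuples
```

Only 4 of the 36 possible tuples belong to r; any other subset of the product would be an equally valid relation over the same domains.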

Attribute Types
• Each attribute of a relation has a name.
• The set of allowed values for each attribute is called the domain of the attribute.
• Attribute values are (normally) required to be atomic, that is, indivisible.
  • E.g. multivalued attribute values are not atomic.
  • E.g. composite attribute values are not atomic.
• The special value null is a member of every domain.
• The null value causes complications in the definition of many operations.
  • We shall ignore the effect of null values in our main presentation and consider their effect later.

Relation Schema
• $A_1, A_2, \ldots, A_n$ are attributes.
• $R = (A_1, A_2, \ldots, A_n)$ is a relation schema.
  E.g. Customer-schema = (customer-name, customer-street, customer-city)
• r(R) is a relation on the relation schema R.
  E.g. customer (Customer-schema)
Our common notation: R is a relation schema, r is a relation instance. When we say "relation" we mean relation instance or relation schema, depending on the context.

Relation Instance
• The current values (relation instance) of a relation are specified by a table.
• An element t of r is a tuple, represented by a row in a table.
• The number of attributes is the relation degree; the number of rows is the relation cardinality.

The customer relation (columns are attributes, rows are tuples):

| customer-name | customer-street | customer-city |
|---|---|---|
| Jones | Main | Harrison |
| Smith | North | Rye |
| Curry | North | Rye |
| Lindsay | Park | Pittsfield |

Relations are Unordered
• Order of tuples is irrelevant (tuples may be stored in an arbitrary order).
• E.g. account relation with unordered tuples

Example Instance of Students Relation
• Cardinality = 3, degree = 5, all rows distinct

Database
• A database consists of multiple relations.
• Information about an enterprise is broken up into parts (usually an entity), with each relation storing one part of the information. E.g.:
  • account: stores information about accounts
  • depositor: stores information about which customer owns which account
  • customer: stores information about customers
• Storing all information as a single relation such as bank(account-number, balance, customer-name, ...) results in
  • repetition of information (e.g. two customers own an account)
  • the need for null values (e.g. to represent a customer without an account)
• Normalization theory (Chapter 7) deals with how to design relational schemas.

E-R Diagram for the Banking Enterprise

Relation Schema of the Bank Example

The customer Relation

The depositor Relation

The account Relation

Keys
• Let $K \subseteq R$.
• K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation r(R).
  • By "possible r" we mean a relation r that could exist in the enterprise we are modeling.
  • Example: {customer-name, customer-street} and {customer-name} are both superkeys of Customer, if no two customers can possibly have the same name.
• K is a candidate key if K is a minimal superkey.
  • Example: {customer-name} is a candidate key for Customer, since it is a superkey and no proper subset of it is a superkey (assuming no two customers can possibly have the same name).
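As a rough illustration of these definitions, the Python sketch below tests attribute sets against one instance of the customer relation. The function names and the sample rows are assumptions made for the example. Note the limitation: a key is a statement about every possible instance, so checking a single instance can refute a proposed key but can never prove it.

```python
from itertools import combinations

# Illustrative instance of the customer relation (rows as dicts).
customer = [
    {"customer-name": "Jones", "customer-street": "Main", "customer-city": "Harrison"},
    {"customer-name": "Smith", "customer-street": "North", "customer-city": "Rye"},
    {"customer-name": "Curry", "customer-street": "North", "customer-city": "Rye"},
    {"customer-name": "Lindsay", "customer-street": "Park", "customer-city": "Pittsfield"},
]
schema = ("customer-name", "customer-street", "customer-city")

def is_superkey(rows, attrs):
    """True if no two rows of this instance agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, schema):
    """Superkeys of this instance that contain no smaller superkey."""
    keys = []
    # Smaller attribute sets are tried first, so minimality only needs
    # a comparison against the keys already found.
    for size in range(1, len(schema) + 1):
        for attrs in combinations(schema, size):
            contains_key = any(set(k) < set(attrs) for k in keys)
            if is_superkey(rows, attrs) and not contains_key:
                keys.append(attrs)
    return keys

keys = candidate_keys(customer, schema)
```

On this instance {customer-street, customer-city} is not a superkey, because Smith and Curry share both values, which leaves {customer-name} as the only candidate key.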

Determining Keys from E-R Sets
• Strong entity set.
  • The primary key of the entity set becomes the primary key of the relation.
• Weak entity set.
  • The primary key of the relation consists of the union of the primary key of the strong entity set and the discriminator of the weak entity set.
• Relationship set.
  • The union of the primary keys of the related entity sets becomes a superkey of the relation.
    • For binary many-to-one relationship sets, the primary key of the "many" entity set becomes the relation's primary key.
    • For one-to-one relationship sets, the relation's primary key can be that of either entity set.
    • For many-to-many relationship sets, the union of the primary keys becomes the relation's primary key.

Domain Constraints
• Integrity constraints guard against accidental damage to the database, by ensuring that authorized changes to the database do not result in a loss of data consistency.
• Domain constraints are the most elementary form of integrity constraint.
• They test values inserted in the database, and test queries to ensure that the comparisons make sense.
• New domains can be created from existing data types. E.g.
    create domain Dollars numeric(12, 2)
    create domain Pounds numeric(12, 2)
• We cannot assign or compare a value of type Dollars to a value of type Pounds.
  • However, we can convert the type as below:
    (cast r.A as Pounds)
    (the value should also be multiplied by the dollar-to-pound conversion rate)

Domain Constraints (Cont.)
• The check clause in SQL-92 permits domains to be restricted:
  • Use a check clause to ensure that an hourly-wage domain allows only values of at least a specified value.
    create domain hourly-wage numeric(5, 2)
      constraint value-test check (value >= 4.00)
  • The domain has a constraint that ensures that the hourly-wage is at least 4.00.
  • The clause constraint value-test is optional; it is useful to indicate which constraint an update violated.
• Can have complex conditions in a domain check:
  • create domain AccountType char(10)
      constraint account-type-test check (value in ('Checking', 'Saving'))
  • check (branch-name in (select branch-name from branch))

Referential Integrity
• Ensures that a value that appears in one relation for a given set of attributes also appears for a certain set of attributes in another relation.
  • Example: If "Perryridge" is a branch name appearing in one of the tuples in the account relation, then there exists a tuple in the branch relation for branch "Perryridge".
• Formal definition
  • Let $r_1(R_1)$ and $r_2(R_2)$ be relations with primary keys $K_1$ and $K_2$ respectively.
  • The subset $\alpha$ of $R_2$ is a foreign key referencing $K_1$ in relation $r_1$ if for every $t_2$ in $r_2$ there must be a tuple $t_1$ in $r_1$ such that $t_1[K_1] = t_2[\alpha]$.
  • A referential integrity constraint is also called a subset dependency, since it can be written as $\Pi_{\alpha}(r_2) \subseteq \Pi_{K_1}(r_1)$.

Referential Integrity in the E-R Model
• Consider relationship set R between entity sets $E_1$ and $E_2$. The relational schema for R includes the primary keys $K_1$ of $E_1$ and $K_2$ of $E_2$. Then $K_1$ and $K_2$ form foreign keys on the relational schemas for $E_1$ and $E_2$ respectively.
• Weak entity sets are also a source of referential integrity constraints.
  • The relation schema for a weak entity set must include the primary key attributes of the entity set on which it depends.

Checking Referential Integrity on Database Modification
• The following tests must be made in order to preserve the referential integrity constraint $\Pi_{\alpha}(r_2) \subseteq \Pi_{K}(r_1)$.
• Insert. If a tuple $t_2$ is inserted into $r_2$, the system must ensure that there is a tuple $t_1$ in $r_1$ such that $t_1[K] = t_2[\alpha]$. That is, $t_2[\alpha] \in \Pi_{K}(r_1)$.
• Delete. If a tuple $t_1$ is deleted from $r_1$, the system must compute the set of tuples in $r_2$ that reference $t_1$: $\sigma_{\alpha = t_1[K]}(r_2)$. If this set is not empty,
  • either the delete command is rejected as an error, or
  • the tuples that reference $t_1$ must themselves be deleted (cascading deletions are possible).

Database Modification (Cont.)
• Update. There are two cases:
  • If a tuple $t_2$ is updated in relation $r_2$ and the update modifies values for the foreign key $\alpha$, then a test similar to the insert case is made:
    • Let $t_2'$ denote the new value of tuple $t_2$. The system must ensure that $t_2'[\alpha] \in \Pi_{K}(r_1)$.
  • If a tuple $t_1$ is updated in $r_1$, and the update modifies values for the primary key (K), then a test similar to the delete case is made:
    1. The system must compute $\sigma_{\alpha = t_1[K]}(r_2)$ using the old value of $t_1$ (the value before the update is applied).
    2. If this set is not empty,
       1. the update may be rejected as an error, or
       2. the update may be cascaded to the tuples in the set, or
       3. the tuples in the set may be deleted.
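The insert and delete tests can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the key K and the foreign key are single attributes, relations are lists of dicts, and the function names and sample rows are hypothetical.

```python
def key_values(r1, key):
    """Pi_K(r1): the set of key values present in the referenced relation."""
    return {t1[key] for t1 in r1}

def insert_allowed(t2, r1, key, fk):
    """Insert test: t2[fk] must appear among the key values of r1."""
    return t2[fk] in key_values(r1, key)

def referencing_tuples(t1, r2, key, fk):
    """Delete test: the tuples of r2 whose foreign key equals t1[key]."""
    return [t2 for t2 in r2 if t2[fk] == t1[key]]

def delete_from_r1(t1, r1, r2, key, fk, cascade=False):
    """Reject the delete if t1 is still referenced, unless cascading is requested."""
    refs = referencing_tuples(t1, r2, key, fk)
    if refs and not cascade:
        raise ValueError("delete rejected: tuple is still referenced")
    r1_after = [t for t in r1 if t != t1]
    r2_after = [t for t in r2 if t not in refs]
    return r1_after, r2_after

# Illustrative data: account.branch-name references branch.branch-name.
branch = [{"branch-name": "Perryridge"}, {"branch-name": "Downtown"}]
account = [
    {"account-number": "A-102", "branch-name": "Perryridge"},
    {"account-number": "A-101", "branch-name": "Downtown"},
]
```

With this data, deleting the Perryridge branch without `cascade=True` raises an error, while the cascading variant also removes account A-102. The update case reuses the same two tests: a changed foreign key is checked like an insert, a changed primary key like a delete.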

Referential Integrity in SQL
• Primary and candidate keys and foreign keys can be specified as part of the SQL create table statement:
  • The primary key clause lists attributes that comprise the primary key.
  • The unique key clause lists attributes that comprise a candidate key.
  • The foreign key clause lists the attributes that comprise the foreign key and the name of the relation referenced by the foreign key.
• By default, a foreign key references the primary key attributes of the referenced table:
    foreign key (account-number) references account
• Short form for specifying a single column as foreign key:
    account-number char(10) references account
• Reference columns in the referenced table can be explicitly specified, but must be declared as primary/candidate keys:
    foreign key (account-number) references account(account-number)

Referential Integrity in SQL – Example
    create table customer
      (customer-name char(20),
       customer-street char(30),
       customer-city char(30),
       primary key (customer-name))

    create table branch
      (branch-name char(15),
       branch-city char(30),
       assets integer,
       primary key (branch-name))

Referential Integrity in SQL – Example (Cont.)
    create table account
      (account-number char(10),
       branch-name char(15),
       balance integer,
       primary key (account-number),
       foreign key (branch-name) references branch)

    create table depositor
      (customer-name char(20),
       account-number char(10),
       primary key (customer-name, account-number),
       foreign key (account-number) references account,
       foreign key (customer-name) references customer)

Cascading Actions in SQL
    create table account
      (...
       foreign key (branch-name) references branch
         on delete cascade
         on update cascade,
       ...)
• Due to the on delete cascade clause, if a delete of a tuple in branch results in a referential-integrity constraint violation, the delete "cascades" to the account relation, deleting the tuple that refers to the branch that was deleted.
• Cascading updates are similar.

Cascading Actions in SQL (Cont.)
• If there is a chain of foreign-key dependencies across multiple relations, with on delete cascade specified for each dependency, a deletion or update at one end of the chain can propagate across the entire chain.
• If a cascading update or delete causes a constraint violation that cannot be handled by a further cascading operation, the system aborts the transaction.
  • As a result, all the changes caused by the transaction and its cascading actions are undone.
• Referential integrity is only checked at the end of a transaction.
  • Intermediate steps are allowed to violate referential integrity provided later steps remove the violation.
  • Otherwise it would be impossible to create some database states, e.g. insert two tuples whose foreign keys point to each other.
    • E.g. the spouse attribute of relation marriedperson(name, address, spouse)

Referential Integrity in SQL (Cont.)
• Alternatives to cascading:
  • on delete set null
  • on delete set default
• Null values in foreign key attributes complicate SQL referential integrity semantics, and are best prevented using not null.
  • If any attribute of a foreign key is null, the tuple is defined to satisfy the foreign key constraint!

Foreign Keys and Relationships
• In the relational model, foreign keys represent relationships.
• In contrast to primary (candidate) keys, foreign keys may be null, representing partial participation in a relationship.
• One-to-one is represented as a foreign key in one of the participating relations. Always prefer the relation where the participation is total, to avoid null values.
• Many-to-one is represented as a foreign key in the "many" relation.
• Many-to-many is represented by two foreign keys, one for each of the two participating relations.

Mapping ER to Relational and Vice Versa
• ER to relational: see slides in Chp 2.
• Relational to ER: must determine for each foreign key which type of relationship it represents (may not be unique) or the existence of a weak entity set.

Schema Diagram for the Banking Enterprise (another style to denote foreign keys…)

Possible relational database state corresponding to the COMPANY scheme

Database Definitions in SQL
    CREATE TABLE EMPLOYEE
      ( FNAME VARCHAR(15) NOT NULL,
        MINIT CHAR,
        LNAME VARCHAR(15) NOT NULL,
        SSN CHAR(9),
        BDATE DATE,
        ADDRESS VARCHAR(30),
        SEX CHAR,
        SALARY DECIMAL(10, 2),
        SUPERSSN CHAR(9),
        DNO INT NOT NULL,
        PRIMARY KEY (SSN),
        FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE (SSN),
        FOREIGN KEY (DNO) REFERENCES DEPARTMENT (DNUMBER) );

    CREATE TABLE DEPARTMENT
      ( DNAME VARCHAR(15) NOT NULL,
        DNUMBER INT NOT NULL,
        MGRSSN CHAR(9) NOT NULL,
        MGRSTARTDATE DATE,
        PRIMARY KEY (DNUMBER),
        UNIQUE (DNAME),
        FOREIGN KEY (MGRSSN) REFERENCES EMPLOYEE (SSN) );

    CREATE TABLE DEPT_LOCATIONS
      ( DNUMBER INT NOT NULL,
        DLOCATION VARCHAR(15) NOT NULL,
        PRIMARY KEY (DNUMBER, DLOCATION),
        FOREIGN KEY (DNUMBER) REFERENCES DEPARTMENT (DNUMBER) );

Database Definitions in SQL (Cont.)
    CREATE TABLE PROJECT
      ( PNAME VARCHAR(15) NOT NULL,
        PNUMBER INT NOT NULL,
        PLOCATION VARCHAR(15),
        DNUM INT NOT NULL,
        PRIMARY KEY (PNUMBER),
        UNIQUE (PNAME),
        FOREIGN KEY (DNUM) REFERENCES DEPARTMENT (DNUMBER) );

    CREATE TABLE WORKS_ON
      ( ESSN CHAR(9) NOT NULL,
        PNO INT NOT NULL,
        HOURS DECIMAL(3, 1) NOT NULL,
        PRIMARY KEY (ESSN, PNO),
        FOREIGN KEY (ESSN) REFERENCES EMPLOYEE (SSN),
        FOREIGN KEY (PNO) REFERENCES PROJECT (PNUMBER) );

    CREATE TABLE DEPENDENT
      ( ESSN CHAR(9) NOT NULL,
        DEPENDENT_NAME VARCHAR(15) NOT NULL,
        SEX CHAR,
        BDATE DATE,
        RELATIONSHIP VARCHAR(8),
        PRIMARY KEY (ESSN, DEPENDENT_NAME),
        FOREIGN KEY (ESSN) REFERENCES EMPLOYEE (SSN) );

Query Languages
• Language in which a user requests information from the database.
• Categories of languages
  • procedural
  • non-procedural
• "Pure" languages:
  • Relational Algebra
  • Tuple Relational Calculus
  • Domain Relational Calculus
• Pure languages form the underlying basis of the query languages that people use.

Relational Algebra
• Procedural language
• Six basic operators
  • select
  • project
  • union
  • set difference
  • Cartesian product
  • rename
• The operators take one, two or more relations as inputs and give a new relation as a result.

Relational (Table) Algebra: Operator Summary
• $\sigma_B$: select
• $\Pi_{A, B, C}$: project
• $A \times B$: Cartesian product
• $\cup$: union
• $\cap$: intersection
• $-$: difference
• $\bowtie_B$: join
• $\div$: division

Select Operation – Example
• Relation r(A, B, C, D) = {(α, α, 1, 7), (α, β, 5, 7), (β, β, 12, 3), (β, β, 23, 10)}
• $\sigma_{A = B \wedge D > 5}(r)$ = {(α, α, 1, 7), (β, β, 23, 10)}

Select Operation
• Notation: $\sigma_p(r)$
• p is called the selection predicate.
• Defined as: $\sigma_p(r) = \{\, t \mid t \in r \text{ and } p(t) \,\}$
  where p is a formula in propositional calculus consisting of terms connected by $\wedge$ (and), $\vee$ (or), $\neg$ (not).
  Each term is one of: <attribute> op <attribute> or <attribute> op <constant>, where op is one of $=, \neq, >, \geq, <, \leq$.
• Example of selection: $\sigma_{\text{branch-name} = \text{"Perryridge"}}(\text{account})$
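A selection is simply a filter over the tuples of a relation. The sketch below is one possible Python rendering, with relations as lists of dicts and the predicate as a function; the `select` helper and the sample account rows are illustrative assumptions.

```python
# Illustrative instance of the account relation.
account = [
    {"account-number": "A-102", "branch-name": "Perryridge", "balance": 400},
    {"account-number": "A-201", "branch-name": "Perryridge", "balance": 900},
    {"account-number": "A-217", "branch-name": "Brighton", "balance": 750},
]

def select(relation, predicate):
    """sigma_p(r): the tuples t of r for which p(t) holds."""
    return [t for t in relation if predicate(t)]

# sigma_{branch-name = "Perryridge"}(account)
perryridge = select(account, lambda t: t["branch-name"] == "Perryridge")

# Terms can be combined with and / or / not, as in the definition.
large_perryridge = select(
    account,
    lambda t: t["branch-name"] == "Perryridge" and t["balance"] > 500,
)
```

The result has the same schema as the input; only the number of tuples can shrink.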

Project Operation – Example
• Relation r(A, B, C) = {(α, 10, 1), (α, 20, 1), (β, 30, 1), (β, 40, 2)}
• $\Pi_{A, C}(r)$: the projected rows (α, 1), (α, 1), (β, 1), (β, 2) reduce to {(α, 1), (β, 1), (β, 2)} once duplicates are removed.

Project Operation
• Notation: $\Pi_{A_1, A_2, \ldots, A_k}(r)$, where $A_1, A_2, \ldots, A_k$ are attribute names and r is a relation name.
• The result is defined as the relation of k columns obtained by erasing the columns that are not listed.
• Duplicate rows are removed from the result, since relations are sets.
• E.g. to eliminate the branch-name attribute of account: $\Pi_{\text{account-number}, \text{balance}}(\text{account})$
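Projection can be sketched in Python as building a set of shortened tuples, so that duplicate elimination falls out of the set type. The `project` helper and the sample rows are assumptions made for the illustration.

```python
# Illustrative instance of the account relation.
account = [
    {"account-number": "A-102", "branch-name": "Perryridge", "balance": 400},
    {"account-number": "A-201", "branch-name": "Perryridge", "balance": 900},
    {"account-number": "A-217", "branch-name": "Brighton", "balance": 750},
]

def project(relation, attrs):
    """Pi_attrs(r): keep only the listed attributes; the set drops duplicate rows."""
    return {tuple(t[a] for a in attrs) for t in relation}

# Pi_{account-number, balance}(account): eliminates branch-name.
without_branch = project(account, ("account-number", "balance"))

# Pi_{branch-name}(account): two accounts share a branch, so one row disappears.
branches = project(account, ("branch-name",))
```

Three input tuples project onto only two branch names, which shows why the result of a projection can have fewer rows than its input.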

Union Operation – Example
• Relations r(A, B) = {(α, 1), (α, 2), (β, 1)} and s(A, B) = {(α, 2), (β, 3)}
• $r \cup s$ = {(α, 1), (α, 2), (β, 1), (β, 3)}

Union Operation
• Notation: $r \cup s$
• Defined as: $r \cup s = \{\, t \mid t \in r \text{ or } t \in s \,\}$
• For $r \cup s$ to be valid:
  1. r, s must have the same arity (same number of attributes).
  2. The attribute domains must be compatible (e.g., the 2nd column of r deals with the same type of values as does the 2nd column of s; usually r and s have the same schema).
• E.g. to find all customers with either an account or a loan: $\Pi_{\text{customer-name}}(\text{depositor}) \cup \Pi_{\text{customer-name}}(\text{borrower})$

Set Difference Operation – Example
• Relations r(A, B) = {(α, 1), (α, 2), (β, 1)} and s(A, B) = {(α, 2), (β, 3)}
• $r - s$ = {(α, 1), (β, 1)}

Set Difference Operation
• Notation: $r - s$
• Defined as: $r - s = \{\, t \mid t \in r \text{ and } t \notin s \,\}$
• Set differences must be taken between compatible relations.
  • r and s must have the same arity.
  • Attribute domains of r and s must be compatible.

Set-Intersection Operation
• Notation: $r \cap s$
• Defined as: $r \cap s = \{\, t \mid t \in r \text{ and } t \in s \,\}$
• Assume:
  • r, s have the same arity
  • attributes of r and s are compatible
• Note: $r \cap s = r - (r - s)$
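Because relations are sets, Python's built-in set type mirrors union, difference and intersection directly. The short sketch below uses two small illustrative relations over (A, B) to check the identity r ∩ s = r − (r − s); the data values are assumptions chosen for the example.

```python
# Two compatible relations over (A, B), as sets of tuples (illustrative values).
r = {("α", 1), ("α", 2), ("β", 1)}
s = {("α", 2), ("β", 3)}

union = r | s          # r ∪ s
difference = r - s     # r − s
intersection = r & s   # r ∩ s

# Intersection is not a basic operator: it can be derived from difference.
derived_intersection = r - (r - s)
```

This is why intersection does not appear among the six basic operators: it adds convenience but no expressive power.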

Set-Intersection Operation – Example
• Relations r(A, B) = {(α, 1), (α, 2), (β, 1)} and s(A, B) = {(α, 2), (β, 3)}
• $r \cap s$ = {(α, 2)}


Cartesian-Product Operation – Example
• Relations r(A, B) = {(α, 1), (β, 2)} and s(C, D, E) = {(α, 10, a), (β, 10, a), (β, 20, b), (γ, 10, b)}
• $r \times s$ over (A, B, C, D, E) = {(α, 1, α, 10, a), (α, 1, β, 10, a), (α, 1, β, 20, b), (α, 1, γ, 10, b), (β, 2, α, 10, a), (β, 2, β, 10, a), (β, 2, β, 20, b), (β, 2, γ, 10, b)}

Cartesian-Product Operation
• Notation: $r \times s$
• Defined as: $r \times s = \{\, t\,q \mid t \in r \text{ and } q \in s \,\}$
• Assume that the attributes of r(R) and s(S) are disjoint (that is, $R \cap S = \emptyset$).
• If the attributes of r(R) and s(S) are not disjoint, then renaming must be used.

Rename Operation
• Allows us to name, and therefore to refer to, the results of relational-algebra expressions.
• Allows us to refer to a relation by more than one name.
• Example: $\rho_X(E)$ returns the expression E under the name X.
• If a relational-algebra expression E has arity n, then $\rho_{X(A_1, A_2, \ldots, A_n)}(E)$ returns the result of expression E under the name X, and with the attributes renamed to $A_1, A_2, \ldots, A_n$.

Composition of Operations
• Can build expressions using multiple operations.
• Example: $\sigma_{A = C}(r \times s)$
• $r \times s$ over (A, B, C, D, E) = {(α, 1, α, 10, a), (α, 1, β, 10, a), (α, 1, β, 20, b), (α, 1, γ, 10, b), (β, 2, α, 10, a), (β, 2, β, 10, a), (β, 2, β, 20, b), (β, 2, γ, 10, b)}
• $\sigma_{A = C}(r \times s)$ = {(α, 1, α, 10, a), (β, 2, β, 10, a), (β, 2, β, 20, b)}

Join or Theta Join
• A selection over a Cartesian product: $R \bowtie_B S = \sigma_B(R \times S)$
• Meaning: for every row r of R, output the combination of r with every row s of S for which the pair satisfies condition B.
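Read as "select over a product", a theta join can be sketched directly in Python. The sketch assumes the two relations have disjoint attribute names, as the Cartesian product requires; the helper name `theta_join` and the sample data are illustrative only.

```python
from itertools import product

def theta_join(r, s, condition):
    """sigma_B(r x s): pair every row of r with every row of s, keep the pairs satisfying B."""
    return [{**t, **q} for t, q in product(r, s) if condition(t, q)]

# Illustrative relations with disjoint attribute names.
r = [{"A": 1, "B": 10}, {"A": 2, "B": 20}]
s = [{"C": 1, "D": "x"}, {"C": 3, "D": "y"}]

# Equi-join on A = C, and a join on the inequality A < C.
equal = theta_join(r, s, lambda t, q: t["A"] == q["C"])
less = theta_join(r, s, lambda t, q: t["A"] < q["C"])
```

The condition may be any comparison between attributes of the two sides; equality is just the most common special case.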

Natural-Join Operation
• Notation: $r \bowtie s$
• Let r and s be relations on schemas R and S respectively. Then $r \bowtie s$ is a relation on schema $R \cup S$ obtained as follows:
  • Consider each pair of tuples $t_r$ from r and $t_s$ from s.
  • If $t_r$ and $t_s$ have the same value on each of the attributes in $R \cap S$, add a tuple t to the result, where
    • t has the same value as $t_r$ on r
    • t has the same value as $t_s$ on s
• Example: R = (A, B, C, D), S = (E, B, D)
  • Result schema = (A, B, C, D, E)
  • $r \bowtie s$ is defined as: $\Pi_{r.A,\, r.B,\, r.C,\, r.D,\, s.E}(\sigma_{r.B = s.B \,\wedge\, r.D = s.D}(r \times s))$
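The pairwise procedure above translates almost line by line into Python. In this sketch relations are lists of dicts, so the common attributes R ∩ S are simply the shared dictionary keys; `natural_join` and the sample depositor and account rows are hypothetical names and data chosen for the illustration.

```python
def natural_join(r, s):
    """Combine every pair (tr, ts) that agrees on all attributes common to both."""
    result = []
    for tr in r:
        for ts in s:
            common = tr.keys() & ts.keys()          # R ∩ S
            if all(tr[a] == ts[a] for a in common):
                joined = {**tr, **ts}               # schema R ∪ S
                if joined not in result:            # relations are sets
                    result.append(joined)
    return result

# Illustrative data: the shared attribute is account-number.
depositor = [
    {"customer-name": "Jones", "account-number": "A-102"},
    {"customer-name": "Smith", "account-number": "A-217"},
    {"customer-name": "Hayes", "account-number": "A-999"},
]
account = [
    {"account-number": "A-102", "branch-name": "Perryridge"},
    {"account-number": "A-217", "branch-name": "Brighton"},
]

joined = natural_join(depositor, account)
```

Hayes disappears from the result because no account tuple matches A-999. If the two relations share no attribute at all, the condition is vacuously true and the function degenerates to the Cartesian product.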

Natural Join Operation – Example
• Relation r(A, B, C, D) = {(α, 1, α, a), (β, 2, γ, a), (γ, 4, β, b), (α, 1, γ, a), (δ, 2, β, b)}
• Relation s(B, D, E) = {(1, a, α), (3, a, β), (1, a, γ), (2, b, δ), (3, b, ∈)}
• $r \bowtie s$ over (A, B, C, D, E) = {(α, 1, α, a, α), (α, 1, α, a, γ), (α, 1, γ, a, α), (α, 1, γ, a, γ), (δ, 2, β, b, δ)}

Natural-Join Operation – my definition
• Notation: $r \bowtie s$, also $r * s$
• Let r and s be relations on schemas R and S respectively. Then $r \bowtie s$ is a relation on schema $R \cup S$ obtained as follows:
  • r and s are joined by some equi-join.
  • The redundant (duplicate) attributes are removed.
• Example: R = (A, B, C, D), S = (E, B, D)
  • The equi-join may be on B only.
• Examples:
  • Dept natural join Emp on Emp-id
  • Dept natural join Emp on Mgr-id
• Importance: natural joins along a foreign key express a relationship!
• To avoid confusion: write the predicate B explicitly!


Division Operation $r \div s$
• Suited to queries that include the phrase "for all".
• Let r and s be relations on schemas R and S respectively, where
  • $R = (A_1, \ldots, A_m, B_1, \ldots, B_n)$
  • $S = (B_1, \ldots, B_n)$
• The result of $r \div s$ is a relation on schema $R - S = (A_1, \ldots, A_m)$:
  $r \div s = \{\, t \mid t \in \Pi_{R-S}(r) \,\wedge\, \forall u \in s\ (tu \in r) \,\}$

Division Operation – Example
• Relation r(A, B) = {(α, 1), (α, 2), (α, 3), (β, 1), (γ, 1), (δ, 1), (δ, 3), (δ, 4), (∈, 6), (∈, 1), (β, 2)}
• Relation s(B) = {(1), (2)}
• $r \div s$ over (A) = {(α), (β)}

Another Division Example
• Relation r(A, B, C, D, E) = {(α, a, α, a, 1), (α, a, γ, a, 1), (α, a, γ, b, 1), (β, a, γ, a, 1), (β, a, γ, b, 3), (γ, a, γ, a, 1), (γ, a, γ, b, 1), (γ, a, β, b, 1)}
• Relation s(D, E) = {(a, 1), (b, 1)}
• $r \div s$ over (A, B, C) = {(α, a, γ), (γ, a, γ)}

Division Operation (Cont.)
• Property
  • Let $q = r \div s$.
  • Then q is the largest relation satisfying $q \times s \subseteq r$.
• Definition in terms of the basic algebra operations: let r(R) and s(S) be relations, and let $S \subseteq R$. Then
  $r \div s = \Pi_{R-S}(r) - \Pi_{R-S}\big((\Pi_{R-S}(r) \times s) - \Pi_{R-S,\,S}(r)\big)$
• To see why:
  • $\Pi_{R-S,\,S}(r)$ simply reorders the attributes of r.
  • $T = \Pi_{R-S}\big((\Pi_{R-S}(r) \times s) - \Pi_{R-S,\,S}(r)\big)$ gives those tuples t in $\Pi_{R-S}(r)$ such that for some tuple $u \in s$, $tu \notin r$.
  • Therefore $\Pi_{R-S}(r) - T$ is what we need!
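The derivation from basic operators can be followed step by step in Python. The sketch below is restricted, as a simplifying assumption, to a binary relation r over (A, B) divided by a unary relation s over (B); the function name `divide` and the customer and branch values are illustrative.

```python
def divide(r, s):
    """r ÷ s for r a set of (a, b) pairs and s a set of b values."""
    candidates = {a for a, _ in r}                        # Pi_{R-S}(r)
    all_pairs = {(a, b) for a in candidates for b in s}   # Pi_{R-S}(r) x s
    missing = all_pairs - r                               # pairs tu that are not in r
    disqualified = {a for a, _ in missing}                # T in the derivation
    return candidates - disqualified                      # Pi_{R-S}(r) - T

# Illustrative data: which customers have an account at every listed branch?
has_account_at = {
    ("Jones", "Downtown"),
    ("Jones", "Uptown"),
    ("Smith", "Downtown"),
    ("Hayes", "Uptown"),
    ("Hayes", "Downtown"),
    ("Hayes", "Brighton"),
}
branches = {"Downtown", "Uptown"}

everywhere = divide(has_account_at, branches)
```

Smith is disqualified because the pair (Smith, Uptown) is missing from r. Hayes qualifies even though he also has an account at a branch outside s, since division only demands that every tuple of s is covered.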

Illustrating the division operation. (a) Dividing SSN_PNOS by SMITH_PNOS. (b) $T \leftarrow R \div S$

Banking Example
• branch (branch-name, branch-city, assets)
• customer (customer-name, customer-street, customer-city)
• account (account-number, branch-name, balance)
• loan (loan-number, branch-name, amount)
• depositor (customer-name, account-number)
• borrower (customer-name, loan-number)

Example Queries
• Find all loans of over \$1200:
  $\sigma_{\text{amount} > 1200}(\text{loan})$
• Find the loan number for each loan of an amount greater than \$1200:
  $\Pi_{\text{loan-number}}(\sigma_{\text{amount} > 1200}(\text{loan}))$

Example Queries
• Find the names of all customers who have a loan, an account, or both, from the bank:
  $\Pi_{\text{customer-name}}(\text{borrower}) \cup \Pi_{\text{customer-name}}(\text{depositor})$
• Find the names of all customers who have a loan and an account at the bank:
  $\Pi_{\text{customer-name}}(\text{borrower}) \cap \Pi_{\text{customer-name}}(\text{depositor})$

Example Queries
• Find the names of all customers who have a loan at the Perryridge branch:
  $\Pi_{\text{customer-name}}(\sigma_{\text{branch-name} = \text{"Perryridge"}}(\sigma_{\text{borrower.loan-number} = \text{loan.loan-number}}(\text{borrower} \times \text{loan})))$
• Find the names of all customers who have a loan at the Perryridge branch but do not have an account at any branch of the bank:
  $\Pi_{\text{customer-name}}(\sigma_{\text{branch-name} = \text{"Perryridge"}}(\sigma_{\text{borrower.loan-number} = \text{loan.loan-number}}(\text{borrower} \times \text{loan}))) - \Pi_{\text{customer-name}}(\text{depositor})$

Example Queries
• Find the names of all customers who have a loan at the Perryridge branch.
  • Query 1:
    $\Pi_{\text{customer-name}}(\sigma_{\text{branch-name} = \text{"Perryridge"}}(\sigma_{\text{borrower.loan-number} = \text{loan.loan-number}}(\text{borrower} \times \text{loan})))$
  • Query 2:
    $\Pi_{\text{customer-name}}(\sigma_{\text{loan.loan-number} = \text{borrower.loan-number}}((\sigma_{\text{branch-name} = \text{"Perryridge"}}(\text{loan})) \times \text{borrower}))$
• Which one is more efficient?

Example Queries
• Find the largest account balance.
  • Rename the account relation as d.
  • The query is:
    $\Pi_{\text{balance}}(\text{account}) - \Pi_{\text{account.balance}}(\sigma_{\text{account.balance} < d.\text{balance}}(\text{account} \times \rho_d(\text{account})))$
  • The second term is the set of all balances that are smaller than some other balance.
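The "largest balance" trick, which subtracts every balance that loses at least one comparison, can be traced with a few Python set operations. The balance values are illustrative, and the variable names are not part of the original slides.

```python
# Pi_balance(account): the set of balances (illustrative values).
balances = {400, 900, 750, 700}

# Pi_{account.balance}(sigma_{account.balance < d.balance}(account x rho_d(account))):
# every balance that is smaller than some other balance.
not_largest = {a for a in balances for d in balances if a < d}

# What remains after the difference is the maximum.
largest = balances - not_largest
```

The product compares every balance with every other one, so the expression costs quadratic work for something an aggregate function computes in one pass; its value is in showing that the basic operators suffice.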

Assignment Operation
• The assignment operation ($\leftarrow$) provides a convenient way to express complex queries.
  • Write the query as a sequential program consisting of
    • a series of assignments
    • followed by an expression whose value is displayed as the result of the query.
  • Assignment must always be made to a temporary relation variable.
• Example: write $r \div s$ as
    $temp_1 \leftarrow \Pi_{R-S}(r)$
    $temp_2 \leftarrow \Pi_{R-S}((temp_1 \times s) - \Pi_{R-S,\,S}(r))$
    $result = temp_1 - temp_2$
  • The result to the right of the $\leftarrow$ is assigned to the relation variable on the left of the $\leftarrow$.
  • May use the variable in subsequent expressions.

Example Queries
• Find all customers who have an account from at least the "Downtown" and the "Uptown" branches.
  • Query 1:
    $\Pi_{CN}(\sigma_{BN = \text{"Downtown"}}(\text{depositor} \bowtie \text{account})) \cap \Pi_{CN}(\sigma_{BN = \text{"Uptown"}}(\text{depositor} \bowtie \text{account}))$
    where CN denotes customer-name and BN denotes branch-name.
  • Query 2, using division:
    $\Pi_{\text{customer-name},\, \text{branch-name}}(\text{depositor} \bowtie \text{account}) \div \rho_{temp(\text{branch-name})}(\{(\text{"Downtown"}), (\text{"Uptown"})\})$

Example Queries
• Find all customers who have an account at all branches located in Brooklyn city:
  $\Pi_{\text{customer-name},\, \text{branch-name}}(\text{depositor} \bowtie \text{account}) \div \Pi_{\text{branch-name}}(\sigma_{\text{branch-city} = \text{"Brooklyn"}}(\text{branch}))$
• Note the projection on the right-hand side before the division!

Extended Relational-Algebra Operations
• Generalized Projection
• Outer Join
• Aggregate Functions

Generalized Projection
• Extends the projection operation by allowing arithmetic functions to be used in the projection list:
  $\Pi_{F_1, F_2, \ldots, F_n}(E)$
• E is any relational-algebra expression.
• Each of $F_1, F_2, \ldots, F_n$ is an arithmetic expression involving constants and attributes in the schema of E.
• Given relation credit-info(customer-name, limit, credit-balance), find how much more each person can spend:
  $\Pi_{\text{customer-name},\ \text{limit} - \text{credit-balance}}(\text{credit-info})$

Aggregate Functions and Operations
• An aggregation function takes a collection of values and returns a single value as a result.
  • avg: average value
  • min: minimum value
  • max: maximum value
  • sum: sum of values
  • count: number of values
• Aggregate operation in relational algebra:
  ${}_{G_1, G_2, \ldots, G_n}\,\mathcal{G}_{F_1(A_1), F_2(A_2), \ldots, F_n(A_n)}(E)$
  • E is any relational-algebra expression.
  • $G_1, G_2, \ldots, G_n$ is a list of attributes on which to group (can be empty).
  • Each $F_i$ is an aggregate function.
  • Each $A_i$ is an attribute name.

Aggregate Operation – Example
• Relation r(A, B, C) = {(α, α, 7), (α, β, 7), (β, β, 3), (β, β, 10)}
• $\mathcal{G}_{sum(C)}(r)$ gives a single tuple: sum-C = 27

Aggregate Operation – Example
• Relation account grouped by branch-name:

| branch-name | account-number | balance |
|---|---|---|
| Perryridge | A-102 | 400 |
| Perryridge | A-201 | 900 |
| Brighton | A-217 | 750 |
| Brighton | A-215 | 750 |
| Redwood | A-222 | 700 |

• ${}_{\text{branch-name}}\,\mathcal{G}_{sum(\text{balance})}(\text{account})$:

| branch-name | sum of balance |
|---|---|
| Perryridge | 1300 |
| Brighton | 1500 |
| Redwood | 700 |

Aggregate Functions (Cont.)
• The result of aggregation does not have a name.
  • Can use the rename operation to give it a name.
  • For convenience, we permit renaming as part of the aggregate operation:
    ${}_{\text{branch-name}}\,\mathcal{G}_{sum(\text{balance}) \text{ as sum-balance}}(\text{account})$
• Note:
  • branch-name is the group-by attribute
  • sum is the function
  • balance is the attribute on which the function operates
  • account is the relation expression
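Grouping and summing can be sketched in Python with a dictionary keyed by the group-by attribute. The account rows below are illustrative values chosen to be consistent with the per-branch totals 1300, 1500 and 700, and `group_sum` is a hypothetical helper name.

```python
from collections import defaultdict

# Illustrative account tuples: (branch-name, account-number, balance).
account = [
    ("Perryridge", "A-102", 400),
    ("Perryridge", "A-201", 900),
    ("Brighton", "A-217", 750),
    ("Brighton", "A-215", 750),
    ("Redwood", "A-222", 700),
]

def group_sum(rows, group_index, value_index):
    """branch-name G sum(balance): one total per distinct grouping value."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_index]] += row[value_index]
    return dict(totals)

sum_balance = group_sum(account, 0, 2)

# With an empty grouping list the whole relation forms a single group.
total_balance = sum(row[2] for row in account)
```

The two Brighton accounts hold the same balance, 750. The function sums over the full tuples, so both are counted; summing a projection onto balance alone would first collapse the duplicates and give the wrong total.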


Outer Join
• An extension of the join operation that avoids loss of information.
• Computes the join and then adds tuples from one relation that do not match tuples in the other relation to the result of the join.
• Uses null values:
  • null signifies that the value is unknown or does not exist.
  • All comparisons involving null are (roughly speaking) false by definition.
    • We will study the precise meaning of comparisons with nulls later.

Outer Join – Example
• Relation loan:

| loan-number | branch-name | amount |
|---|---|---|
| L-170 | Downtown | 3000 |
| L-230 | Redwood | 4000 |
| L-260 | Perryridge | 1700 |

• Relation borrower:

| customer-name | loan-number |
|---|---|
| Jones | L-170 |
| Smith | L-230 |
| Hayes | L-155 |

Outer Join – Example

• Inner join: loan ⋈ borrower

| loan-number | branch-name | amount | customer-name |
|---|---|---|---|
| L-170 | Downtown | 3000 | Jones |
| L-230 | Redwood | 4000 | Smith |

• Left outer join: loan ⟕ borrower

| loan-number | branch-name | amount | customer-name |
|---|---|---|---|
| L-170 | Downtown | 3000 | Jones |
| L-230 | Redwood | 4000 | Smith |
| L-260 | Perryridge | 1700 | null |

Outer Join – Example

• Right outer join: loan ⟖ borrower

| loan-number | branch-name | amount | customer-name |
|---|---|---|---|
| L-170 | Downtown | 3000 | Jones |
| L-230 | Redwood | 4000 | Smith |
| L-155 | null | null | Hayes |

• Full outer join: loan ⟗ borrower

| loan-number | branch-name | amount | customer-name |
|---|---|---|---|
| L-170 | Downtown | 3000 | Jones |
| L-230 | Redwood | 4000 | Smith |
| L-260 | Perryridge | 1700 | null |
| L-155 | null | null | Hayes |
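
The four join variants on loan and borrower can be reproduced with a small Python sketch. It is an illustration under stated assumptions rather than a reference implementation: the function `outer_join`, its `how` parameter, and the list-of-dicts representation are invented here, `None` stands in for null, and join attributes are assumed never to be null.

```python
LOAN_ATTRS = ["loan-number", "branch-name", "amount"]
BORROWER_ATTRS = ["customer-name", "loan-number"]

loan = [
    {"loan-number": "L-170", "branch-name": "Downtown", "amount": 3000},
    {"loan-number": "L-230", "branch-name": "Redwood", "amount": 4000},
    {"loan-number": "L-260", "branch-name": "Perryridge", "amount": 1700},
]
borrower = [
    {"customer-name": "Jones", "loan-number": "L-170"},
    {"customer-name": "Smith", "loan-number": "L-230"},
    {"customer-name": "Hayes", "loan-number": "L-155"},
]


def outer_join(left, right, left_attrs, right_attrs, how="inner"):
    """Natural join on the common attributes; how is inner, left, right or full."""
    common = [a for a in left_attrs if a in right_attrs]
    all_attrs = left_attrs + [a for a in right_attrs if a not in left_attrs]

    result = []
    matched_right = set()
    for l in left:
        found = False
        for i, r in enumerate(right):
            if all(l[a] == r[a] for a in common):
                result.append({**l, **r})
                matched_right.add(i)
                found = True
        if not found and how in ("left", "full"):
            # Unmatched left tuple: pad the right-only attributes with null.
            result.append({a: l.get(a) for a in all_attrs})

    if how in ("right", "full"):
        for i, r in enumerate(right):
            if i not in matched_right:
                # Unmatched right tuple: pad the left-only attributes with null.
                result.append({a: r.get(a) for a in all_attrs})
    return result


inner = outer_join(loan, borrower, LOAN_ATTRS, BORROWER_ATTRS, "inner")
left = outer_join(loan, borrower, LOAN_ATTRS, BORROWER_ATTRS, "left")
right = outer_join(loan, borrower, LOAN_ATTRS, BORROWER_ATTRS, "right")
full = outer_join(loan, borrower, LOAN_ATTRS, BORROWER_ATTRS, "full")
```

The outer variants differ from the inner join only in which unmatched tuples are kept and padded with null.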

A two-level recursive query

Null Values

• It is possible for tuples to have a null value, denoted by null, for some of their attributes
• null signifies an unknown value or that a value does not exist.
• The result of any arithmetic expression involving null is null.
• Aggregate functions simply ignore null values
  • This is an arbitrary decision; null could have been returned as the result instead.
  • We follow the semantics of SQL in its handling of null values
• For duplicate elimination and grouping, null is treated like any other value, and two nulls are assumed to be the same
  • Alternative: assume each null is different from every other null
  • Both are arbitrary decisions, so we simply follow SQL
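
These three conventions can be illustrated with a short Python sketch. The helper names `add`, `sum_ignoring_nulls` and `distinct` are hypothetical, and `None` is used as a stand-in for null.

```python
def add(a, b):
    """Arithmetic on nulls: the result is null (None) if either operand is null."""
    if a is None or b is None:
        return None
    return a + b


def sum_ignoring_nulls(values):
    """Aggregate in the SQL style: null values are skipped.

    (SQL returns null when every input is null; this sketch returns 0.)
    """
    return sum(v for v in values if v is not None)


def distinct(values):
    """Duplicate elimination: two nulls are treated as the same value."""
    seen, result = set(), []
    for v in values:
        if v not in seen:
            seen.add(v)
            result.append(v)
    return result


amounts = [3000, None, 1700, None]
total = sum_ignoring_nulls(amounts)
unique = distinct(amounts)
```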

Null Values

• Comparisons with null values return the special truth value unknown
  • If false were used instead of unknown, then not (A < 5) would not be equivalent to A >= 5
• Three-valued logic using the truth value unknown:
  • OR: (unknown or true) = true, (unknown or false) = unknown, (unknown or unknown) = unknown
  • AND: (true and unknown) = unknown, (false and unknown) = false, (unknown and unknown) = unknown
  • NOT: (not unknown) = unknown
  • In SQL, "P is unknown" evaluates to true if predicate P evaluates to unknown
• The result of a select predicate is treated as false if it evaluates to unknown
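
The truth tables above are small enough to write out directly. The following Python sketch is an illustration only: `UNKNOWN`, `and3`, `or3`, `not3`, `less_than`, `at_least` and `select` are names invented here, and `None` plays the role of both null and unknown.

```python
UNKNOWN = None  # stand-in for the truth value "unknown"


def not3(p):
    return UNKNOWN if p is UNKNOWN else (not p)


def and3(p, q):
    if p is False or q is False:
        return False
    if p is UNKNOWN or q is UNKNOWN:
        return UNKNOWN
    return True


def or3(p, q):
    if p is True or q is True:
        return True
    if p is UNKNOWN or q is UNKNOWN:
        return UNKNOWN
    return False


def less_than(a, b):
    """A < b, returning unknown when either operand is null."""
    if a is None or b is None:
        return UNKNOWN
    return a < b


def at_least(a, b):
    """A >= b, returning unknown when either operand is null."""
    if a is None or b is None:
        return UNKNOWN
    return a >= b


def select(rows, predicate):
    """Selection keeps a tuple only when the predicate is true; unknown counts as false."""
    return [row for row in rows if predicate(row) is True]


rows = [{"A": 3}, {"A": None}, {"A": 7}]
small = select(rows, lambda r: less_than(r["A"], 5))
not_small = select(rows, lambda r: not3(less_than(r["A"], 5)))
large = select(rows, lambda r: at_least(r["A"], 5))
```

Because the comparison on the null row yields unknown, `not (A < 5)` and `A >= 5` select the same tuples, and the null row appears in neither result.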

Modification of the Database

• The content of the database may be modified using the following operations:
  • Deletion
  • Insertion
  • Updating
• All these operations are expressed using the assignment operator.

Deletion

• A delete request is expressed similarly to a query, except that instead of displaying tuples to the user, the selected tuples are removed from the database.
• Can delete only whole tuples; cannot delete values on only particular attributes
• A deletion is expressed in relational algebra by:
  $r \leftarrow r - E$
  where $r$ is a relation and $E$ is a relational algebra query.

Deletion Examples

• Delete all account records in the Perryridge branch.
  $\text{account} \leftarrow \text{account} - \sigma_{\text{branch-name} = \text{"Perryridge"}}(\text{account})$
• Delete all loan records with amount in the range of 0 to 50.
  $\text{loan} \leftarrow \text{loan} - \sigma_{\text{amount} \ge 0 \text{ and } \text{amount} \le 50}(\text{loan})$
• Delete all accounts at branches located in Needham.
  $r_1 \leftarrow \sigma_{\text{branch-city} = \text{"Needham"}}(\text{account} \bowtie \text{branch})$
  $r_2 \leftarrow \Pi_{\text{branch-name, account-number, balance}}(r_1)$
  $r_3 \leftarrow \Pi_{\text{customer-name, account-number}}(r_2 \bowtie \text{depositor})$
  $\text{account} \leftarrow \text{account} - r_2$
  $\text{depositor} \leftarrow \text{depositor} - r_3$
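
The Needham deletion can be traced step by step with Python sets standing in for relations. This is a sketch under assumptions: every row below is invented for illustration, and branch is reduced to the two attributes (branch-name, branch-city) that the example needs.

```python
# Relations as sets of tuples; all sample rows are hypothetical.
account = {  # (branch-name, account-number, balance)
    ("Perryridge", "A-102", 400),
    ("Roundhill", "A-305", 350),
    ("Brighton", "A-217", 750),
}
branch = {  # (branch-name, branch-city)
    ("Perryridge", "Horseneck"),
    ("Roundhill", "Needham"),
    ("Brighton", "Brooklyn"),
}
depositor = {  # (customer-name, account-number)
    ("Hayes", "A-102"),
    ("Turner", "A-305"),
    ("Jones", "A-217"),
}

# r1 <- select branch-city = "Needham" (account join branch)
r1 = {
    (bn, an, bal, city)
    for (bn, an, bal) in account
    for (bn2, city) in branch
    if bn == bn2 and city == "Needham"
}
# r2 <- project branch-name, account-number, balance (r1)
r2 = {(bn, an, bal) for (bn, an, bal, _city) in r1}
# r3 <- project customer-name, account-number (r2 join depositor)
r3 = {
    (cn, an)
    for (_bn, an, _bal) in r2
    for (cn, an2) in depositor
    if an == an2
}

# account <- account - r2 ; depositor <- depositor - r3
account = account - r2
depositor = depositor - r3
```

Each assignment replaces the whole relation with the result of a set difference, which mirrors the form r ← r − E.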

Insertion

• To insert data into a relation, we either:
  • specify a tuple to be inserted, or
  • write a query whose result is a set of tuples to be inserted
• In relational algebra, an insertion is expressed by:
  $r \leftarrow r \cup E$
  where $r$ is a relation and $E$ is a relational algebra expression.
• The insertion of a single tuple is expressed by letting $E$ be a constant relation containing one tuple.

Insertion Examples

• Insert information in the database specifying that Smith has \$1200 in account A-973 at the Perryridge branch.
  $\text{account} \leftarrow \text{account} \cup \{(\text{"Perryridge"}, \text{A-973}, 1200)\}$
  $\text{depositor} \leftarrow \text{depositor} \cup \{(\text{"Smith"}, \text{A-973})\}$
• Provide as a gift for all loan customers in the Perryridge branch a \$200 savings account. Let the loan number serve as the account number for the new savings account.
  $r_1 \leftarrow \sigma_{\text{branch-name} = \text{"Perryridge"}}(\text{borrower} \bowtie \text{loan})$
  $\text{account} \leftarrow \text{account} \cup \Pi_{\text{branch-name, loan-number, 200}}(r_1)$
  $\text{depositor} \leftarrow \text{depositor} \cup \Pi_{\text{customer-name, loan-number}}(r_1)$
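
Both insertion examples reduce to set union, as the following Python sketch shows. It is illustrative only: the starting rows are hypothetical, and relations are modelled as sets of tuples in the attribute order given in the comments.

```python
# Relations as sets of tuples; the starting rows are hypothetical.
loan = {  # (loan-number, branch-name, amount)
    ("L-170", "Downtown", 3000),
    ("L-260", "Perryridge", 1700),
}
borrower = {  # (customer-name, loan-number)
    ("Jones", "L-170"),
    ("Hayes", "L-260"),
}
account = {("Perryridge", "A-102", 400)}  # (branch-name, account-number, balance)
depositor = {("Hayes", "A-102")}          # (customer-name, account-number)

# Single-tuple insertion: E is a constant relation with one tuple.
account = account | {("Perryridge", "A-973", 1200)}
depositor = depositor | {("Smith", "A-973")}

# r1 <- select branch-name = "Perryridge" (borrower join loan)
r1 = {
    (cn, ln, bn, amt)
    for (cn, ln) in borrower
    for (ln2, bn, amt) in loan
    if ln == ln2 and bn == "Perryridge"
}
# account <- account U project branch-name, loan-number, 200 (r1)
account = account | {(bn, ln, 200) for (_cn, ln, bn, _amt) in r1}
# depositor <- depositor U project customer-name, loan-number (r1)
depositor = depositor | {(cn, ln) for (cn, ln, _bn, _amt) in r1}
```

The constant 200 in the projection list is what makes this a generalized projection: it supplies the balance of each new account.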

Updating

• A mechanism to change a value in a tuple without changing all values in the tuple
• Use the generalized projection operator to do this task:
  $r \leftarrow \Pi_{F_1, F_2, \ldots, F_l}(r)$
• Each $F_i$ is either
  • the $i$th attribute of $r$, if the $i$th attribute is not updated, or
  • an expression, involving only constants and the attributes of $r$, that gives the new value for the attribute, if the attribute is to be updated

Update Examples

• Make interest payments by increasing all balances by 5 percent.
  $\text{account} \leftarrow \Pi_{AN,\ BN,\ BAL * 1.05}(\text{account})$
  where AN, BN and BAL stand for account-number, branch-name and balance, respectively.
• Pay all accounts with balances over \$10,000 6 percent interest and pay all others 5 percent.
  $\text{account} \leftarrow \Pi_{AN,\ BN,\ BAL * 1.06}(\sigma_{BAL > 10000}(\text{account})) \cup \Pi_{AN,\ BN,\ BAL * 1.05}(\sigma_{BAL \le 10000}(\text{account}))$
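
The two-rate interest update can be sketched in Python as a union of two generalized projections. The function name `pay_interest` and the sample rows are assumptions for illustration; `Decimal` is used instead of floats so that the computed balances are exact.

```python
from decimal import Decimal

# (account-number, branch-name, balance); the rows are hypothetical.
account = {
    ("A-102", "Perryridge", Decimal("400")),
    ("A-305", "Roundhill", Decimal("20000")),
}


def pay_interest(rel):
    """Union of two generalized projections over disjoint selections."""
    # project AN, BN, BAL * 1.06 (select BAL > 10000 (account))
    high = {(an, bn, bal * Decimal("1.06")) for (an, bn, bal) in rel if bal > 10000}
    # project AN, BN, BAL * 1.05 (select BAL <= 10000 (account))
    low = {(an, bn, bal * Decimal("1.05")) for (an, bn, bal) in rel if bal <= 10000}
    return high | low


# account <- (updated relation)
account = pay_interest(account)
```

The two selections partition the relation, so every account receives exactly one of the two rates.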

Summary – operations of the relational algebra

| Operation | Purpose | Notation |
|---|---|---|
| SELECT | Selects all tuples that satisfy the selection condition from a relation R. | $\sigma_{\langle\text{selection condition}\rangle}(R)$ |
| PROJECT | Produces a new relation with only some of the attributes of R, and removes duplicate tuples. | $\pi_{\langle\text{attribute list}\rangle}(R)$ |
| THETA JOIN | Produces all combinations of tuples from R1 and R2 that satisfy the join condition. | $R_1 \bowtie_{\langle\text{join condition}\rangle} R_2$ |
| EQUIJOIN | Produces all the combinations of tuples from R1 and R2 that satisfy a join condition with only equality comparisons. | $R_1 \bowtie_{\langle\text{join condition}\rangle} R_2$, or $R_1 \bowtie_{(\langle\text{join attributes 1}\rangle),(\langle\text{join attributes 2}\rangle)} R_2$ |
| NATURAL JOIN | Same as equijoin except that the join attributes of R2 are not included in the resulting relation (note difference with [S]). | $R_1 *_{\langle\text{join condition}\rangle} R_2$, or $R_1 *_{(\langle\text{join attributes 1}\rangle),(\langle\text{join attributes 2}\rangle)} R_2$, or $R_1 * R_2$ |

Summary – operations of the relational algebra – cont.

| Operation | Purpose | Notation |
|---|---|---|
| UNION | Produces a relation that includes all the tuples in R1 or R2 or both R1 and R2; R1 and R2 must be union compatible. | $R_1 \cup R_2$ |
| INTERSECTION | Produces a relation that includes all the tuples in both R1 and R2; R1 and R2 must be union compatible. | $R_1 \cap R_2$ |
| DIFFERENCE | Produces a relation that includes all the tuples in R1 that are not in R2; R1 and R2 must be union compatible. | $R_1 - R_2$ |
| CARTESIAN PRODUCT | Produces a relation that has the attributes of R1 and R2 and includes as tuples all possible combinations of tuples from R1 and R2. | $R_1 \times R_2$ |
| DIVISION | Produces a relation R(X) that includes all tuples t[X] in R1(Z) that appear in R1 in combination with every tuple from R2(Y), where Z = X ∪ Y. | $R_1(Z) \div R_2(Y)$ |
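
Division is the least familiar operation in the table, so a small Python sketch may help. It is an illustration under assumptions: the function `divide`, the list-of-dicts representation, and the sample relations `depositor_branch` and `branches` are all hypothetical.

```python
def divide(r1, r2, x_attrs, y_attrs):
    """R1(Z) / R2(Y) with Z = X U Y: X-values paired in R1 with every tuple of R2."""
    required = {tuple(t[a] for a in y_attrs) for t in r2}
    seen = {}
    for t in r1:
        x = tuple(t[a] for a in x_attrs)
        seen.setdefault(x, set()).add(tuple(t[a] for a in y_attrs))
    # Keep an X-value only if its set of Y-values covers all of R2.
    return {x for x, ys in seen.items() if required <= ys}


# Hypothetical data: which customers have an account at every branch?
depositor_branch = [
    {"customer-name": "Jones", "branch-name": "Downtown"},
    {"customer-name": "Jones", "branch-name": "Redwood"},
    {"customer-name": "Smith", "branch-name": "Downtown"},
]
branches = [{"branch-name": "Downtown"}, {"branch-name": "Redwood"}]

everywhere = divide(depositor_branch, branches, ["customer-name"], ["branch-name"])
```

Smith is excluded because no tuple pairs Smith with Redwood, while Jones appears with every tuple of `branches`.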

Tuple Relational Calculus

• A nonprocedural query language, where each query is of the form
  $\{t \mid P(t)\}$
• It is the set of all tuples $t$ such that predicate $P$ is true for $t$
• $t$ is a tuple variable; $t[A]$ denotes the value of tuple $t$ on attribute $A$
• $t \in r$ denotes that tuple $t$ is in relation $r$
• $P$ is a formula similar to that of the predicate calculus

Predicate Calculus Formula

1. Set of attributes and constants
2. Set of comparison operators (e.g., $<, \le, =, \ne, >, \ge$)
3. Set of connectives: and ($\wedge$), or ($\vee$), not ($\neg$)
4. Implication ($\Rightarrow$): $x \Rightarrow y$ means that if $x$ is true, then $y$ is true
   $x \Rightarrow y \equiv \neg x \vee y$
5. Set of quantifiers:
   • $\exists\, t \in r\ (Q(t))$ ≡ "there exists" a tuple $t$ in relation $r$ such that predicate $Q(t)$ is true
   • $\forall\, t \in r\ (Q(t))$ ≡ $Q$ is true "for all" tuples $t$ in relation $r$
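
Over a finite relation, implication and the two quantifiers map directly onto Python's boolean operators and the built-ins `any` and `all`. The helper names `implies`, `exists` and `for_all` are invented for this sketch; the loan rows reuse the values from the outer join example.

```python
def implies(x, y):
    """x => y, defined as (not x) or y."""
    return (not x) or y


def exists(relation, q):
    """There exists a tuple t in relation such that q(t) is true."""
    return any(q(t) for t in relation)


def for_all(relation, q):
    """q(t) is true for all tuples t in relation."""
    return all(q(t) for t in relation)


loan = [
    {"loan-number": "L-170", "amount": 3000},
    {"loan-number": "L-230", "amount": 4000},
    {"loan-number": "L-260", "amount": 1700},
]

some_large = exists(loan, lambda t: t["amount"] > 3500)
all_large = for_all(loan, lambda t: t["amount"] > 3500)
all_above_1200 = for_all(loan, lambda t: t["amount"] > 1200)
```

The usual duality also holds here: "for all t, Q(t)" is the same as "there is no t for which Q(t) is false".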

A Valid TRC – my definition

$\{t_1.A,\ t_2.B,\ \ldots,\ t_n.Z \mid P(t_1, t_2, \ldots, t_n, t_{n+1}, \ldots, t_m)\}$

• $t_1.A, t_2.B, \ldots, t_n.Z$ define the output. Each of the tuple variables $t_1, \ldots, t_n$ must be defined over a single relation and must remain free in $P$, i.e., not bound by a quantifier.
• $t_{n+1}, \ldots, t_m$ are tuple variables that must be defined over relations and must be bound by a quantifier.
• Semantics: run the free variables over all tuples of their corresponding relations and, for each combination, check whether $P$ is true; if it is, output the defined output values.
• A variable is defined over a relation either as $t \in R$ or as $R(t)$; both syntaxes are acceptable and will be used.
• The value of a variable on an attribute may be written as $t.A$ or $t[A]$; both syntaxes are acceptable.
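
The stated semantics correspond closely to a Python set comprehension: free variables become the loop variables and bound variables sit inside `any` or `all`. As an illustration, assume the query {t.customer-name | t ∈ borrower ∧ ∃ s ∈ loan (s.loan-number = t.loan-number ∧ s.amount > 3500)}; this query is invented for the sketch, and the rows reuse the loan and borrower values from the outer join example.

```python
loan = [
    {"loan-number": "L-170", "branch-name": "Downtown", "amount": 3000},
    {"loan-number": "L-230", "branch-name": "Redwood", "amount": 4000},
    {"loan-number": "L-260", "branch-name": "Perryridge", "amount": 1700},
]
borrower = [
    {"customer-name": "Jones", "loan-number": "L-170"},
    {"customer-name": "Smith", "loan-number": "L-230"},
    {"customer-name": "Hayes", "loan-number": "L-155"},
]

# t is free and ranges over borrower; s is bound by "there exists" over loan.
names = {
    t["customer-name"]
    for t in borrower
    if any(
        s["loan-number"] == t["loan-number"] and s["amount"] > 3500
        for s in loan
    )
}
```

Only Smith's loan exceeds 3500, and Hayes has no matching loan tuple at all, so the existential test fails for both Jones and Hayes.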