The Relational Algebra The Relational Algebra Operations to
The Relational Algebra
The Relational Algebra – Operations to manipulate relations. – Used to specify retrieval requests (queries). – Query result is in the form of a relation. • Operations – – Unary relational operations Relational algebra operations Binary relational operations Additional relational operations • Examples of queries in relational algebra
The SELECT operation • SELECT operation s – Selects the tuples (rows) from a relation R that satisfy a certain selection condition c – Form of the operation: sc(R) – The condition c is an arbitrary Boolean expression on the attributes of R – Resulting relation has the same attributes as R – Resulting relation includes each tuple in r(R) whose attribute values satisfy the condition c
Selection Condition • A Boolean expression – Form: <attribute name><comparison op><constant value> <attribute name><comparison op><attribute name > – Comparison operators: =, <, , >, , – Constant value from the attribute domain – Can be recursively defined using operators AND, OR, and NOT • The SELECT operation is commutative.
The SELECT operation Examples: s DNO=4(EMPLOYEE) s SALARY>30000(EMPLOYEE) s((DNO=4 AND SALARY>25000) OR DNO=5)(EMPLOYEE)
The PROJECT Operation • PROJECT operation P – Keeps only certain attributes (columns) from a relation R specified in an attribute list L – Form of operation: PL(R) – Resulting relation has only those attributes of R specified in L – The PROJECT operation eliminates duplicate tuples in the resulting relation so that it remains a mathematical set (no duplicate elements)
The Project Operation Example: P FNAME, LNAME, SALARY(EMPLOYEE) Example: P SEX, SALARY(EMPLOYEE) – If several male employees have salary 30000, only a single tuple <M, 30000> is kept in the resulting relation. – Duplicate tuples are eliminated by the P operation.
Figure 6. 1 Results of SELECT and PROJECT operations.
Sequences of Operations and the RENAME Operation • Several operations can be combined to form a relational algebra expression (query) Example: Retrieve the names and salaries of employees who work in department 4: P FNAME, LNAME, SALARY(s DNO=4(EMPLOYEE)) • Alternatively, we specify explicit intermediate relations for each step: DEPT 4_EMPS s DNO=4(EMPLOYEE) R P FNAME, LNAME, SALARY(DEPT 4_EMPS)
Sequences of Operations and the RENAME Operation • Attributes can optionally be renamed in the resulting lefthand-side relation (this may be required for some operations that will be presented later): DEPT 4_EMPS s DNO=4(EMPLOYEE) R(FIRSTNAME, LASTNAME, SALARY) P FNAME, LNAME, SALARY(DEPT 4_EMPS)
Figure 6. 2 Results of relational algebra expressions.
Set Operations • Binary operations: – UNION: R S – INTERSECTION: R S – SET DIFFERENCE: R S – CARTESIAN PRODUCT: R S
Set Operations • For , , and , the operand relations R(A 1, A 2, . . . , An) and S(B 1, B 2, . . . , Bn) must have the same number of attributes, and the domains of corresponding attributes must be compatible; that is, dom(Ai) = dom(Bi) for i=1, 2, . . . , n. This condition is called union compatibility. • The resulting relation for , , or has the same attribute names as the first operand relation R (by convention). • UNION and INTERSECTION are commutative and associative. • SET DIFFERENCE is not commutative.
Figure 6. 3 Query result after the UNION operation:
Figure 6. 4 (a) Two union compatible relations. (b) STUDENT INSTRUCTOR. (c) STUDENT INSTRUCTOR. (d) STUDENT INSTRUCTOR. (e) INSTRUCTOR STUDENT.
Set Operations • CARTESIAN PRODUCT: Q(A 1, A 2, . . . , Am, B 1, B 2, . . . , Bn) R(A 1, A 2, . . . , Am) S(B 1, B 2, . . . , Bn) – A tuple t exists in R for each combination of tuples t 1 from R and t 2 from S such that: t[A 1, A 2, . . . , Am]=t 1 and t[B 1, B 2, . . . , Bn]=t 2 – If R has m tuples and S has n tuples, then Q will have m*n tuples. – CARTESIAN PRODUCT is a meaningless operation on its own. It can combine related tuples from two relations if followed by the appropriate SELECT operation.
Set Operations Example: Combine each DEPARTMENT tuple with the EMPLOYEE tuple of the manager. DEP_EMP DEPARTMENT EMPLOYEE DEPT_MANAGER s. MGRSSN=SSN(DEP_EMP)
Figure 6. 5 The CARTESIAN PRODUCT operation.
JOIN Operations • THETA JOIN: – Similar to a CARTESIAN PRODUCT followed by a SELECT. The condition c is called a join condition. R c S – c is of the form: <condition> [AND <condition>]* – <condition>: Ai Bj where is a comparator
JOIN Operations • EQUIJOIN: – The join condition c includes one or more equality comparisons involving attributes from R and S. That is, c is of the form: (Ai=Bj) AND. . . AND (Ah=Bk); 1<i, h<m, 1<j, k<n – Ai, . . . , Ah are called the join attributes of R – Bj, . . . , Bk are called the join attributes of S
JOIN Operations Example of using EQUIJOIN: • Retrieve each DEPARTMENT's name and its manager's name: – T DEPARTMENT MGRSSN=SSN EMPLOYEE – RESULT P DNAME, FNAME, LNAME(T)
JOIN Operations • NATURAL JOIN (*): – In an EQUIJOIN Q R c S, the join attribute of S appear redundantly in the result relation R. In a NATURAL JOIN, the redundant join attributes of S are eliminated from Q. The equality condition is implied and need not be specified. Q R *(join attributes of R), (join attributes of S) S
JOIN Operations Example: Retrieve each EMPLOYEE's name and the name of the DEPARTMENT he/she works for: T EMPLOYEE *(DNO), (DNUMBER) DEPARTMENT RESULT P FNAME, LNAME, DNAME(T)
JOIN Operations • If the join attributes have the same names in both relations, they need not be specified and we can write Q R * S. Example: Retrieve each EMPLOYEE's name and the name of his/her SUPERVISOR: – SUPERVISOR(SUPERSSN, SFN, SLN) P SSN, FNAME, LNAME(EMPLOYEE) – T EMPLOYEE * SUPERVISOR – RESULT P FNAME, LNAME, SFN, SLN(T)
Figure 6. 6 Illustrating the JOIN operation.
Figure 6. 7 An illustration of the NATURAL JOIN operation. (a) PROJ_DEPT PROJECT * DEPT. (b) DEPT_LOCS DEPARTMENT * DEPT_LOCATIONS.
JOIN Operations • Note: In the original definition of NATURAL JOIN, the join attributes were required to have the same names in both relations. • There can be a more than one set of join attributes with a different meaning between the same two relations.
JOIN Operations For example: JOIN ATTRIBUTES RELATIONSHIP EMPLOYEE. SSN= DEPARTMENT. MGRSSN EMPLOYEE manages the DEPARTMENT EMPLOYEE. DNO= DEPARTMENT. DNUMBER EMPLOYEE works for the DEPARTMENT
JOIN Operations For example: Retrieve each EMPLOYEE's name and the name of the DEPARTMENT he/she works for: T EMPLOYEE DNO=DNUMBER DEPARTMENT RESULT P FNAME, LNAME, DNAME(T)
JOIN Operations • A relation can have a set of join attributes to join it with itself: JOIN ATTRIBUTES RELATIONSHIP EMPLOYEE(1). SUPERSSN= EMPLOYEE(2) supervises EMPLOYEE(2). SSN EMPLOYEE(1) – One can think of this as joining two distinct copies of the relation, although only one relation actually exists – In this case, renaming can be useful
JOIN Operations Example: Retrieve each EMPLOYEE's name and the name of his/her SUPERVISOR: – SUPERVISOR(SSSN, SFN, SLN) P SSN, FNAME, LNAME(EMPLOYEE) – T EMPLOYEE SUPERSSN=SSSN SUPERVISOR – RESULT P FNAME, LNAME, SFN, SLN(T)
Complete Set of Relational Algebra Operations: • All the operations discussed so far can be described as a sequence of only the operations SELECT, PROJECT, UNION, SET DIFFERENCE, and CARTESIAN PRODUCT. • Hence, the set {s , P, , , } is called a complete set of relational algebra operations. Any query language equivalent to these operations is called relationally complete. • For database applications, additional operations are needed that were not part of the original relational algebra. These include: 1. Aggregate functions and grouping. 2. OUTER JOIN and OUTER UNION.
The DIVISION Operation • The DIVISION operation – As a sequence of operations T 1 P Y(R) T 2 P Y((S T 1) R) T T 1 T 2
Figure 6. 8 Illustrating the division operation. (a) Dividing SSN_PNOS by SMITH_PNOS. (b) T R S.
Additional Relational Operations • AGGREGATE FUNCTIONS – Functions such as SUM, COUNT, AVERAGE, MIN, MAX are often applied to sets of values or sets of tuples in database applications. – <grouping attributes> F<function list> (R) – The grouping attributes are optional.
Additional Relational Operations Example 1: Retrieve the average salary of all employees (no grouping needed): R(AVGSAL) F AVERAGE SALARY (EMPLOYEE) Example 2: For each department, retrieve the department number, the number of employees, and the average salary (in the department): R(DNO, NUMEMPS, AVGSAL) DNO F COUNT SSN, AVERAGE SALARY (EMPLOYEE) DNO is called the grouping attribute in the above example.
Figure 6. 9 An illustration of the AGGREGATE FUNCTION operation. (a)
OUTER JOIN • In a regular EQUIJOIN or NATURAL JOIN operation, tuples in R or S that do not have matching tuples in the other relation do not appear in the result. • Some queries require all tuples in R (or S or both) to appear in the result. • When no matching tuples are found, nulls are placed for the missing attributes
OUTER JOIN • LEFT OUTER JOIN: – R S – Every tuple in R appears in the result • RIGHT OUTER JOIN: – R S – Every tuple in S appears in the result • FULL OUTER JOIN: – R S – Every tuple in R or S appears in the result
Figure 6. 11 The LEFT OUTER JOIN operation.
Examples of Queries • Q 1: “Retrieve the name and address of all employees who work for the ‘Research’ department. ” RESEARCH_DEPT DNAME=‘RESEARCH’(DEPARTMENT) RESEARCH_EMPS (RESEARCH_DEPT DNUMBER=DNO EMPOLYEE RESULT FNAME, LNAME, ADDRESS(RESEARCH_EMPS)
Examples of Queries • Q 2: “For every project in ‘Stafford’, list the project number, the controlling department number, and the department manager’s last name, and address. ” STAFFORD_PROJS PLOCATION=‘STAFFORD’(PROJECT) CONTR_DEPT (STAFFORD_PROJS DNUM=DNUMBER DEPARTMENT) PROJ_DEPT_MGR (CONTR_DEPT MGRSSN=SSN EMPLOYEE) RESULT PNUMBER, DNUM, LNAME, ADDRESS(PROJ_DEPT_MGR)
Examples of Queries • Q 3: “List the names of managers who have at least one dependent. ” MGR(SSN) MGRSSN(DEPARTMENT) EMPS_WITH_DEPS(SSN) ESSN(DEPENDENT) MGRS_WITH_DEPS (MGRS EMPS_WITH_DEPS) RESULT LNAME, FNAME(MGRS_WITH_DEPS * EMPLOYEE)
Examples of Queries • Q 4: “Retrieve the names of employees who have no dependent. ” ALL_EMPS SSN(EMPLOYEE) EMPS_WITH_DEPS(SSN) ESSN(DEPENDENT) EMPS_WITHOUT_DEPS (ALL_EMPS EMPS_WITH_DEPS) RESULT LNAME, FNAME(EMPS_WITHOUT_DEPS * EMPLOYEE)
Examples of Queries • Q 5: “List the names of all employees with two or more dependents. ” T 1(SSN, NO_OF_DEPTS) ESSN F COUNT DEPENDENT_NAME(DEPENDENT) T 2 NO_OF_DEPS 2(T 1) RESULT LNAME, FNAME(T 2 * EMPLOYEE)
Examples of Queries • Q 6: “Find the name of employees who work on all the projects controlled by the department number 5. ” DEPT 5_PROJS(PNO) PNUMBER( DNUM=5(PROJECT)) EMP_PROJ(SSN, PNO) ESSN, PNO(WORKS_ON) RESULT_EMP_SSNS EMP_PROJ DEPT 5_PROJS RESLUT LNAME, FNAME(RESULT_EMP_SSNS * EMPLOYEE)
Examples of Queries • Q 7: “Make a list of project numbers for projects that involve an employee whose last name is ‘Smith’, either as a worker or as a manager of the department that controls the project. ” SMITHS(ESSN) SSN( LNAME=‘SMITH’(EMPLOYEE)) SMITH_WORKER_PROJ PNO(WORKS_ON * SMITHS) MGRS LNAME, DNUMBER(EMPLOYEE SSN=MGRSSN DEPARTMENT) SMITH_MANAGED_DEPTS(DNUM) DNUMBER( LNAME=‘SMITH’(MGRS)) SMITH_MGR_PROJS(PNO) PNUMBER(SMITH_MANAGED_DEPTS * PROJECT) RESULT (SMITH_WORKER_PROJS SMITH_MGR_PROJS)
- Slides: 47