CSC 453 Database Systems Lecture Tanu Malik College
CSC 453 Database Systems Lecture Tanu Malik College of CDM De. Paul University
Today • • Review‐Normalization‐BCNF Recursion PL/SQL
Prime/Non‐prime attributes • A BC, B C, D B • AB C, C D, CE A
• Consider the following relations R(Sname, Semail, Course, Instructor) Gwen gwen@depaul. edu CSC 453 Malik Gwen gwen@depaul. edu CSC 443 Riely Priya priya@gmail CSC 453 Malik Students(Semail, Sname) Courses(Course, Instructor) Taking(student. Email, Course)
R(Sname, Semail, Course, Instructor) Gwen gwen@depaul. edu CSC 453 Malik Gwen gwen@depaul. edu CSC 443 Riely Priya priya@gmail CSC 453 Malik Maya maya@cti. depaul CSC 443 Pitcher Students(SEmail, Sname) Courses(Course, Instructor) Taking(SEmail, Course)
Motivation for Normal Forms • Identify a “good” schema – For some definition of good – Avoid anomalies, redundancy • Many normal forms – 1 st – 2 nd – 3 rd – BCNF
First Normal Form (1 NF) • No multi‐valued attributes allowed • Counter‐example – Course(name, instructor, [student, email]*) – Redundancy in non‐list attributes 7
Removing Multi‐Valued Attributes 1. Remove the multi‐valued attribute from the relation 2. Create a new relation with the primary key of the original relation and the multi‐valued attribute 3. For each tuple in original relation with k values, add k tuples to the new relation 4. Primary key of new relation contains all attributes; primary key of original relation becomes foreign key in new relation referencing original relation 8
Second Normal Form (2 NF) • Second Normal Form: 1 NF, plus every non‐ prime attribute in the relation is determined by the entire primary key (but not by any subset) • To get 2 NF, eliminate partial dependencies on the primary key • X Y is a partial dependency if Z Y for some subset Z of X 9
Converting 1 NF to 2 NF • (Emp. ID, Emp. LName, Emp. FName, Dept, Proj. Code, Hours) • Functional Dependencies – Emp. ID Emp. LName, Emp. FName, Dept – Emp. ID, Proj. Code Hours 10
1 NF vs 2 NF Data Emp. ID Emp. LName Emp. FName Dept Proj. Code Hours 550 Smith Winston Accounting 101 20 550 Smith Winston Accounting 252 10 601 Smith Barney Finance 252 5 390 Hammond Evey Personnel 995 25 001 Preston Bill Special Events 995 5 100 Logan Ted Special Events 995 5 007 Bond James Personnel 252 20 505 Lane Lois Media Relations 101 10 505 Lane Lois Media Relations 252 10 505 Lane Lois Media Relations 995 10 11
Converting 1 NF to 2 NF • Decompose (Emp. ID, Emp. LName, Emp. FName, Dept, Proj. Code, Hours) • Remove the offending FD – Emp. ID Emp. LName, Emp. FName, Dept • Two tables – (Emp. ID, Emp. LName, Emp. FName, Dept) – (Emp. ID, Proj. Code, Hours) 12
Towards 2 NF Emp. ID Emp. LName Emp. FName Dept Emp. ID Proj. Code Hours 550 Smith Winston Accounting 550 101 20 550 Smith Winston Accounting 550 252 10 601 Smith Barney Finance 601 252 5 390 Hammond Evey Personnel 390 995 25 001 Preston Bill Special Events 001 995 5 100 995 5 007 252 20 505 101 10 505 252 10 505 995 10 100 Logan Ted Special Events 007 Bond James Personnel 505 Lane Lois Media Relations 13
Towards 2 NF Emp. ID Emp. LName Emp. FName Dept Emp. ID Proj. Code Hours 550 Smith Winston Accounting 550 101 20 601 Smith Barney Finance 550 252 10 390 Hammond Evey Personnel 601 252 5 001 Preston Bill Special Events 390 995 25 001 995 5 100 995 5 007 252 20 505 101 10 505 252 10 505 995 10 100 Logan Ted Special Events 007 Bond James Personnel 505 Lane Lois Media Relations Emp. ID Emp. LName, Emp. FName, Dept Emp. ID, Proj. Code Hours 14
• Counterexample • Movies(title, year, studio, studio. Address, salary) • FD: title, year ‐> studio; studio ‐> studio. Address; star‐>salary
Removing Partial Dependencies • Find all dependencies where a subset of the primary key determines some non‐prime attribute(s) Starting with the smallest subset, do the following: 1. Remove all attributes on the right‐hand side from the relation and put them in a new relation 2. Add the attributes in the determinant(l. h. s) to the new relation; make them the primary key, and make them a foreign key in the original relation referencing the new table 3. Remove from any remaining partial dependencies any attributes removed from the original relation 16
Third Normal Form (3 NF) • Third Normal Form: 2 NF, plus every non‐ prime attribute in the relation is determined only by the primary key of the relation. • To get 3 NF, eliminate transitive dependencies on the primary key • X Y is a transitive dependency if X Z and Z Y for some Z that is disjoint from X 17
• Counterexample • Movies(title, year, studio, studio. Address, salary) • FD: title, year ‐> studio; studio ‐> studio. Address; star‐>salary
Converting 2 NF to 3 NF • Schema: (First , Last, Address, City, State, Zip) • First, Last Address, City, State, Zip • Transitive functional dependency: – Zip City, State 19
2 NF Data Table First Last Address City State Zip. Code Henry Bienen 2145 Sheridan Rd Evanston IL 60202 Helmut Epp 55 E. Jackson St Chicago IL 60604 Denis Stein 55 E. Jackson St. Chicago IL 60604 David Miller 243 S. Wabash Av. Chicago IL 60604 Gary Miller 5000 Forbes Av. PA 15213 Gary Leighton 77 Beacon St Cambridge MA 02139 Pittsburgh 20
Decompose the Tables First Last Address Zip. Code Henry Bienen 2145 Sheridan Rd 60202 Helmut Epp 55 E. Jackson St 60604 55 E. Jackson St. 60604 Denis Stein David Miller 243 S. Wabash Av. 60604 Gary Miller 5000 Forbes Av. 15213 Gary Leighton 77 Beacon St City State Zip. Code Evanston IL 60202 Chicago IL 60604 Pittsburgh PA 15213 Cambridge MA 02139 21
Decompose the Tables First Last Address Zip. Code Henry Bienen 2145 Sheridan Rd 60202 Helmut Epp 55 E. Jackson St 60604 55 E. Jackson St. 60604 Denis Stein David Miller 243 S. Wabash Av. 60604 Gary Miller 5000 Forbes Av. 15213 Gary Leighton 77 Beacon St City State Zip. Code Evanston IL 60202 Chicago IL 60604 Pittsburgh PA 15213 Cambridge MA 02139 Zip. Code City, State First, Last, Address, Zip. Code 02139 22
3 NF Decomposition • Input: A universal relation R and a set of functional dependencies F on R • Output: A decomposition D of R into 3 NF schemas with dependency preservation and nonadditive join 23
3 NF‐Normalization Algorithm (3 NF Normalization): Input: Relation R with FDs F c Output: 3 NF decomposition D of R 1. D = {} 2. For every X Y in F add sub‐relation Q =(XY) to D, unless a) some sub‐relation in D already contains all of XY: don’t add Q b) some sub‐relation(S) in D is contained in XY: replace S with Q(XY) 4. If no relation in D contains a key of R, then add new relation Q(X) on some key X of R
Example • R = (A, B, C, D) A→ C D BA→ C The candiate key is ? 3 NF decomposition is ?
3 NF Decomposition • R(A, B, C, D) – FDs: A B, C D • R(A, B, C, D, E) • FDs: C E, B C • R(A, B, C, D, E, F) • F = { AB CD, D A, C EF} 26
3 NF Decomposition • R={A, B, C, D, E, F} • F = { AB CD, C EF, D A } • Non‐prime determines a prime attribute 27
3 NF‐Checking Order • • • Is F in minimal cover? What is the candidate key? Which FDs violate 3 NF? Decompose to 3 NF. Is the decomposition lossless? Is it dependency preserving?
The End Result • A collection of relations, each in 3 NF – Each relation has a primary key • (We are assuming that there is only one candidate key…) – Every non‐prime attribute in a relation is determined by its entire primary key – No non‐prime attribute in a relation is determined by any attributes other than its entire primary key • Information can reconstructed using joins, and stored in views if desired 29
Boyce‐Codd Normal Form • Boyce‐Codd Normal Form (BCNF): For every non‐trivial functional dependency X A, it must be the case that X is a superkey – “Every determinant must contain a candidate key” – X must be a superkey even if A is a prime attribute 30
BCNF example Pizza |Topping Type |Topping ‐‐‐‐‐‐‐ |‐‐‐‐‐‐‐‐‐‐‐‐‐ 1 | cheese | mozzarella 1 | meat | pepperoni 1 | vegetable| olives 2 | meat | sausage 2 | cheese | mozzarella 2 | vegetable| peppers Each pizza must have exactly one of each topping type • One type of cheese • One type of meat • One type of vegetable
BCNF example Pizza |Topping Type |Topping ‐‐‐‐‐‐‐ |‐‐‐‐‐‐‐‐‐‐‐‐‐ 1 | cheese | mozzarella 1 | meat | pepperoni 1 | vegetable| olives 2 | meat | feta 2 | cheese | chicken 2 | vegetable| peppers Pizza can have exactly 3 types of topping • One type of cheese • One type of meat • One type of vegetable
Decompose Pizza |Topping | ‐‐‐‐‐‐‐ |‐‐‐‐‐‐‐| 1 |mozzarella| 1 |pepperoni | 1 |olives | 2 |mozzarella | 2 |sausage | 2 |peppers | Topping |Topping Type ‐‐‐‐‐‐‐|‐‐‐‐‐‐‐ mozzarella|cheese pepperoni |meat olives |vegetable mozzarella | cheese sausage |meat peppers |vegetable
BCNF Decomposition • Input: A universal relation R and a set of functional dependencies F on R • Output: A decomposition D of R into BCNF schemas with nonadditive join – Algorithm on next page – Algorithm does not guarantee dependency preservation 34
BCNF • Decomposition Algorithm ALGORITHM BCNF (R: Relation, F: FD set) BEGIN 1. D {R} 3. While some X → Y holds in some Ri(A 1, …, An) in D and (X → Y) is not trivial, X is not a superkey of Ri Ri 1 X+ ∩({A 1, …, An}) Ri 2 X �({A 1, …, An} - X+ ) Result – {Ri} �{Ri 1, Ri 2} 4. Return result END 35
BCNF Example: R = (A, B, C) F = {A → B, B → C} Is R in BCNF? A: Consider the nontrivial dependencies in F: 1. A → B, 2. B → C, A → R (A is a key) B → A (B is not a key) Therefore, R not in BCNF 36
BCNF Example: R = R 1 �R 2 R 1 = (A, B); R 2 = (B, C) F = {A → B, B → C} Are R 1, R 2 in BCNF? A: 1. Test R 1: A → B covered, A → R 1 (all other FD’s covered trivial) 2. Test R 2: B → C covered, B → R 2 (all other FD’s covered trivial) R 1, R 2 in BCNF Q: Is the decomposition lossless? 37
BCNF R = (A, B, C, D, E, H) F = {A → BC, E → HA} Decompose R into BCNF: 38
BCNF Decomposition R = (A, B, C, D, E, H) F = {A → BC, E → HA} (Note: Fc = F) Decomposition #1: R = R 1 �R 3 �R 4 R = (A, B, C, D, E, H) Decompose on A → BC R 1 = (A, B, C) Q: Is this DP? A: R 2 = (A, D, E, H) Decompose on E → HA R 3 = (A, E, H) R 4 = (D, E) Yes. All Fc covered by R 1, R 3, R 4. Therefore F+ covered 39
BCNF Decomposition • R(A, B, C, D, E, F) • F = { AB CD, C EF, D A } 40
BCNF Decomposition • R (S, P, Q, X, Y, N, C) • F = { S NC, P XY, SP Q , Q P } • Decompose to BCNF • Is it dependency preserving? 41
Properties of Decompositions • When we work with BCNF, we must look at properties involving multiple relations: – Nonadditive (Lossless) Join: No tuples that are not in the original relation (spurious tuples) are generated when decomposed relations are joined – Dependency Preservation: Every functional dependency in the original relation is represented somewhere in the decomposition 42
BCNF vs. 3 NF • Every relation in BCNF is in 3 NF • Not every relation in 3 NF is in BCNF – 3 NF relations that are not in BCNF fail because some prime attribute is determined by something that is not a superkey – this is allowed by 3 NF but not by BCNF • Decomposing tables into BCNF can be tricky – functional dependencies can be lost! 43
Remarks on Algorithms • Different runs may yield different results, depending on the order in which attributes and functional dependencies are considered • We must know all functional dependencies • We can’t always guarantee dependency preservation for BCNF, but we can generate a 3 NF decomposition and then consider the individual relations in the result 44
PL/SQL • A general‐purpose procedural programming that includes SQL commands • PL/SQL can – create and issue SQL statements – store and process the results of queries – define procedures to respond to database events 45
Basic Structure of Code • Simplest form is an anonymous block: declare begin ‐‐ variable and subprogram declarations ‐‐ every statement must end with a ; ‐‐ PL/SQL statements to execute ‐‐every statement must end with a ; ‐‐statements can be nested with another B/E exception ‐‐ exception handling (optional) end; 46
Output • To display output: RAISE NOTICE ‘string %’, arguments; • Output buffer displayed in DBMS Output tab – Use View Dbms Output and ‘+’ to open tab • Single line comments with – • Multi‐line with /* */ 47
Data Types • • • Numeric Character Boolean Datetime Data types are not case sensitive DECLARE num 1 INTEGER; num 2 REAL; num 3 DOUBLE PRECISION; BEGIN null; END; /
Declaring Variables • All variables must be declared: var. Name [CONSTANT] data. Type [NOT NULL] [: = initial. Value]; • Assignments use : =, and PL/SQL has typical arithmetic operations 49
Scoping DECLARE ‐‐ Global variables num 1 number : = 95; num 2 number : = 85; BEGIN dbms_output. put_line('Outer Variable num 1: ' || num 1); dbms_output. put_line('Outer Variable num 2: ' || num 2); DECLARE ‐‐ Local variables num 1 number : = 195; num 2 number : = 185; BEGIN dbms_output. put_line('Inner Variable num 1: ' || num 1); dbms_output. put_line('Inner Variable num 2: ' || num 2); END; /
Declaring Variables • Only one variable can be declared per line, but variable types can be given in terms of the domain of another variable or attribute: var. Name other. Var%type; var. Name TABLE. Attribute%type; 51
Operators • Arithmetic operators • Relational operators • Comparison operators – LIKE, BETWEEN, IS NULL • Logical operators • String operators
Branching • if‐then: if condition then …’true’ statements… end if; • if‐else: if condition then …’true’ statements… else …’false’ statements… end if; 53
Branching • if‐elsif: if condition 1 then … ‘true’ statements… elsif condition 2 then … ‘false-true’ statements… elsif condition 3 then … ‘false-true’ statements… (… as many times as needed…) else … ‘all false’ statements… end if; 54
Case Statement CASE [ expression ] WHEN condition_1 THEN result_1 WHEN condition_2 THEN result_2. . . WHEN condition_n THEN result_n ELSE result END
Case Statement • expression – Optional. It is the value that you are comparing to the list of conditions. (ie: condition_1, condition_2, . . . condition_n) • condition_1, condition_2, . . . condition_n – The conditions that must all be the same datatype. The conditions are evaluated in the order listed. Once a condition is found to be true, the CASE statement will return the result and not evaluate the conditions any further. • result_1, result_2, . . . result_n – Results that must all be the same datatype. This is the value returned once a condition is found to be true.
Loops • General loop: loop …loop body… end loop; • Repeats until exit; is executed in loop body • While loop: while condition loop …loop body… end loop; • Repeats until condition is false 57
Loops • For loop: for variable in [reverse] lower. . upper loop …loop body… end loop; • Can only increment/decrement by one • lower always appears before upper in header 58
Incorporating SQL Queries • Result of a query can be stored in a set of variables by adding INTO clause to query: SELECT list of attributes INTO list of variables FROM list of tables … • Variable types must match attribute types 59
Procedures (In Oracle) CREATE [OR REPLACE] PROCEDURE name IN [OUT] param. Type …) AS …declarations… BEGIN …body of procedure… END; / (param. Name • ‘IN’ parameters are passed by value, for input only, read‐only parameters • ‘OUT’ parameters are passed by reference • ‘IN OUT’ parameters are passed by reference, to return results to the calling sub‐program 60
Functions CREATE [OR REPLACE] FUNCTION mode {IN|OUT|INOUT} name …) RETURNS return. Type AS $$ …declarations… BEGIN …body of function… return. Value; END; $$ language plpgsql; • ‘IN’ parameters are default • Specify return type and return value instead 61
Executing Procedures and Functions • A standalone procedure – Using the EXECUTE keyword – Calling the name of the procedure from a PL/SQL block • A standalone function – Calling the name of the function from a PL/SQL block – Calling the name of the function in a SQL query
Cursors • A cursor represents a pointer into a set of records returned by a query declare name cursor for query; • cursor name can be used to iterate through the records returned by query 63
Cursor Commands/Expressions • open name; ‐‐ initializes to beginning of set • fetch name into variable. List; ‐‐ reads the next record into the variables • close name; ‐‐ closes the cursor 64
Parameterized Cursors • Can supply a parameter in cursor declaration and query declare name (parameter in type) cursor for query; • Each time cursor is opened, value of parameter is specified in parentheses to complete the query 65
Implicit Cursors for DML statements • Whenever a DML statement (INSERT, UPDATE and DELETE) is issued, an implicit cursor is associated with this statement. • INSERT operations: the cursor holds the data that needs to be inserted. • UPDATE and DELETE operations: the cursor identifies the rows that would be affected.
Records • Data structure to hold data items of different kinds • Table‐based Records: – Can create a record with same structure as the row of a table (fields are table attributes): record. Name TABLE%rowtype; – Can select a row of a table directly into a record, and access individual fields with record. Name. Attribute 67
Records • Cursor‐based records: Assign rowtype from a query in cursor • User‐defined Records: – Declare a new data type and a table of records: create type new. Type ( attr 1 datatype, attr 2 datatype )
Exceptions DECLARE <declarations section> BEGIN <executable command(s)> EXCEPTION <exception handling goes here > WHEN exception 1 THEN exception 1‐handling‐statements WHEN exception 2 THEN exception 2‐handling‐statements WHEN exception 3 THEN exception 3‐handling‐statements. . . . WHEN others THEN exception 3‐handling‐statements END;
Case Statement SELECT Last. Name, First. Name, (CASE Career WHEN 'UGRD' THEN 'Undergraduate' WHEN 'GRD' THEN 'Graduate' WHEN ' SAL' THEN 'Student At Large' END) AS Career FROM student; UPDATE employee SET salary = (CASE WHEN salary < 50000 THEN 50000 WHEN salary < 100000 THEN salary * 1. 05 ELSE salary * 1. 1 END);
- Slides: 70