Functional Dependency Normalization 12142021 DB FD Normalization Ahmed
Functional Dependency & Normalization 12/14/2021 DB: FD - Normalization Ahmed Ejaz 1
Objectives n n n n n 12/14/2021 Introduction to FD + FD Examples Armstrong’s Axioms of FDs + Introduction to Normalization+ Objectives of Normalization + Normal Forms + The Process of Normalization + Un Normalized Form (UNF) + Converting UNF to Normalized Form + Summary + DB: FD - Normalization Ejaz Ahmed 2
- Introduction to FD n n A functional dependency is a constraint between two sets of attributes in a relational database. Functional dependencies are specified by the database programmer based on the intended meaning of the attributes. If X and Y are two sets of attributes in the same relation T, then X Y means that X functionally determines Y so that the values of the attributes in X uniquely determine the values of the attributes in Y means for any two tuples t 1 and t 2 in T, t 1[X] ≠ t 2[X] implies that t 1[Y] = t 2[Y] n for any two tuples t 1 and t 2 in T, t 1[X] = t 2[X] implies that t 1[Y] = t 2[Y] Means any two tuples with same Y value for different (or same) Xvalue. Otherwise, FD does not exists. n 12/14/2021 DB: FD - Normalization Ejaz Ahmed 3
-- How FD cannot be expressed using SK? Superkey (SK): A subset K (attribute) of R is a superkey of r if, in any legal relation r(R), for all tuples pairs t 1, t 2 such that t 1≠ t 2 implies t 1[K] ≠ t 2[K]. Means no any two tuples in any legal relation r(R) may have same value attribute set K (K R) Note: r(R) = r(a, b, c, …); R can be {a, b} to represent unique Examples: borrow(branch-name, loan#, customer-name, amount, city) Loan# amount Same loan is given to both husband wife. Therefore, loan# cannot be a superkey. Customer-name city Different customer lived in the same city 12/14/2021 DB: FD - Normalization Ejaz Ahmed 4
-- FD Example Given a relation r: A C (FD exists) because a 2 – c 2 a 3 – c 3 C A (FD Not exists) AB D (FD Exists, any one satisfied means FD exists). Then what about the following: AB A, AB B, BC D 12/14/2021 DB: FD - Normalization Ejaz Ahmed 5
-- FD & Keys n n Key constraint is a special kind of functional dependency Key is on LHS, all attributes are on RHS n n 12/14/2021 SSN, Name, Address For a key, no two rows share the same values, thus by default, when ever a tuple agrees on LHS it agrees on the RHS. DB: FD - Normalization Ejaz Ahmed 6
- Armstrong’s Axioms of FDs n Reflexivity: If Y X then X Y (trivial FD) n n Augmentation: If X Y then X Z YZ n n Name, Address Name If Town Zip then Town, Name Zip, Name Transitivity: If X Y and Y Z then X Z NOTE: Book reading requires 12/14/2021 DB: FD - Normalization Ejaz Ahmed 7
-- Other Derived Rules n Union: If X Y and X Z, then X YZ n n n X YX (augment), YX YZ (augment) thus. X YZ (transitive) Decomposition: If X YZ, then X Y and X Z n YZ Y (reflexive), thus X Y (transitive) n Pseudotransitivity: n Accumulation rule: n If X YZ and Z W, then X YZW n Proof: from Z W, augment with YZ to get n n 12/14/2021 If X Y and WY Z, then XW Z If X YZ and Z W, then X YZW YZZ YZW or YZ YZW By transitivity, we get X YZW DB: FD - Normalization Ejaz Ahmed 8
- Introduction to Normalization n n 12/14/2021 A process to validate and improve logical design so that it satisfies certain constraints that avoid unnecessary duplication of the data. Normalization is a formal technique for analyzing a relation based on its primary key and the FDs between the attributes of that relation. The normalization process was first proposed by Codd 1972. The normalization process takes a relation schema through a series of tests to certify whether it satisfies a certain normal form. DB: FD - Normalization Ejaz Ahmed 9
- Objectives of Normalization n n 12/14/2021 Solve problems associated with redundant data. Identify various types of update anomalies such as insertion, deletion, and modification anomalies. Recognize the appropriateness or quality of the design of relations. Use FDs to group attributes into relations that are in a known normal form. DB: FD - Normalization Ejaz Ahmed 10
- Normal Forms n The four most commonly used normal forms are: n n First Normal Form (1 NF) n Second Normal Form (2 NF) n Third Normal Form (3 NF) n Boyce-Codd Normal Form (BCNF) To convert un-normalized table to a normalized one, you first convert it to INF, then to 2 NF, then to 3 NF and then to BCNF. In other words: UNF 1 NF 2 NF 3 NF BCNF 12/14/2021 DB: FD - Normalization Ejaz Ahmed 11
-- Relationship Between Normal Forms 12/14/2021 DB: FD - Normalization Ejaz Ahmed 12
- The Process of Normalization n n 12/14/2021 Formal technique for analyzing a relation based on its primary key and the functional dependencies between the attributes of that relation. Often executed as a series of steps. Each step corresponds to a specific normal form, which has known properties. As normalization proceeds, the relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies. DB: FD - Normalization Ejaz Ahmed 13
- Unnormalized Form (UNF) n n n A table would be in unnormalized form if it contains some multivalued attributes or repeating groups. To create an unnormalized table Transform the data from the information source (e. g. form) into table format with columns and rows. Example: Dname Dnumber DMGRSSN Dlocations Research 5 222 Dammam, Khobar, Dhahran Administration 4 333 Jubail Headquarters 1 444 Khobar DEPARTMENT 12/14/2021 DB: FD - Normalization Ejaz Ahmed 14
- Converting From UNF to Normalized Form n n n 12/14/2021 1 NF + UNF to 1 NF + Example: UNF to 1 NF + 2 NF + 1 NF to 2 NF + Example: 1 NF to 2 NF + 3 NF + 2 NF to 3 NF + Example: 2 NF to 3 NF + BCNF + 3 NF to BCNF + Example: 3 NF to BCNF + DB: FD - Normalization Ejaz Ahmed 15
-- First Normal Form (1 NF) n 12/14/2021 A relation in which the intersection of each row and column contains one and only one value. DB: FD - Normalization Ejaz Ahmed 16
-- UNF to 1 NF n n n Nominate an attribute or group of attributes to act as the key for the unnormalized table. Identify the repeating group(s) in the unnormalized table which repeats for the key attribute(s). Remove the repeating group by n n n 12/14/2021 Entering appropriate data into the empty columns of rows containing the repeating data (‘flattening’ the table). Or by Placing the repeating data along with a copy of the original key attribute(s) into a separate relation. DB: FD - Normalization Ejaz Ahmed 17
-- Example: UNF to 1 NF UNF Dname Dnumber DMGRSSN Dlocations Research 5 222 Dammam, Khobar, Dhahran Administration 4 333 Jubail 1 NF 12/14/2021 Dname Dnumber DMGRSSN Dlocations Research 5 222 Dammam Research 5 222 Khobar Research 5 222 Dhahran Administration 4 333 Jubail DB: FD - Normalization Ejaz Ahmed 18
-- Second Normal Form (2 NF) n n 12/14/2021 Based on the concept of full functional dependency. Full functional dependency indicates that if n A and B are attributes of a relation, n B is fully dependent on A if B is functionally dependent on A but not on any proper subset of A. Example: In the last table {SSN, Pnumber} Hours is a full dependency. (neither SSN Hours nor Pnumber hours hold. A relation that is in 1 NF and every non-primary-key attribute is fully functionally dependent on the primary key. DB: FD - Normalization Ejaz Ahmed 19
-- 1 NF to 2 NF n n If a primary key is a single attribute then the relation is already in 2 NF. In the case of composite key, the following steps are applied to convert to 2 NF: n Identify the primary key for the 1 NF n Identify the functional dependencies in the relation n 12/14/2021 If partial dependency exists on the primary key remove them by placing them in a new relation along with a copy of their determinant. DB: FD - Normalization Ejaz Ahmed 20
--- Example: 1 NF to 2 NF FD 1 1 NF SSN Pnumber Hours Ename Plocation FD 2 Get Rid of FD 1 SSN Ename SSN Pnumber Hours Pname Plocation FD 2 2 NF 12/14/2021 SSN Ename SSN Pnumber Hours DB: FD - Normalization Get Rid of FD 2 Pnumber Pname Plocation Ejaz Ahmed 21
-- Third Normal Form (3 NF) n 3 NF is based on the concept of transitive dependency. n Transitive Dependency is a condition where n n 12/14/2021 A, B and C are attributes of a relation such that if A B and B C, then C is transitively dependent on A through B. (Provided that A is not functionally dependent on B or C). DB: FD - Normalization Ejaz Ahmed 22
-- 2 NF to 3 NF n Identify the primary key in the 2 NF relation. n Identify functional dependencies in the relation. n 12/14/2021 If transitive dependencies exist on the primary key remove them by placing them in a new relation along with a copy of their dominant. DB: FD - Normalization Ejaz Ahmed 23
-- Example: 2 NF to 3 NF EMP_DEPT Ename SSN Bdate Address Dnumber Dname DMGRSSN Transitive FD Get Rid of Transitive FD Ename SSN Bdate Address Dnumber Dname DMGRSSN DEPARTMENT EMPLOYEE 3 NF 12/14/2021 DB: FD - Normalization Ejaz Ahmed 24
-- Boyce-Codd Normal Form (BCNF) n n Based on functional dependencies that takes into account all candidate keys in a relation. For a relation with only one candidate key, 3 NF and BCNF are equivalent. A relation is in BCNF, if and only if every determinant is a candidate key. Violation of BCNF may occur in a relation that n n 12/14/2021 contains two (or more) composite keys which overlap and share at least one attribute in common. DB: FD - Normalization Ejaz Ahmed 25
-- 3 NF to BCNF n Identify all candidate keys in the relation. n Identify all functional dependencies in the relation. n 12/14/2021 If functional dependencies exists in the relation where their determinants are not candidate keys for the relation, remove the functional dependencies by placing them in a new relation along with a copy of their determinant. DB: FD - Normalization Ejaz Ahmed 26
-- Example: 3 NF to BCNF A B C D E BCNF Normalization D 12/14/2021 B A DB: FD - Normalization C Ejaz Ahmed D E 27
-- BCNF Examples n Person (SSN, Name, Address, Hobby) n n Has. Account (Account. Number, Client. Id, Office. Id) n 12/14/2021 The FD SSN Name, Address does not satisfy requirements of BCNF (since the key is (SSN, Hobby)) The FD Acct. Num Office. Id does not satisfy BCNF requirements (since keys are (Client. Id, Office. Id) and (Acct. Num, Client. Id)) DB: FD - Normalization Ejaz Ahmed 28
-- Redundancy n Suppose R has a FD A B. If an instance has 2 rows with same value in A, they must also have same value in B (=> redundancy) SSN Name, Address SSN Name Address Hobby 1111 Joe 123 Main stamps 1111 Joe 123 Main coins n If A is a superkey, there cannot be two rows with same value of A n 12/14/2021 Hence, BCNF eliminates redundancy DB: FD - Normalization Ejaz Ahmed 29
-- Another Example of BCNF According to the following set of attributes, one employee may have more than one license to drive vehicle (like for car, truck and trailer etc. ). How to apply BCNF normalization form to satisfy “if an organization requires that employee’s work permit must have any one license only” Empno, Licenseno, Ename, Address, City, WPermitno BCNF indicates WPermitno Licenseno (Although it does not hold FD) Can be managed to satisfy FD as WPermitno Licenseno Relations will be Empno, Ename, Address, City, WPermitno, Licenseno (organization must required one work permit for an employee who must have one license only) Note: It can be possible a person who has work permit (which is always one) can have more than one license for vehicles. 12/14/2021 DB: FD - Normalization Ejaz Ahmed 30
- Summary Steps in Normalization 12/14/2021 DB: FD - Normalization Ejaz Ahmed 32
- Exercises Given a relation (Prod. ID, PName, Sell. Price, Supplier. ID, SName, Buy. Price, Buy. Qty, Supplier. Locations) n Apply Normalizations and give suitable solution for Supplier. Locations of 1 NF in 3 NF. How can you describe an issue of M-N relationship-mapping for this scenario? n Apply normalization steps for the following schema? Student (Student_id, Full_name, Address, Semester_id, Course_id, Section_no) Semester (Semester_id, Course_name, Semester_start. Date, Semester_end. Date) Course (Course_id, Course_name, Section_no, grade) 12/14/2021 DB: FD - Normalization Ejaz Ahmed 33
- Slides: 32