Theory Practice Methodology of Relational Database Design and
Theory, Practice & Methodology of Relational Database Design and Programming Copyright © Ellis Cohen 2002 -2008 Basic Normal Forms These slides are licensed under a Creative Commons Attribution-Non. Commercial-Share. Alike 2. 5 License. For more information on how you may use them, please see http: //www. openlineconsult. com/db © Ellis Cohen 2001 -2008
Overview of Lecture Normal Forms 1 st Normal Form 2 nd Normal Form Conceptual 2 NF Normalization PURE 3 rd Normal Form MIXED 3 rd Normal Form Factoring & MIXED 3 rd Normal Form Normalization with Nested Determinants Normalization with Overlapping Determinants © Ellis Cohen 2001 -2008 2
Normal Forms – 1 NF, 2 NF, 3 NF, BCNF, 4 NF, 5 NF(PJNF) – Property of a table From 2 NF on – Each one defines a new kind of non-redundancy requirement – Each one requires the previous one To deal with a relation that fails to satisfy a particular normal form: – the relation is decomposed in a particular way © Ellis Cohen 2001 -2008 3
1 st Normal Form © Ellis Cohen 2001 -2008 4
1 NF: 1 st Normal Form All fields in a relation must be atomic Emps empno … kids 40694 … { Sam } 15129 … 30276 … { Willlam, Norman } 81107 … { Josh, Jake } 60019 … 1 NF is not about Redundancy, but about ensuring that entity classes / relations are in a form that we can easily normalize (A form of normalization can actually be defined for relations with non-atomic field, but it is more complicated) © Ellis Cohen 2001 -2008 5
Relational 1 NF Normalization Normalize by 1) Remove multivalued attribute from relation 2) Create new relation with composite key: single-valued version of attribute + original relation's primary key Empsold Empsnew empno ename *kids empno ename empno childnam Employees empno What's the corresponding conceptual normalization? Children ename Children empno childnam 40694 … 40694 Sam 15129 … 30276 Willlam 30276 … 30276 Norman 81107 … 81107 Josh 60019 … 81107 Jake © Ellis Cohen 2001 -2008 6
Conceptual 1 NF Normalization Normalize By 1) Remove multivalued attribute from entity class 2) Add a weak entity class dependent upon the original class with the single-valued version of the attribute Employee empno ename *kids Multivalued has Employee Child empno ename © Ellis Cohen 2001 -2008 childnam 7
2 nd Normal Form © Ellis Cohen 2001 -2008 8
2 NF: 2 nd Normal Form No partial key dependencies: The determinants of a non-prime attribute CANNOT consist of only part of a candidate key Entries( invid, linenum, customer, prodid ) invid customer Entries invid linenum customer prodid 30492 17 JONES 6171 30493 1 CHEN 4059 30493 2 CHEN 6171 30493 3 CHEN 2184 30494 1 SONI 3792 prime attributes non-prime attributes © Ellis Cohen 2001 -2008 9
2 NF Violation Diagram: Partial Key Violation Determinant(s) invid Candidate key linenum prime attributes customer non-prime attributes © Ellis Cohen 2001 -2008 10
Comparing NF Diagrams Determinant(s) invid Candidate key linenum prime attributes 2 NF Violation customer prodid In 2 NF non-prime attributes © Ellis Cohen 2001 -2008 11
Relational 2 NF Normalization Normalize by 1) Create a new relation with the determinant and everything dependent upon it 2) Remove everything dependent upon the determinant from the original relation 3) Add foreign key constraint, possibly w cascading delete Entriesold Entriesnew invid linenum customer prodid invid linenum prodid Invoices invid customer Entries invid customer invid linenum prodid Invoices 30492 17 6171 invid 30493 1 4059 30492 JONES 30493 2 6171 30493 CHEN 30493 3 2184 30494 SONI 30494 1 3792 © Ellis Cohen 2001 -2008 customer 12
Corresponding Conceptual Model Entries Invoices invid linenum prodid invid customer Entry Invoice linenum prodid © Ellis Cohen 2001 -2008 invid customer 13
Normalization Exercise Original Table: Dog( ssno, dogname, dogdob, oaddr, ophone ) – ssno soc sec# of dog's owner – dogname of dog – dogdob date of birth of dog – oaddr of the owner – ophone # of the owner 1) 2) 3) 4) What are the candidate keys? What are the prime attributes? What FD's are determined by a candidate key? What FD's are not determined by a candidate key? © Ellis Cohen 2001 -2008 14
Normalization Answer: FD's Candidate Keys ssno + dogname Prime Attributes ssno, dogname Candidate Key FD's ssno + dogname dogdob Non Candidate Key FD's ssno oaddr, ophone Normalize! © Ellis Cohen 2001 -2008 15
Normalization Answer: Design Dogs Owners ssno dogname dogdob ssno oaddr ophone What's the corresponding conceptual model? © Ellis Cohen 2001 -2008 16
Normalized Conceptual Model Dogs Owners ssno dogname dogdob ssno oaddr ophone Dog Owner dogname dogdob © Ellis Cohen 2001 -2008 ssno oaddr ophone 17
Conceptual 2 NF Normalization © Ellis Cohen 2001 -2008 18
Conceptual 2 NF Normalization Partial key violations can be seen and fixed during Conceptual Modeling. Given the following entity class and FD, how should the entity class be decomposed? invid customer Entry invid linenum customer prodid © Ellis Cohen 2001 -2008 19
Conceptual 2 NF Normalization Normalize By 1) Move determinant and everything dependent on it to a new entity class. 2) The original class becomes a weak entity class, with the remaining parts of the key as the discriminator invid customer Entry What's the corresponding relational normalization? invid linenum customer prodid invid + linenum prodid Entry invid customer Invoice invid customer linenum prodid © Ellis Cohen 2001 -2008 20
Relational 2 NF Normalization Normalize by 1) Create a new relation with the determinant and everything dependent upon it 2) Remove everything dependent upon the determinant from the original relation 3) Add foreign key constraint, possibly w cascading delete Entriesold invid linenum customer prodid Entriesnew invid linenum prodid Invoices invid customer invid customer © Ellis Cohen 2001 -2008 21
Another Normalization Exercise In many cases, the entire normalization can be done using the conceptual model Original Entity Class Assign( empno, ename, pno, pname, taskid, tname, startdate, hrs. Per. Week ) Minimal Non-Candidate Key FD's empno + pno startdate empno ename Start by doing 2 NF pno pname conceptual normalization taskid tname based on this FD © Ellis Cohen 2001 -2008 22
Initial Normalization Step empno + taskid hrs. Per. Week Assign owned by empno + pno startdate Proj. Assign taskid tname hrs. Per. Week empno ename pno pname startdate Now normalize Assign based on taskid tname This is also 2 NF, because taskid is a discriminator, and only part of the primary key © Ellis Cohen 2001 -2008 23
Normalization by Task taskid tname empno + taskid hrs. Per. Week Task Assign taskid tname empno + pno startdate Proj. Assign hr. Per. Week empno ename pno pname startdate Now normalize based on empno ename © Ellis Cohen 2001 -2008 24
Normalization by Employee taskid tname Task taskid tname empno + taskid hrs. Per. Week Assign Employee empno ename Proj. Assign hr. Per. Week pno pname startdate Now normalize based on pno pname © Ellis Cohen 2001 -2008 25
Complete Normalization taskid tname Task taskid tname empno + taskid hrs. Per. Week Assign Employee empno ename Proj. Assign hr. Per. Week startdate Project pno pname © Ellis Cohen 2001 -2008 26
PURE 3 rd Normal Form © Ellis Cohen 2001 -2008 27
2 NF & 3 NF Normalizations • The decomposition rules are – Create a new relation/class with the determinant and everything dependent upon it – Relational: Remove everything dependent upon the determinant from the original relation – Conceptual: Remove the determinant and everything dependent upon it from the original class • Each normal form variant results in a different design pattern – Relational fingerprints vary wrt resulting primary and foreign keys – Conceptual models vary wrt primary keys, discriminators, and kinds of relationships © Ellis Cohen 2001 -2008 28
3 NF: 3 rd Normal Form A table is in 3 NF if every non-prime attribute is minimally determined by a candidate key If not, we have – 2 NF Violation a non-prime attribute is determined by only part of a candidate key – 3 NF-ONLY Violations the determinant of a non-prime attribute includes some other non-prime attribute © Ellis Cohen 2001 -2008 29
2 NF/Pure 3 NF Violations B A X 2 NF Partial Key Violation 3 NF Transitivity Violation Candidate Key A B C X Pure prime attributes non-prime attributes © Ellis Cohen 2001 -2008 30
Pure 3 NF Violation Example Emps empno ename address deptno dname empno prime attributes deptno dname non-prime attributes © Ellis Cohen 2001 -2008 31
Relational Pure 3 NF Normalization Normalize by 1) Create a new relation with the determinant and everything dependent upon it 2) Remove everything dependent upon the determinant from the original relation 3) Add foreign key constraint, possibly w cascading delete deptno dname Empsold empno ename address deptno dname Empsnew empno ename address deptno © Ellis Cohen 2001 -2008 Depts deptno dname 32
Corresponding Conceptual Model Emps Depts empno ename address deptno dname Employee Dept empno ename address © Ellis Cohen 2001 -2008 deptno dname 33
Another Pure 3 NF Example Video( vidid, title, year, custid, studio ) title + year studio Videos vidid title year custid studio 61809 Mouth 1988 S 611 Universal 30512 Zanzibar 1931 S 611 Paramount 49177 Follow Me 2001 S 611 Universal 21534 Mouth 1988 P 309 Universal 30857 Zanzibar 1993 F 247 Disney prime attributes non-prime attributes © Ellis Cohen 2001 -2008 34
Corresponding Conceptual Model Videos vidid title year custid Films title year studio Video Film title year studio vidid custid © Ellis Cohen 2001 -2008 35
Conceptual Pure 3 NF Normalization Normalize by 1) Move determinant and everything dependent on it to a new entity class. 2) The original entity class becomes the child entity class in a 1: M relationship with the new class Note how FD's become unique or key constraints Video vidid custid vidid title year custid studio title + year studio Child entity class Parent entity class Video Film vidid custid vidid title, year title year studio © Ellis Cohen 2001 -2008 What about Video's lifetime? 36
Deciding Lifetime Dependency is a copy of a Video Film title year studio vidid custid Should we treat Video as a strong entity class, or as a dependent entity class (i. e. with a lifetime dependency constraint) Normalization doesn't tell us what to do here. This is an independent design issue. is a copy of a Video Film title year studio vidid custid © Ellis Cohen 2001 -2008 37
MIXED 3 rd Normal Form © Ellis Cohen 2001 -2008 38
2 NF/3 NF Violation Diagrams B A X 2 NF Partial Key Violation 3 NF Transitivity Violation A C B A B prime attributes C X Pure X Mixed non-prime attributes © Ellis Cohen 2001 -2008 39
MIXED 3 NF Violation Example Emps divnam empno ename deptno dname divnam + deptno dname deptno dname MIXED empno prime attributes non-prime attributes © Ellis Cohen 2001 -2008 40
3 NF: 3 rd Normal Form (MIXED) Mixed Transitivity Violation: The determinants are a mixture of prime and non-prime attributes Emp( divnam, empno, ename, deptno, dname ) divnam + deptno dname Emps divnam empno ename deptno dname Widgets 7499 ALLEN 30 SALES Widgets 7654 MARTIN 30 SALES Widgets 6922 FUDGE 10 ACCOUNTING Novatone 7698 BLAKE 30 RESEARCH Novatone 7499 SHAKUR 10 ACCOUNTING Novatone 7844 TURNER 50 SUPPORT prime attributes non-prime attributes © Ellis Cohen 2001 -2008 41
Relational Mixed 3 NF Normalization Normalize by 1) Create a new relation with the determinant and everything dependent upon it 2) Remove everything dependent upon the determinant from the original relation 3) Add foreign key constraint, possibly w cascading delete divnam + deptno dname Empsold divnam empno ename deptno dname Empsnew divnam empno ename deptno Depts divnam deptno dname What's the corresponding conceptual model? © Ellis Cohen 2001 -2008 42
Potential Weak Entity Class Model Emps divnam empno ename deptno Depts divnam deptno dname Does this conceptual model correspond? Employee Dept divnam deptno dname empno ename © Ellis Cohen 2001 -2008 43
Weak Entity Class Model Dependency Employee Dept divnam deptno dname empno ename This model implies that an employee is identified relative to to a Dept Emps But the result of normalization identifies an employee relative to a Division divnam empno ename deptno © Ellis Cohen 2001 -2008 44
Relational Mapping Employee Dept divnam deptno dname empno ename deptno is also part of the primary key Depts Emps divnam empno ename deptno divnam deptno dname Can we remove deptno as part of the primary key? © Ellis Cohen 2001 -2008 45
Shrinking Primary Key via Constraints Emps divnam empno ename deptno Depts divnam deptno dname If we can ensure that divnam + empno is a candidate key then deptno does not need to be part of Emps' primary key. So, we can use the ER model below so long as we add the conceptual state constraint: Within any given division, employee numbers are unique. This means that an Employee is uniquely identified by divnam + empno, so deptno is not needed in the primary key Employee Dept divnam deptno dname empno ename © Ellis Cohen 2001 -2008 46
Conceptual Mixed 3 NF Normalization Normalize by 1) Move determinant and everything dependent on it to a new entity class. 2) The original class becomes a weak entity class, with the remaining parts of the key as the discriminator Note how FD's become unique or key constraints Employee divnam + empno ename divnam empno ename deptno dname Employee empno ename divnam + deptno dname Dept divnam + empno deptno divnam deptno dname © Ellis Cohen 2001 -2008 + A dept with employees can't be deleted 47
Factoring & MIXED 3 rd Normal Form © Ellis Cohen 2001 -2008 48
Potential 1: M Model Emps divnam empno ename deptno Depts divnam deptno dname Does this conceptual model correspond? Employee Dept divnam deptno dname divnam empno ename © Ellis Cohen 2001 -2008 49
Actual Relational Mapping Employee Dept divnam deptno dname divnam empno ename Nothing ensures that an employee is in the same division as the employee's department! Emps divnam empno ename ddivnam deptno Depts divnam deptno dname Can we fix this by adding a state constraint? © Ellis Cohen 2001 -2008 50
Constrained Relational Mapping Employee Dept divnam deptno dname divnam empno ename + Conceptual State Constraint: an employee is in the same division as the employee's department Emps divnam empno ename deptno Depts divnam deptno dname © Ellis Cohen 2001 -2008 51
Adding a Division Class Because divnam shows up in both Employee & Dept, it suggests that it should be represented by its own Division Class Division divnam Employee Dept divnam deptno dname divnam empno ename Are there any problems with this model? © Ellis Cohen 2001 -2008 52
Induced Entity Attributes Division divnam Employee Dept divnam deptno dname divnam empno ename The divnam attributes in Employee and Dept are both now illegal entity attributes. © Ellis Cohen 2001 -2008 53
Using Weak Entity Classes Division divnam Employee works for Dept deptno dname empno ename + Conceptual State Constraint: an employee is in the same division as the employee's department This constraint can be depicted directly on the ER diagram! © Ellis Cohen 2001 -2008 54
Using Factored Relationships Crow Magnum Division divnam Employee empno ename works for (by Division) Dept deptno dname Factored Relationship: An employee of a given division works for a department in the same division © Ellis Cohen 2001 -2008 55
Containment Model in UML Division UML PK Employee* DK empno ename * divnam 0. . 1 Dept* DK deptno dname Not quite standard due to DK & because UML doesn't allow attributes in containment diagrams Everything contained in the outer box holds for a single division at a time © Ellis Cohen 2001 -2008 56
Crow Magnum Containment A division has many (*) employees Underlining indicates an identifying relationship with Division has* Employee empno ename Crow Magnum Division divnam works for An employee is identified wrt a division has* Dept deptno dname Everything inside the outer box holds for a single division at a time An employee of a given division works for a department in the same division © Ellis Cohen 2001 -2008 57
Normalization with Nested Determinants © Ellis Cohen 2001 -2008 58
Conceptual Normalization can be used initially at the conceptual level to improve designs. However, when there are multiple related composite functional dependencies, relational normalization may still be required (though can sometimes be avoided by doing conceptual normalizations in the right order) © Ellis Cohen 2001 -2008 59
Nested Determinants Video( vidid, acqdate, empno, ename, renttime, custid, cphone ) Candidate Keys: vidid Minimal Non Key FD's custid + renttime Nested Determinants empno ename start with these custid cphone custid + renttime empno Do Conceptual Normalization © Ellis Cohen 2001 -2008 60
Split out Customer & Employee empno ename custid cphone Simple 3 NF Transitivity Violations Customer No further conceptual normalization is possible custid cphone Video Employee vidid acqdate renttime Draw the Relational Schema empno ename © Ellis Cohen 2001 -2008 61
Split out Customer & Employee Customer custid cphone Video Customers custid cphone Videos vidid acqdate custid renttime empno Employee vidid acqdate renttime empno ename Employees empno ename custid + renttime empno is a 3 NF violation only in the Relational Model. Resolve it using Relational Normalization © Ellis Cohen 2001 -2008 62
Relational Splicing custid + rentime empno Pure 3 NF Violation Customers custid cphone Rentals custid renttime empno Videos vidid acqdate custid renttime empno Employees empno ename Normalization tells us how to decompose, but not how to connect (w FK constraints). In this case, we had to splice Rentals in between Customers & Employees © Ellis Cohen 2001 -2008 63
Alternate Decomposition Video( vidid, acqdate, empno, ename, renttime, custid, cphone ) Candidate Keys: vidid Minimal Non Key FD's custid + renttime Nested Determinants start with empno ename this custid cphone custid + renttime empno Do Conceptual Normalization © Ellis Cohen 2001 -2008 64
Split out Rental Video custid renttime empno cphone ename vidid acqdate Now decompose using empno ename custid cphone © Ellis Cohen 2001 -2008 65
Complete Conceptual Normalization Splitting Rental out first allows a complete Conceptual Normalization Rental custid renttime empno cphone ename Customer custid cphone Rental Employee vidid acqdate Now decompose using empno ename custid cphone via 2 NF Video renttime Video vidid acqdate via Pure 3 NF empno ename © Ellis Cohen 2001 -2008 Method: Decompose first using FD's with largest # of determinant fields e. g. custid + renttime empno 66
Normalization with Overlapping Determinants © Ellis Cohen 2001 -2008 67
Overlapping Determinants Assign ( empno, task, startdate, payrate ) Candidate Keys: empno + task Minimal Non Key FDs: empno + pno startdate empno + task payrate empno + task Overlapping Determinants Do Conceptual Normalization starting with either FD © Ellis Cohen 2001 -2008 68
Incomplete Conceptual Normalization empno + pno startdate Proj. Assign empno startdate task payrate Starting with either FD resolves a 2 NF violation. Either result is in Conceptual Normal Form, but does not result in a complete normalization empno + task payrate Task. Assign empno task payrate pno startdate Pick either one, map to a Relational Schema, and complete the relational normalization © Ellis Cohen 2001 -2008 69
Complete Relational Normalization Proj. Assigns empno startdate Assigns empno task Task. Assigns empno task payrate Reverse Engineer the corresponding Conceptual Diagram Hint: Look at Conceptual Normalization of Mixed 3 NF Violations © Ellis Cohen 2001 -2008 70
Factored Relational Model Proj. Assigns empno startdate Assigns Task. Assigns empno task payrate Factor out empno What's left is an M: N relationship between Proj. Assigns & Task. Assigns © Ellis Cohen 2001 -2008 71
Factored Normalization Answer Employee empno has* Proj. Assign corresponds to has* Task. Assign pno startdate task payrate Everything inside the outer box holds for a single employee at a time For any given employee, there is a many-many relationship between their projects and their tasks © Ellis Cohen 2001 -2008 72
Using Factored Relationships Employee empno Proj. Assign pno startdate corresponds to (by Employee) Task. Assign task payrate For any given employee, there is a many-many relationship between their projects and their tasks © Ellis Cohen 2001 -2008 73
Normalization Methodology 1. Based on your knowledge, expertise, wisdom & prescience, design the best conceptual model you can 2. Do conceptual normalization. • If an entity class has two FDs with nested determinants, resolve the FD with the larger determinant first • If determinants overlap, conceptual normalization will not completely normalize a design 3. Map to a relational schema 4. Do relational normalization © Ellis Cohen 2001 -2008 74
- Slides: 74