Database Design Part 2 Normalization Chapter 9 1
Database Design Part 2 - Normalization Chapter 9 1
Outline – Part 2 v. Poor Data Design Anomalies Example: − Creating Normalized Tables for the Courses Completed Report v. Normalization Summarizing Normalization − 1 NF − Functional Dependency − 2 NF − 3 NF Practice Problems − Do this outside of class for practice 2
Poor Data Design This is why we need to learn Normalization! Appointment Table Appointment 12/2/2020 Appt. Type: Appt No 1 2 3 4 5 6 7 8 9 10 Appt Date 12/1/2020 12/2/2020 12/2/2020 12/3/2020 Appt Planned Appt Time Duration Type 3: 00 AM 1. 00 Physical 3: 00 AM 0. 25 Shot 3: 15 AM 0. 50 Flu 10: 00 AM 0. 50 Migraine 10: 15 AM 0. 25 Shot 10: 30 AM 0. 25 Shot 10: 45 AM 0. 50 Flu 11: 00 AM 1. 00 Physical 10: 30 AM 1. 00 Physical 9: 00 AM 0. 50 Migraine Patient First ID Nm 466927 Lisa 456789 Sue 194756 Brandon 329657 Marcus 987453 Mike 384788 Tonya 438754 Iliana 345875 Carla 466927 Lisa 345875 Carla Last Nm Garcia Carey Pierre Schwartz Jones Johnson Hnatt Basich Garcia Basich Phone 562 -3456 432 -1234 432 -7877 239 -5502 456 -0202 432 -8806 823 -4303 857 -5566 562 -3456 857 -5566 Doctor ID C 678 A 528 S 626 A 528 G 123 S 626 C 678 A 528 C 678 Doctor Nm Chapman Lopez Smith Lopez Gray Smith Chapman Lopez Chapman Carla called to inquire about her appointment time on 12/2/2020. She also gives us her new phone number: 777 -1234 Update Anomaly Mike called to cancel his appointment. Deletion Anomaly We need to add our new doctor (Dr. Jones) to the table. Insertion Anomaly 3
Normalization 4
Normalization v(def) the process of converting complex data structures into simple, stable data structures. Every non-primary key attribute depends upon the whole primary key and nothing but the primary key. Purpose: Create well-structured relations/tables Why? Data Redundancy = Data Quality & Integrity Result: where we can insert, modify, and delete the rows without errors or inconsistencies. Every fact in only one location! 5
Normalization 1 NF Every field has Atomic Values. No multi-valued attributes! 2 NF The table is in 1 NF, and … No Partial Dependencies! 3 NF The table is in 2 NF, and … No Transitive Dependencies! every non-key attribute is functionally dependent on the entire primary key each non-key attribute is not functionally dependent on another non-key attribute 6
First Normal Form (1 NF) A table is in 1 NF if every field has atomic values. i. e. the table has no multi-valued attributes. Original Table: Employee (Emp. ID, Last. Name, First. Name, Title, Dependents) Employee Table Bad Solution: Emp. ID Last. Name First. Name Title Dependents 111 Smith Robert Accountant Bobbie, Sue 222 Jones Leo Programmer 333 Lopez Trent Sales Rep Trent Jr, Trevor, Sue Employee (Emp. ID, Last. Name, First. Name, Title, Dep 1, Dep 2, Dep 3) Employee Table Emp. ID Last. Name First. Name Title Dep 1 Dep 2 111 Smith Robert Accountant Bobbie Sue 222 Jones Leo Programmer 333 Lopez Trent Sales Rep Trent Jr Trevor Dep 3 Creating a “Repeating field” is a poor design decision! Sue 7
First Normal Form (1 NF) Original Table: Employee (Emp. ID, Last. Name, First. Name, Title, Dependents) Employee Table Emp. ID Last. Name First. Name Title Dependents 111 Smith Robert Accountant Bobbie, Sue 222 Jones Leo Programmer 333 Lopez Trent Sales Rep Trent Jr, Trevor, Sue Solution: 1 NF Employee (Emp. ID, Last. Name, First. Name, Title) Employee Table Emp. Dep (Emp. ID, Dep. Name) Emp. Dep Table Emp. ID Last. Name First. Name Title Emp. ID Dep. Name 111 Smith Robert Accountant 111 Bobbie 222 Jones Leo Programmer 111 Sue 333 Lopez Trent Sales Rep 333 Trent Jr 333 Trevor 333 Sue 8
Functional Dependency X Y (def) Field Y is functionally dependent on Field X if for each value of X, there is only 1 corresponding value of Y. “Field X functionally determines Field Y” i. e. If we know the value of X, we can obtain the value of Y. Order. ID Order. Date There is exactly one Order. Date for any given instance of an order. • However, the opposite may not be true. 9
Functional Dependency X Y (def) Field Y is functionally dependent on Field X if for each value of X, there is only 1 corresponding value of Y. Last. Name Student. ID First. Name Birth. Date Student. ID Hometown Last. Name First. Name Birth. Date Functional Dependency involves a one-to-one relationship between the values of fields 10
Second Normal Form (2 NF) A table is in 2 NF if each non-key attribute is functionally dependent on the entire primary key. Original Table: 1 NF ________ (Catalog. ID, Product. ID, Price, Cat. Issue. Date) Catalog. ID Product. ID Price 10 1352 $ 20. 00 10 1353 $ 25. 00 10 1354 $ 30. 00 10 1355 $ 35. 00 10 1356 $ 40. 00 11 1352 $ 20. 00 11 1353 $ 27. 50 11 1354 $ 33. 00 11 1355 $ 38. 50 11 1356 $ 44. 00 Analyze it! Cat. Issue. Date 1/7/2019 1/7/2019 1/6/2020 1/6/2020 • What is the Primary Key? • What is the table name? Catalog. ID, Product. ID _________ Catalog. ID _________ Product. ID _________ 11
Second Normal Form (2 NF) Solution: 2 NF Catalog. Product (Catalog. ID, Product. ID, Price) Catalog. Product Catalog. ID Product. ID 10 1352 10 1353 10 1354 10 1355 10 1356 11 1352 11 1353 11 1354 11 1355 11 1356 Catalog (Catalog. ID, Cat. Issue. Date) $ $ $ $ $ Price 20. 00 25. 00 30. 00 35. 00 40. 00 27. 50 33. 00 38. 50 44. 00 Catalog. ID Cat. Issue. Date 9 1/1/2018 10 1/7/2019 11 1/6/2020 12 1/4/2021 12
Third Normal Form (3 NF) A table is in 3 NF if no non-key attribute is functionally dependent on any other non-key attribute Original Table: 2 NF ________ (Acct. No, Address, State, Zip. Code) Acct. No 1111 2222 3333 4444 5555 Analyze it! Address 123 Pine St 456 Bagby 987 John Rd 876 Ave K 543 8 th St State Zip. Code TX 76712 NM 87123 UT 84697 NM 87123 • What is the Primary Key? • What is the table name? Address _________ State _________ Zip. Code _________ 13
Third Normal Form (3 NF) Solution: 3 NF Account (Acct. No, Address, Zip. Code) Account Acct. No 1111 2222 3333 4444 5555 Zip (Zip. Code, State) Zip Address 123 Pine St 456 Bagby 987 John Rd 876 Ave K 543 8 th St Zip. Code 76712 87123 84697 87123 Zip. Code State 76712 TX 84697 UT 87123 NM 14
Creating Normalized Tables for the Courses Completed Report 15
Transform this report into 1 NF Courses Completed Report Emp. ID Name 100 John Dept. Code MKTG Date Completed Dept Phone Ext. Salary Course 42000 SPSS 6/19/2018 5325 Surveys 11/3/2019 5325 140 Sue ACCT 41000 Tax Acc 12/1/2019 4422 110 Bob INFO 70000 Java 3/21/2018 7373 10/23/2020 7373 2/1/2018 4477 C# 190 Alex FINA 65000 Investmts. 150 Jennifer INFO 49000 Java 3/11/2017 7373 Oracle 5/19/2018 7373 Notice in this report that Employees can take multiple courses. 16
Now, it’s in 1 NF: Each attribute is atomic! Solution: 1 NF ______ (Emp. ID, Name, Dept. Code, Salary, Course, Date Completed, Dept. Phone. Ext) Emp. ID Name Dept. Code Salary Course Date Completed Dept Phone Ext. 100 John MKTG 42000 SPSS 6/19/2018 5325 100 John MKTG 42000 Surveys 11/3/2019 5325 140 Sue ACCT 41000 Tax Acc 12/1/2019 4422 110 Bob INFO 70000 Java 3/21/2018 7373 110 Bob INFO 70000 C# 10/23/2020 7373 190 Alex FINA 65000 Investmts. 2/1/2018 4477 150 Jennifer INFO 49000 Java 3/11/2017 7373 150 Jennifer INFO 49000 Oracle 5/19/2018 7373 • What is the Primary Key? _________ • What is the table’s name? _________ 17
Transform this table from 1 NF to 2 NF v Ensure that all non-key attributes are functionally dependent on the entire primary key! – i. e. No Partial Dependencies Original Table: 1 NF Emp. Course (Emp. ID, Course, Name, Dept. Code, Salary, Date Completed, Dept. Phone. Ext) Emp. ID Analyze it! Course Name Dept. Code Salary Date Completed Dept Phone Ext. 100 SPSS John MKTG 42000 6/19/2018 5325 100 Surveys John MKTG 42000 11/3/2019 5325 140 Tax Acc Sue ACCT 41000 12/1/2019 4422 110 Java Bob INFO 70000 3/21/2018 7373 110 C# Bob INFO 70000 10/23/2020 7373 190 Investmts. Alex FINA 65000 2/1/2018 4477 150 Java Jennifer INFO 49000 3/11/2017 7373 150 Oracle Jennifer INFO 49000 5/19/2018 7373 Emp. ID, Course Emp. ID Course 18
Now, both are in 2 NF: each non-key attribute is functionally dependent on the entire primary key. Solution: 1 NF Emp. Course( Emp. ID, Course, Date. Completed) Emp. ID Course Date Completed 100 SPSS 6/19/2018 100 Surveys 11/3/2019 140 Tax Acc 12/1/2019 110 Java 3/21/2018 110 C# 190 Investmts. 150 Java 3/11/2017 150 Oracle 5/19/2018 10/23/2020 2/1/2018 Employee (Emp. ID, Name, Dept. Code, Salary, Dept. Phone. Ext) Emp. ID Name Dept. Code Salary Dept Phone Ext. 100 John MKTG 42000 5325 140 Sue ACCT 41000 4422 110 Bob INFO 70000 7373 190 Alex FINA 65000 4477 150 Jennifer INFO 49000 7373 Tip: _______ 19
Transform these tables from 2 NF to 3 NF � Ensure that no non-key attribute is functionally dependent on any other non-key attribute - i. e. no transitive dependencies Employee Emp. Course Emp. ID Course Date Completed 100 SPSS 6/19/2018 100 Surveys 11/3/2019 140 Tax Acc 12/1/2019 110 Java 3/21/2018 110 C# 190 Investmts. 150 Java 3/11/2017 150 Oracle 5/19/2018 10/23/2020 2/1/2018 Tip: _______ Emp. ID Name Dept. Code Salary Dept Phone Ext. 100 John MKTG 42000 5325 140 Sue ACCT 41000 4422 110 Bob INFO 70000 7373 190 Alex FINA 65000 4477 150 Jennifer INFO 49000 7373 Name ______ Dept. Code ______ Salary ______ Dept. Ph. Ext ______ 20
Now, these are in 3 NF: no non-key attribute is functionally dependent on any other non-key attribute Solution: 3 NF Employee(Emp. ID, Name, Dept. Code, Salary) Emp. ID Name Dept. Code Emp. Course(Emp. ID, Course, Date Completed Salary 100 John MKTG 42000 140 Sue ACCT 41000 110 Bob INFO 70000 190 Alex FINA 65000 150 Jennifer INFO 49000 Department(Dept. Code, Dept. Phone. Ext) Dept. Code Emp. ID Course Date Completed 100 SPSS 6/19/2018 100 Surveys 11/3/2019 140 Tax Acc 12/1/2019 110 Java 3/21/2018 110 C# 190 Investmts. 150 Java 3/11/2017 150 Oracle 5/19/2018 10/23/2020 2/1/2018 Dept. Phone Ext. MKTG 5325 ACCT 4422 INFO 7373 FINA 4477 21
Using Normalization we learned… v that to represent the data on this report in a relational database, we will need to create 3 separate tables. Emp. ID 100 140 110 Name John Dept. Code MKTG yee o l p INFO m EBob Sue ACCT Salary 42000 41000 70000 190 Alex FINA 65000 150 Jennifer INFO 49000 Emp. ID Dept. Code Dept Phone Ext. MKTG 5325 ACCT 4422 INFO t Dep 7373 FINA 4477 Course Date Completed 100 SPSS 6/19/2018 100 Surveys 11/3/2019 140 Tax Acc 12/1/2019 110 Java 110 C# 190 Investmts. 150 Java 3/11/2017 150 Oracle 5/19/2018 Em rse 10/23/2020 u o C p 3/21/2018 2/1/2018 22
Here's our final solution! Domain Class Diagram Table Notation Emp. Course 0. . * Employee Emp. ID {key} Name Dept. Code Salary Emp. ID {key} Course {key} Date. Completed Employee (Emp. ID, Name, Dept. Code, Salary) Emp. Course (Emp. ID, Course, Date. Completed) Department (Dept, Dept. Phone. Ext) 1. . 1 Dept 1. . * 1. . 1 Dept. Code {key} Dept. Phone. Ext Denote the foreign keys 23
Summarizing Normalization 24
Normalization 1 NF Every field has Atomic Values. No multi-valued attributes! e. g. Employee(Emp. ID, Last. Name, First. Name, Title, Dependents) 2 NF The table is in 1 NF, and … every non-key attribute is functionally dependent on the entire primary key No Partial Dependencies! e. g. Emp. Course(Emp. ID, Course, Name, Dept. Code, Salary, Dt. Completed, Dept. Phone. Ext) 3 NF The table is in 2 NF, and … each non-key attribute is not functionally dependent on another non-key attribute No Transitive Dependencies! e. g. Employee(Emp. ID, Name, Dept. Code, Salary, Dept. Phone. Ext) If Emp. ID Dept. Code & Dept. Code Dept. Phone. Ext, Red text - examples of violations of Normalization. then Emp. ID Dept. Phone. Ext 25
Additional Practice Problems 26
Practice Problems v. Work all problems on the remaining slides and determine the solution for each Normal Form 1 NF: … 2 NF: … 3 NF: … List your solution in table notation as shown below: − Primary keys – underlined, Foreign keys – italicized − For example: • Student (Stud. ID, Last. Name, First. Name, Major. Code) • Major (Major. Code, Name) 27
Practice Problem #1 - Order. Product Table Order No Prod No Cust No Name Addr City St Zip Order. Dt Promised Dt Desc Qty Ord Unit Price 61384 M 128 1273 Cont. Designs 123 Oak Austin TX 28384 11/04/2020 11/21/2020 Bookcase 4 200 61384 B 381 1273 Cont. Designs 123 Oak Austin TX 28384 11/04/2020 11/21/2020 Cabinet 2 150 61384 R 210 1273 Cont. Designs 123 Oak Austin TX 28384 11/04/2020 11/21/2020 Table 1 500 62890 A 891 3891 J Consultants 523 Pine Waco TX 76712 11/15/2020 11/21/2020 Chair 2 300 62890 M 128 3891 J Consultants 523 Pine Waco TX 76712 11/15/2020 11/21/2020 Bookcase 8 200 63129 A 891 1273 Cont. Designs 123 Oak Austin TX 28384 12/10/2020 12/29/2020 Chair 6 300 Ord. Prod (Order. No, Prod. No, Cust. No, Name, Addr, City, St, Zip, Order. Dt, Promised. Dt, Desc, Qty. Ord, Unit. Price) Step 1 - Put this table in 3 rd Normal Form 1. It’s already in 1 NF …but you need to understand why! 2. Convert this to 2 NF. 3. Convert this to 3 NF. Step 2 - Identify the Foreign Keys 28
Practice Problem #2 - Appointment Table Appt No 1 2 3 4 5 6 7 8 9 10 Appt Date 12/1/2020 12/2/2020 12/2/2020 12/3/2020 Appt Planned Appt Time Duration Type 3: 00 AM 1. 00 Physical 3: 00 AM 0. 25 Shot 3: 15 AM 0. 50 Flu 10: 00 AM 0. 50 Migraine 10: 15 AM 0. 25 Shot 10: 30 AM 0. 25 Shot 10: 45 AM 0. 50 Flu 11: 00 AM 1. 00 Physical 10: 30 AM 1. 00 Physical 9: 00 AM 0. 50 Migraine Patient First ID Nm 466927 Lisa 456789 Sue 194756 Brandon 329657 Marcus 987453 Mike 384788 Tonya 438754 Iliana 345875 Carla 466927 Lisa 345875 Carla Last Nm Garcia Carey Pierre Schwartz Jones Johnson Hnatt Basich Garcia Basich Phone 562 -3456 432 -1234 432 -7877 239 -5502 456 -0202 432 -8806 823 -4303 857 -5566 562 -3456 857 -5566 Doctor ID C 678 A 528 S 626 A 528 G 123 S 626 C 678 A 528 C 678 Doctor Nm Chapman Lopez Smith Lopez Gray Smith Chapman Lopez Chapman Appt (Appt. No, Appt. Dt, Appt. Tm, Planned. Dur, Appt. Type, Patient. ID, First. Nm, Last. Nm, Phone, Doctor. ID, Doctor. Nm) Step 1 - Put this table in 3 rd Normal Form 1. It’s already in 1 NF, and 2 NF …but you need to understand why! 2. Convert this to 3 NF. Step 2 - Identify the Foreign Keys 29
Practice Problem #3 - Invoice Table Invoice. No 1 2 3 4 5 Invoice. Date 01/15/2020 03/21/2020 03/25/2020 06/02/2020 08/19/2020 Item. No 100, 200 100 300 500 300, 200, 100 Cust. ID 466927 456789 194756 329657 987453 Last. Name Garcia Carey Pierre Schwartz Jones First. Name Lisa Sue Brandon Marcus Mike Step 1 - Put this table in 3 rd Normal Form 1. Convert this to 1 NF. 2. Convert this to 2 NF. 3. Convert this to 3 NF. Step 2 - Identify the Foreign Keys 30
Practice Problems – from the textbook v. Normalize these examples from Chapter 9 in the textbook: Exercise 6 Exercise 7 31
- Slides: 31