Normalization A technique for identifying table structures that
Normalization A technique for identifying table structures that have potential maintenance problems
Normalization z. Normalization is a set of formal conditions that assure that a database is maintainable. z. The results of a well executed normalization process are the same as those of a well planned E-R model
PROCESS OF DATA NORMALIZATION z ELIMINATE REPEATING GROUPS Make a separate table for each set of related attributes and give each table a primary key. z ELIMINATE REDUNDANT DATA If an attribute depends on only part of a multivalued key, remove it to a separate table. z ELIMINATE COLUMNS NOT DEPENDENT ON KEY If attributes do not contribute to a description of the key, remove them to a separate table. Database Programming and Design
PROCESS OF DATA NORMALIZATION z ISOLATE INDEPENDENT MULTIPLE RELATIONSHIPS No table may contain two or more 1: n or n: m relationships that are not directly related. z ISOLATE SEMANTICALLY RELATED MULTIPLE RELATIONSHIPS There may be practical constraints on information that justify separating logically related many-to-many relationships. Database Programming and Design
Anomalies A table anomaly is a structure for which a normal database operation cannot be executed without information loss or full search of the data table Insertion Anomaly z. Deletion Anomaly z. Update or Modification Anomaly
Normal Forms Relational theory defines a number of structure conditions called Normal Forms that assure that certain data anomalies do not occur in a database.
Normal Forms 1 NF 2 NF 3 NF BCNF 4 NF Keys; No repeating groups No partial dependencies No transitive dependencies Determinants are candidate keys No multivalued dependencies
Form (Company Order History) Order # 12003 Date Oct 1, 1997 Oklahoma Retail Company 1111 Asp Norman 1. 2. 3. Description Code Qty Price Amount Footballs Sweat Shirts Shorts 21 44 37 25. 00 15. 00 300 120 Total 6 20 10 570
0 nf: Remove titles and derived quantities Order # 12003 Date Oct 1, 1997 Oklahoma Retail Company 1111 Asp Norman 1. 2. 3. Description Code Qty Price Amount Footballs Sweat Shirts Shorts 21 44 37 25. 00 15. 00 300 120 Total 6 20 10 570
0 Normal Form l l Remove titles and derived quantities Schema notation HISTORY(Cust. Name, Cust. Addr, Cust. City {Order. Num, Order. Date {Prod. Descr, Prod. Code, Qty. Ordered, Order. Price}}
1 st Normal Form Add keys Remove repeating groups
1 st Normal Form l l Add Keys for embedded entities Remove Repeating Groups HISTORY(Cust. ID, Cust. Name, Cust. Addr, Cust. City {Order. Num, Order. Date {Prod. Descr, Prod. Code, Qty. Ordered, Order. Price}}
1 st Normal Form l l Add Keys for embedded entities Remove Repeating Groups Create a table for each embedded entity, from the outside for nested groups Insert foreign keys and junction tables CUSTOMER(Cust. ID, Cust. Name, Cust. Addr, Cust. City) ORDER(Order. Num, Cust. ID, Order. Date {Prod. Descr, Prod. Code, Qty. Ordered, Order. Price})
1 st Normal Form CUSTOMER(Cust. ID, Cust. Name, Cust. Addr, Cust. City) ORDER(Order. Num, Cust. ID, Order. Date) PRODUCT(Prod. Descr, Prod. Code, ) ORDER-PRODUCT(Order. Num, Prod. Code, Qty. Ordered, Order. Price)
1 st Normal Form CUSTOMER ORDER PRODUCT
1 NF (Keys, No Repeating Groups) Table contains multi-valued attributes. TABLE { ATTRIBUTES} ? ? ATTR-TABLE
2 nd Normal Form No partial dependencies (an attribute has a partial dependency if it depends on part of a concatenated key)
2 nd Normal Form ROSTER(Stu. ID, ZAPNum, Student. Name, Course. Title, Course. Grade) z. Remove partial dependencies STUDENT(Stu. ID, Student. Name) SECTION(ZAPNum, Course. Title) STUDENT-SECTION(Stu. ID, ZAPNum, Course. Grade)
2 nd Normal Form ROSTER STUDENT SECTION STUDENT-SECTION
2 NF No partial dependencies Table has data from several connected tables. TABLE ? ?
3 rd Normal Form No transitive dependencies (a transitive dependency is an attribute that depends on other non-key attributes)
3 rd Normal Form Note: a transitive dependency arises when attributes from a second entity appear in a given table. SECTION(ZAPNum, Room. Num, Day, Time, Course. Title, Hours. Credit)
3 rd Normal Form SECTION(ZAPNum, Room. Num, Day, Time, Course. ID , Course. Title, Hours. Credit) SECTION(ZAPNum, Room. Num, Day, Time, Course. ID) COURSE(Course. ID, Course. Title, Hours. Credit)
3 NF No transitive dependencies Table contains data from an embedded entity with non -key attributes. TABLE SUB-TABLE ? ? SUB-TABLE BCNF is the same, but the embedded table may involve key attributes.
Boyce Codd Normal Form Every determinant is a candidate key
BCNF dependenceies are like 3 nf dependencies but they involve some key attributes Note: BCNF often arises when a 1: m relationship is modeled as a m: n relationship
BCNF SALESMAN-CUST(Sales. ID, Cust. ID, Commission) SALESMAN(Sales. ID, Commission) CUSTOMER(Cust. ID, Sales. ID)
4 th Normal Form No multi-valued dependencies
4 th Normal Form Note: 4 th Normal Form violations occur when a triple (or higher) concatenated key represents a pair of double keys
4 th Normal Form
4 th Normal Form Multuvalued dependencies
4 th Normal Form INSTR-BOOK-COURSE(Instr. ID, Book, Course. ID) COURSE-BOOK(Course. ID, Book) COURSE-INSTR(Course. ID, Instr. ID)
4 NF (No multivalued dependencies) Independent repeating groups have been treated as a complex relationship. TABLE TABLE
- Slides: 33