Database Normalization Database Normalization What are the problems
Database Normalization
Database Normalization What are the problems here?
Database Normalization Entity Attributes and their types Redundant Data Anomaly and its types Normalization and its forms
Normalization Entity Any thing about which data is stored. Examples: Person, Company, Event
Normalization Entity Property of an entity An entity is represented by a set of attributes. field
Example Attributes Student Id Name Address STUDENT Entity Phone No
Attribute Types Simple(component) attributes Composite attributes Single-valued attributes Multi-valued attributes Derived attributes Key attributes Non-Key attributes
Composite Attribute Name First Name Middle Name Last Name Which one is Composite Attribute?
Domain The set (range) of allowed values for an attribute Roll No. Grade 1 A 2 B 3 F 4 C Domain for Grade attribute { A, B, C, D, F }
Redundant Data Employee Table EMPID NAME 100 Asad 140 DEPT Marketing SALARY COURSE GRADE 42000 IT A Waheed Accounting 39000 Accounting A 100 Asad Marketing 42000 Management B 110 Majid Info Systems 41500 DBMS A
Anomaly An anomaly is an inconsistent or contradictory state of the database.
Anomalies Types of Anomalies Insertion Anomaly l Deletion Anomaly l Update Anomaly l
Insertion Anomaly Occurrence of an error or inconsistency due to the insertion of new data in a table
Example Insertion Anomaly Employee Table EMPID NAME DEPT 100 Asad Marketing 140 Waheed Accounting … … … 110 Majid Info Systems 100 Asad Marketing SALARY COURSE GRADE 42000 IT A 39000 Accounting A … … 41500 DBMS A 4000 Management B …
Deletion Anomaly Occurrence of an error or inconsistency due to the deletion of data in a table A valid fact is lost when a record is deleted.
Example Deletion Anomaly Employee Table EMPID NAME DEPT SALARY COURSE GRADE 100 Asad Marketing 42000 SPSS A 100 Asad Marketing 42000 Management B 140 Waheed Accounting 39000 Accounting A 110 Majid 41500 Intro. to IT B Info Systems
Update Anomaly Occurrence of an error or inconsistency due to the modification of data in a table One occurrence of a fact is changed, but not all occurrences.
Example Update Anomaly Employee Table EMPID NAME DEPT 100 Asad Marketing 140 Waheed Accounting … … … 110 Majid Info Systems 100 Asad Marketing SALARY COURSE GRADE 42000 SPSS A 39000 Accounting A … … 41500 Intro. to IT B 45000 Management B …
Normalization Basic Idea 1. Divide a table into two or more tables and 2. Define relationships between the tables. Objectives Minimize data duplication Efficient data retrieval Less space Less errors
Normalization Two basic conditions 1. No redundant data 2. No anomalies.
Normal Forms 7 Levels / Stages of normalization 1 st three levels often used
7 Normal Forms Unnormalized data 1 NF 2 NF 3 NF BCNF 4 NF 5 NF (PJNF) DKNF Normalized data
Steps in normalization Table with repeating groups Remove Repeating Groups 1 st Normal Form Partial Dependencies 2 nd Normal Form Transitive Dependencies 3 rd Normal Form Determinant / Candidate Keys Boyce-Codd Normal Form Multivalued Dependencies 4 th Normal Form Removing Remaining anomalies 5 th Normal Form (Project Join NF) Removing Remaining anomalies Domain Key Normal Form
Result Card Example Card No. : 1 Issue date: 3/4/8 Student ID: 111 Student Name: ABC Student Address: 45 -C, Lahore. Course ID Course Name Total Marks Obtained Marks 101 W 100 70 102 X 100 50 103 Y 100 80 104 Z 100 60 1 st Representation Grand Total: 260
Before Normalization Crd_No Is_Date S_ID SName Addr C_ID C_Name TMrks OMrks GTtl 1 3/4/8 111 ABC 45 -C 101 W 100 70 260 1 3/4/8 111 ABC 45 -C 102 X 100 50 260 1 3/4/8 111 ABC 45 -C 103 Y 100 80 260 1 3/4/8 111 ABC 45 -C 104 Z 100 60 2 3/4/8 95 MNO 40 -D 103 Y 100 70 230 2 3/4/8 95 MNO 40 -D 104 Z 100 60 230 … … … … … Redundant Data 2 nd Representation
1 st Normal Form There are no repeating attribute-groups.
Before Normalization Crd. No Is. Date SId SName Adr CId CName TMrks OMrks GTtl 1 3/4/8 111 ABC 45 C 101 W 100 70 102 X 100 50 103 Y 100 80 260 2 3/4/8 95 MNO 40 D 103 Y 100 70 104 Z 100 60 … … … … Best Approach 3 rd Representation 230 … … …
Before Normalization Crd_No Is_Date S_ID SName Addr C_ID C_Name TMrks OMrks GTtl 1 3/4/8 111 ABC 45 -C 101 W 100 70 260 1 3/4/8 111 ABC 45 -C 102 X 100 50 260 1 3/4/8 111 ABC 45 -C 103 Y 100 80 260 1 3/4/8 111 ABC 45 -C 104 Z 100 60 2 3/4/8 95 MNO 40 -D 103 Y 100 70 230 2 3/4/8 95 MNO 40 -D 104 Z 100 60 230 … … … … … Crd_No Is_Date S_ID SName 1 3/4/8 111 ABC 45 -C 101 102 103 104 W X Y Z 100 100 70 50 80 60 2 3/4/8 95 MNO 40 -D 103 104 Y Z 100 70 60 230 … … … … … 4 th Representation Addr C_ID C_Name TMrks OMrks GTtl
1 st Normal Form Process Identify the primary key of denormalized table. 2. Repeating attribute-groups of the table are separated out in form of a new table. 3. Primary key of old table is added into new table as foreign key. 4. Identify the primary key of new table. 1.
Repeating Attributes Card No. : 1 Issue date: 3/4/8 Student ID: 111 Student Name: ABC Student Address: 45 -C, Lahore. Course ID Course Name Total Marks Obtained Marks 101 W 100 70 102 X 100 50 103 Y 100 80 104 Z 100 60 Grand Total: 260 Repeating attribute-groups
Repeating Attributes Card No. : 1 Issue date: 3/4/8 Student ID: 111 Student Address: 45 -C, Lahore. Student Name: ABC Course ID Course Name Total Marks Obtained Marks 101 W 100 70 Course ID Course Name Total Marks Obtained Marks 102 X 100 50 Course ID Course Name Total Marks Obtained Marks 103 Y 100 80 Course ID Course Name Total Marks Obtained Marks 104 Z 100 60 5 th Representation Grand Total: 260
1 st Normal Form Card No PK Issue Date Student ID Student Name Student Address Grand Total Card No FK CP Course ID Course Name Total Marks Obtained Marks Table-1 Table-2 }
1 st Normal Form Card No PK Issue Date Student ID Student Name Student Address Grand Total Card No FK CP Course ID Course Name Total Marks Obtained Marks Table-1 Table-2 }
2 nd Normal Form There is no partial dependency. partial Partial dependency: An attribute depends upon the part of the key, but not on the whole key.
2 nd Normal Form Conditions 1. Relation is in 1 NF. 2. All non-key attributes should depend upon Key attribute. 3. Implementable if Compound Key Exist.
2 nd Normal Form Process 1. All non-key attributes that depend upon part of key are separated out in a new table along with part of key on which these attributes depend. 2. Identify the primary key of new table and add it back to the old table as foreign key.
2 nd Normal Form Card No FK CP Course ID Course Name Total Marks Obtained Marks }
2 nd Normal Form Card No FK CP Course ID Course Name Total Marks Obtained Marks } } Card No FK CP Course ID FK Obtained Marks Course ID PK Course Name Total Marks
3 rd Normal Form There is no transitive dependency. Transitive dependency: A non-key attribute depends upon another non-key attribute.
3 rd Normal Form Conditions 1. Relation is in 2 NF. 2. All non-key attributes depend on key attribute.
3 rd Normal Form Process 1. All non-key attributes depending on any non- key attribute are separated out in form of new table. 2. Identify the primary key of new table and add it back to the old table as foreign key.
3 rd Card No PK Issue Date Student ID Student Name Student Address Grand Total Normal Form Card No PK Issue Date Student ID FK Grand Total Student ID PK Student Name Student Address Any thing wrong here?
Normalization Over all Result } Card No FK CP Course ID FK Obtained Marks Card No PK Issue Date Student ID FK Grand Total Student ID PK Student Name Student Address Course ID PK Course Name Total Marks
Practice Exercise Invoice no. : 1 Invoice date: 3/4/8 Customer ID: 111 Customer Name: ABC Customer Address: 45 -C, Lahore Phone : 3333 P_ID P_Name QTY Price Total 101 W 3 20 60 102 X 2 10 20 103 Y 2 30 60 104 Z 1 60 60 Invoice Total: 200
- Slides: 44