Chapter 8 Normalization Outline Modification anomalies l Functional

  • Slides: 28
Download presentation
Chapter 8 Normalization

Chapter 8 Normalization

Outline Modification anomalies l Functional dependencies l Major normal forms l Relationship independence l

Outline Modification anomalies l Functional dependencies l Major normal forms l Relationship independence l Practical concerns l Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

Modification Anomalies Unexpected side effect l Insert, modify, and delete more data than desired

Modification Anomalies Unexpected side effect l Insert, modify, and delete more data than desired l Caused by excessive redundancies l Strive for one fact in one place l Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

Big University Database Table Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All

Big University Database Table Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

Functional Dependencies Constraint on the possible rows in a table l Value neutral like

Functional Dependencies Constraint on the possible rows in a table l Value neutral like FKs and PKs l Asserted l Understand business rules l Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

FD Definition X Y l X (functionally) determines Y l X: left-hand-side (LHS) or

FD Definition X Y l X (functionally) determines Y l X: left-hand-side (LHS) or determinant l For each X value, there is at most one Y value l Similar to candidate keys l Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

FD Diagrams and Lists Std. SSN Std. City, Std. Class Offer. No Off. Term,

FD Diagrams and Lists Std. SSN Std. City, Std. Class Offer. No Off. Term, Off. Year, Course. No, Crs. Desc Course. No Crs. Desc Std. SSN, Offer. No Enr. Grade Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

FDs in Data • Prove non-existence (but not existence) by looking at data •

FDs in Data • Prove non-existence (but not existence) by looking at data • Two rows that have the same X value but a different Y value Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

Normalization Process of removing unwanted redundancies l Apply normal forms l – Identify FDs

Normalization Process of removing unwanted redundancies l Apply normal forms l – Identify FDs – Determine whether FDs meet normal form – Split the table to meet the normal form if there is a violation Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

Relationships of Normal Forms Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All

Relationships of Normal Forms Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

1 NF Starting point for SQL 2 databases l No repeating groups: flat rows

1 NF Starting point for SQL 2 databases l No repeating groups: flat rows l Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

Combined Definition of 2 NF/3 NF Key column: candidate key or part of candidate

Combined Definition of 2 NF/3 NF Key column: candidate key or part of candidate key l Analogy to the traditional justice oath l Every nonkey depends on a key, the whole key, and nothing but the key l Usually taught as separate definitions l Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

2 NF Every nonkey column depends on a whole key, not part of a

2 NF Every nonkey column depends on a whole key, not part of a key l Violations l – Part of key nonkey – Violations only for combined keys Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

2 NF Example l Many violations for the big university database table – Std.

2 NF Example l Many violations for the big university database table – Std. SSN Std. City, Std. Class – Offer. No Off. Term, Off. Year, Course. No, Crs. Desc l Splitting the table – Univ. Table 1 (Std. SSN, Std. City, Std. Class) – Univ. Table 2 (Offer. No, Off. Term, Off. Year, Course. No, Crs. Desc) Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

3 NF Every nonkey column depends only on a key not on nonkey columns

3 NF Every nonkey column depends only on a key not on nonkey columns l Violations: Nonkey l Alternative formulation l – No transitive FDs – A B, B C then A C – Offer. No Course. No, Course. No Crs. Desc then Offer. No Crs. Desc Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

3 NF Example l One violation in Univ. Table 2 – Course. No Crs.

3 NF Example l One violation in Univ. Table 2 – Course. No Crs. Desc l Splitting the table – Univ. Table 2 -1 (Offer. No, Off. Term, Off. Year, Course. No, Crs. Desc) – Univ. Table 2 -2 (Course. No, Crs. Desc) Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

BCNF Every determinant must be a candidate key l Simpler definition l Apply with

BCNF Every determinant must be a candidate key l Simpler definition l Apply with simple synthesis procedure l Special case not covered by 3 NF l – Part of key – Special case is not common Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

BCNF Example l Many violations for the big university database table – Std. SSN

BCNF Example l Many violations for the big university database table – Std. SSN Std. City, Std. Class – Offer. No Off. Term, Off. Year, Course. No, Crs. Desc – Course. No Crs. Desc l Splitting into four tables Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

Simple Synthesis Procedure 1. Eliminate extraneous columns from the LHSs. 2. Remove derived FDs.

Simple Synthesis Procedure 1. Eliminate extraneous columns from the LHSs. 2. Remove derived FDs. 3. Arrange the FDs into groups with each group having the same determinant. 4. For each FD group, make a table with the determinant as the primary key. 5. Merge tables in which one table contains all columns of the other table. Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

Simple Synthesis Example Step 1: no extraneous columns l Step 2: eliminate Offer. No

Simple Synthesis Example Step 1: no extraneous columns l Step 2: eliminate Offer. No Crs. Desc l Step 3: already arranged by LHS l Step 4: four tables (Student, Enrollment, Course, Offering) l Step 5: no redundant tables l Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

Relationship Independence and 4 NF M-way relationship that can be derived from binary relationships

Relationship Independence and 4 NF M-way relationship that can be derived from binary relationships l Split into binary relationships l Specialized problem l 4 NF does not involve FDs l Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

Relationship Independence Problem Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights

Relationship Independence Problem Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

Relationship Independence Solution Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights

Relationship Independence Solution Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

MVDs and 4 NF l MVD: difficult to identify – A B | C

MVDs and 4 NF l MVD: difficult to identify – A B | C (multi-determines) – A associated with a collection of B and C values – B and C are independent – Nontrivial MVD: not also an FD l 4 NF: no nontrivial MVDs Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

Higher Level Normal Forms 5 NF for M-way relationships l DKNF: absolute normal form

Higher Level Normal Forms 5 NF for M-way relationships l DKNF: absolute normal form l DKNF is an ideal, not a practical normal form l Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

Role of Normalization l Refinement – Use after ERD – Apply to table design

Role of Normalization l Refinement – Use after ERD – Apply to table design or ERD l Initial design – Record attributes and FDs – No initial ERD – May reverse engineer an ERD Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

Normalization Objective Update biased l Not a concern for databases without updates (data warehouses)

Normalization Objective Update biased l Not a concern for databases without updates (data warehouses) l Denormalization l – Purposeful violation of a normal form – Some FDs may not cause anomalies – May improve performance Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.

Summary Beware of unwanted redundancies l FDs are important constraints l Strive for BCNF

Summary Beware of unwanted redundancies l FDs are important constraints l Strive for BCNF l Use a CASE tool for large problems l Important tool of database development l Focus on the normalization objective l Mc. Graw-Hill/Irwin © 2001 The Mc. Graw-Hill Companies, Inc. All rights reserved.