Functional Dependencies and Normalization Normalization n n Normalization

  • Slides: 30
Download presentation
Functional Dependencies and Normalization

Functional Dependencies and Normalization

Normalization n n Normalization is a formalized procedure to eliminating redundancy from data by

Normalization n n Normalization is a formalized procedure to eliminating redundancy from data by the progressive use of ‘non-lose decomposition’, which involves splitting records without losing information. In reducing the data model to the state where each bit of information is only held in one place, the update process is much simpler, more efficient and inconsistencies in the database are impossible.

Normalization (cont. ) 5 NF 4 NF 3 NF 2 NF 1 NF Redundancy

Normalization (cont. ) 5 NF 4 NF 3 NF 2 NF 1 NF Redundancy

Normalization (cont. ) n n Normalization is based on the idea that an attribute

Normalization (cont. ) n n Normalization is based on the idea that an attribute may depend on another attribute in some way. There are 2 different kinds of dependencies involved up to 5 NF n n Functional dependency Multivalued dependence

Functional Dependence S#, P# S# CITY P# QTY S 1 Khon Kaen P 1

Functional Dependence S#, P# S# CITY P# QTY S 1 Khon Kaen P 1 100 S 1 Khon Kaen P 2 100 S 2 Saraburii P 1 200 S 2 Saraburii P 2 200 S 3 Saraburii P 2 300 S 4 Bangkok P 2 400 S 4 Bangkok P 4 400 S 4 Bangkok P 5 400 QTY is functionally dependent on S#and. P# S# and P# are the determinant of QTY

Functional Dependence n n In a relation including attribute A and B, B is

Functional Dependence n n In a relation including attribute A and B, B is functional dependent on A if, for every valid occurrence, the value A determines the value B An occurrence can not be used to show that a dependency is true, only that it is false n n n A and B can be composite If B is ‘Functional Dependent on’ A, then A ‘is the determinant of B’ All fields are functionally dependent on the primary key – or indeed any candidate key – be definition.

First Normal Form n A relation is in First Normal form if, and only

First Normal Form n A relation is in First Normal form if, and only if, it contains no multivalue or no repeating groups.

First NF (cont. ) NO Name Province Pay. Date 1 Amount 1 Pay. Date

First NF (cont. ) NO Name Province Pay. Date 1 Amount 1 Pay. Date 2 E 001 Somchai Khon Kaen 15/04/2004 5, 000. 00 30/04/2004 5, 000. 00 E 002 Sompong Sarakham 15/04/2004 4, 500. 00 30/04/2004 4, 500. 00 E 003 Somchay Ubon 15/04/2004 5, 200. 00 30/04/2004 5, 200. 00 Repeat Amount 2

Problem Multi-value Staff ENO Name Dno Dept. Name Proj. No E 001 Somchai D

Problem Multi-value Staff ENO Name Dno Dept. Name Proj. No E 001 Somchai D 01 Physic P 01, P 02 NMR, Laser E 002 Sompong D 01 Physic E 003 Somchay D 02 Computer Science P 04, P 05 Voice ordering, Speech Coding E 004 Som. Siri D 02 Computer Science P 04, P 06 Voice ordering, Speech Synthesis Problem 1. Difficult to manipulate data 2. Redundancy • UPDATE ANOMALIES P 03 Proj. Name Medical Image processing

Insert P 06 Speech Corpus Staff ENO Name Dno Dept. Name Proj. No E

Insert P 06 Speech Corpus Staff ENO Name Dno Dept. Name Proj. No E 001 Somchai D 01 Physic P 01, P 02 NMR, Laser E 002 Sompong D 01 Physic E 003 Somchay D 02 Computer Science P 04, P 05 Voice ordering, Speech Coding E 004 Som. Siri D 02 Computer Science P 04, P 06 Voice ordering, Speech Synthesis P 03 Proj. Name Medical Image processing We can not insert new project if the project has not assigned to any employee yet.

UPDATE ANOMALIES Staff ENO Name Dno Dept. Name Proj. No E 001 Somchai D

UPDATE ANOMALIES Staff ENO Name Dno Dept. Name Proj. No E 001 Somchai D 01 Physic P 01, P 02 NMR, Laser E 002 Sompong D 01 Physic E 003 Somchay D 02 Computer Science P 04, P 05 Voice ordering, Speech Coding E 004 Som. Siri D 02 Computer Science P 04, P 06 Voice ordering, Speech Synthesis P 03 Proj. Name Medical Image processing Change Proj. Name from Voice Ordering to Speech Ordering need to change all in Database

DELETE Problem Staff ENO Name Dno Dept. Name Proj. No E 001 Somchai D

DELETE Problem Staff ENO Name Dno Dept. Name Proj. No E 001 Somchai D 01 Physic P 01, P 02 NMR, Laser E 002 Sompong D 01 Physic E 003 Somchay D 02 Computer Science P 04, P 05 Voice ordering, Speech Coding E 004 Som. Siri D 02 Computer Science P 04, P 06 Voice ordering, Speech Synthesis P 03 Delete Employee E 003 Somchay Project P 03 Medical Image Processing was deleted also Proj. Name Medical Image processing

Solution n n Remove the repeating group In case of multi-valued n n n

Solution n n Remove the repeating group In case of multi-valued n n n Create new relation Columns = Key + multi-valued Take its determinant with it

Repeating group Employee ENO Name Province Pay. Date 1 E 001 Somchai Khon Kaen

Repeating group Employee ENO Name Province Pay. Date 1 E 001 Somchai Khon Kaen 15/04/2004 5, 000. 00 30/04/2004 5, 000. 00 E 002 Sompong Sarakham 15/04/2004 4, 500. 00 30/04/2004 4, 500. 00 E 003 Somchay Ubon 15/04/2004 5, 200. 00 30/04/2004 5, 200. 00 Amount 2 Pay. Check Employee ENO Amount 1 Pay. Date 2 Name Province ENO Pay. Date Amount E 001 15/04/2004 5, 000. 00 E 001 Somchai Khon Kaen E 001 30/04/2004 5, 000. 00 E 002 Sompong Sarakham E 002 15/04/2004 4, 500. 00 E 003 Somchay Ubon E 002 30/04/2004 4, 500. 00 E 003 15/04/2004 5, 200. 00 E 003 30/04/2004 5, 200. 00

Multi-Valued Staff ENO Name E 001 Somchai E 002 Dno Dept. Name Proj. No

Multi-Valued Staff ENO Name E 001 Somchai E 002 Dno Dept. Name Proj. No Proj. Name D 01 Physic P 01, P 02 NMR, Laser Sompong D 01 Physic P 03 E 003 Somchay D 02 Computer Science P 04, P 05 Voice ordering, Speech Coding E 004 Som. Siri D 02 Computer Science P 04, P 06 Voice ordering, Speech Synthesis Medical Image processing

Multi-Valued ENO Name Dno Dept. Name Proj. No Proj. Name E 001 Somchai D

Multi-Valued ENO Name Dno Dept. Name Proj. No Proj. Name E 001 Somchai D 01 Physic P 01 NMR E 001 Somchai D 01 Physic P 02 Laser E 002 Sompong D 01 Physic P 03 Medical Image processing E 003 Somchay D 02 Computer Science P 05 Voice ordering E 003 Somchay D 02 Computer Science P 04 Speech Coding E 004 Som. Siri D 02 Computer Science P 04 Voice ordering E 004 Som. Siri D 02 Computer Science P 06 Speech Synthesis Insert Project still has problem

Second Normal Form (2 NF) n A relation is in first normal form if

Second Normal Form (2 NF) n A relation is in first normal form if and only if n n It is in 1 NF Every non-key attribute is dependent on all parts of the primary key.

Staff 2 NF ? ENO Name Dno E 001 Somchai D 01 Physic P

Staff 2 NF ? ENO Name Dno E 001 Somchai D 01 Physic P 01 NMR E 001 Somchai D 01 Physic P 02 Laser E 002 Sompong D 01 Physic P 03 Medical Image processing E 003 Somchay D 02 Computer Science P 05 Voice ordering E 003 Somchay D 02 Computer Science P 04 Speech Coding E 004 Som. Siri D 02 Computer Science P 04 Voice ordering E 004 Som. Siri D 02 Computer Science P 06 Speech Synthesis KEY = ENO + Proj. No Dept. Name Proj. No Proj. Name Answer is No. Because Proj. No is dependent on Proj. No. (not all part of Key)

Problem ENO Name Dno Dept. Name Proj. No Proj. Name E 001 Somchai D

Problem ENO Name Dno Dept. Name Proj. No Proj. Name E 001 Somchai D 01 Physic P 01 NMR E 001 Somchai D 01 Physic P 02 Laser E 002 Sompong D 01 Physic P 03 Medical Image processing E 003 Somchay D 02 Computer Science P 05 Voice ordering E 003 Somchay D 02 Computer Science P 04 Speech Coding E 004 Som. Siri D 02 Computer Science P 04 Voice ordering E 004 Som. Siri D 02 Computer Science P 06 Speech Synthesis We can not insert Project if have not yet assigned project to any employee

Solution n n Remove the attribute involved Take its determinant with it

Solution n n Remove the attribute involved Take its determinant with it

Normalize ENO Name Dno Dept. Name Proj. No Proj. Name E 001 Somchai D

Normalize ENO Name Dno Dept. Name Proj. No Proj. Name E 001 Somchai D 01 Physic P 01 NMR E 001 Somchai D 01 Physic P 02 Laser E 002 Sompong D 01 Physic P 03 Medical Image processing E 003 Somchay D 02 Computer Science P 05 Voice ordering E 003 Somchay D 02 Computer Science P 04 Speech Coding E 004 Som. Siri D 02 Computer Science P 04 Voice ordering E 004 Som. Siri D 02 Computer Science P 06 Speech Synthesis

Result Project PERSON ENO Name Dno Dept. Na me E 001 Somchai D 01

Result Project PERSON ENO Name Dno Dept. Na me E 001 Somchai D 01 Physic E 003 Somchay D 02 Computer Science E 004 Som. Siri D 02 Computer Science Proj. Name No PERSON_Proj ENO Proj No P 01 NMR E 001 P 02 Laser E 001 P 02 P 03 Medical Image processing E 002 P 03 E 003 P 04 E 004 P 05 E 004 P 06 P 04 Speech Coding P 05 Voice ordering P 06 Speech Synthesis PERSON(ENO, NAME, Dno, Dept. Name) PROJECT(Proj. No, Proj. Name) PERSON_PROJ(ENO, Proj. No)

Third Normal Form n A relation is in 3 NF if, and only if:

Third Normal Form n A relation is in 3 NF if, and only if: n n n It is in 2 NF Every non-key attribute is functionally dependent upon the key. (No non-key attribute is functional dependent on another non-key attribute) Or non-key attribute no transitive dependent on key

Transitive dependent n n R(A, B, C, D) ; A is Key, others are

Transitive dependent n n R(A, B, C, D) ; A is Key, others are non- key If A → B and B → C can say A → B → C (C transitive dependent on A)

3 NF? Project PERSON ENO Name Dno Dept. Na me E 001 Somchai D

3 NF? Project PERSON ENO Name Dno Dept. Na me E 001 Somchai D 01 Physic E 003 Somchay D 02 Computer Science E 004 Som. Siri D 02 Computer Science Answer is No Because Dept. Name is dependent on Dno (has transitive dependent on key) Proj. Name No PERSON_Proj ENO Proj No P 01 NMR E 001 P 02 Laser E 001 P 02 P 03 Medical Image processing E 002 P 03 E 003 P 04 E 004 P 05 E 004 P 06 P 04 Speech Coding P 05 Voice ordering P 06 Speech Synthesis

Solution n n Remove the offending attributes Take the determinant along

Solution n n Remove the offending attributes Take the determinant along

Result PERSON ENO Name E 001 Somchai D 01 E 003 Somchay D 02

Result PERSON ENO Name E 001 Somchai D 01 E 003 Somchay D 02 E 004 Som. Siri D 02 Department Dno Project Proj. Name No PERSON_Proj ENO Proj No P 01 NMR E 001 P 02 Laser E 001 P 02 P 03 Medical Image processing E 002 P 03 E 003 P 04 E 004 P 05 E 004 P 06 Dept. Na me P 04 Speech Coding D 01 Physic P 05 Voice ordering D 02 Computer Science P 06 Speech Synthesis D 02 Computer Science

Note n n n The third normal form is often reached in practice by

Note n n n The third normal form is often reached in practice by inspection, in a single step. Its meaning seems intuitively clear; it represents a formalization of designer’s common sense. This level of normalization is widely accepted as the initial target for a design which eliminates redundancy. However, there are higher normal forms which, although less frequently invoked, highlight further redundancy problems which may affect the designer

Boyce-Codd Normal Form (BCNF) n n A relation is in BCNF if, and only

Boyce-Codd Normal Form (BCNF) n n A relation is in BCNF if, and only if, every determinant is a candidate key. BCNF is a refinement to third normal form, and tightens its duration.

Multivalued Dependence n In a relation including attribute A, B and C, B is

Multivalued Dependence n In a relation including attribute A, B and C, B is multivalued dependent on A if the set of B values matching a given A+C value pair, depends only on the A value.