Functional Dependencies Outline Functional dependencies 3 4 Rules

  • Slides: 36
Download presentation
Functional Dependencies

Functional Dependencies

Outline • Functional dependencies (3. 4) • Rules about FDs (3. 5) • Design

Outline • Functional dependencies (3. 4) • Rules about FDs (3. 5) • Design of a Relational schema (3. 6)

Relational Schema Design Conceptual Model: name Product price Relational Model: plus FD’s Normalization: Eliminates

Relational Schema Design Conceptual Model: name Product price Relational Model: plus FD’s Normalization: Eliminates anomalies Person buys name ssn

Functional Dependencies Definition: A 1, . . . , Am B 1, . .

Functional Dependencies Definition: A 1, . . . , Am B 1, . . . , Bn holds in R if: t, t’ R, (t. A 1=t’. A 1 . . . t. Am=t’. Am t. B 1=t’. B 1 . . . t. Bm=t’. Bm ) R A 1 . . . Am B 1 . . . Bm t t’ if t, t’ agree here then t, t’ agree here

Important Point! • Functional dependencies are part of the schema! • They constrain the

Important Point! • Functional dependencies are part of the schema! • They constrain the possible legal data instances. • At any point in time, the actual database may satisfy additional FD’s.

Examples Emp. ID E 0045 E 1847 E 1111 E 9999 Name Smith John

Examples Emp. ID E 0045 E 1847 E 1111 E 9999 Name Smith John Smith Mary Phone 1234 9876 1234 Position Clerk Salesrep Lawyer • Emp. ID Name, Phone, Position • Position Phone • but Phone Position

Formal definition of a key • A key is a set of attributes A

Formal definition of a key • A key is a set of attributes A 1, . . . , An s. t. for any other attribute B, A 1, . . . , An B • A minimal key is a set of attributes which is a key and for which no subset is a key • Note: book calls them superkey and key

Examples of Keys • Product(name, price, category, color) name, category price category color Keys

Examples of Keys • Product(name, price, category, color) name, category price category color Keys are: {name, category} and all supersets • Enrollment(student, address, course, room, time) student address room, time course student, course room, time Keys are: [in class]

Inference Rules for FD’s A 1 , A 2, … An B 1, B

Inference Rules for FD’s A 1 , A 2, … An B 1, B 2, … B m Splitting rule and Combing rule Is equivalent to A 1 , A 2, … An B 1 A 1 , A 2, … An B 2 … A 1 , A 2, … An Bm A 1 . . . Am B 1 . . . Bm

Inference Rules for FD’s (continued) A 1 , A 2, … An A Trivial

Inference Rules for FD’s (continued) A 1 , A 2, … An A Trivial Rule i where i = 1, 2, . . . , n A 1 Why ? . . . Am

Inference Rules for FD’s (continued) Transitive Closure Rule If A 1 , A 2,

Inference Rules for FD’s (continued) Transitive Closure Rule If A 1 , A 2, … An and B 1, B 2, … B m C , C …, C then A 1 , A 2, … An C , C …, C Why ? B , B …, B 1 2 1 1 m 2 2 p p

 • Enrollment(student, major, course, room, time) student major, course room course time What

• Enrollment(student, major, course, room, time) student major, course room course time What else can we infer ? [in class]

Closure of a set of Attributes Given a set of attributes {A 1, …,

Closure of a set of Attributes Given a set of attributes {A 1, …, An} and a set of dependencies S. Problem: find all attributes B such that: any relation which satisfies S also satisfies: A 1, …, An B The closure of {A 1, …, An}, denoted {A 1, …, An} +, is the set of all such attributes B

Closure Algorithm Start with X={A 1, …, An}. Repeat until X doesn’t change do:

Closure Algorithm Start with X={A 1, …, An}. Repeat until X doesn’t change do: if B 1, B 2, … B n B, B, …B 1 2 n C is not in X then add C to X. C is in S, and are all in X, and

Example R(A, B, C, D, E, F) A B A D B A F

Example R(A, B, C, D, E, F) A B A D B A F C E D B Closure of {A, B}: X = {A, B, } Closure of {A, F}: X = {A, F, }

Why Is the Algorithm Correct ? • Show the following by induction: – For

Why Is the Algorithm Correct ? • Show the following by induction: – For every B in X: • A 1, …, An B • Initially X = {A 1, …, An} -- holds • Induction step: B 1, …, Bm in X – Implies A 1, …, An B 1, …, Bm – We also have B 1, …, Bm C – By transitivity we have A 1, …, An C • This shows that the algorithm is sound; need to show it is complete

Relational Schema Design (or Logical Design) Main idea: • Start with some relational schema

Relational Schema Design (or Logical Design) Main idea: • Start with some relational schema • Find out its FD’s • Use them to design a better relational schema

Relational Schema Design Recall set attributes (persons with several phones): Name SSN Phone. Number

Relational Schema Design Recall set attributes (persons with several phones): Name SSN Phone. Number City Fred 123 -45 -6789 206 -555 -1234 Seattle Fred 123 -45 -6789 206 -555 -6543 Seattle Joe 987 -65 -4321 908 -555 -2121 Westfield Joe 987 -65 -4321 908 -555 -1234 Westfield SSN Name, City, but not SSN Phone. Number Anomalies: • Redundancy = repeat data • Update anomalies = Fred moves to “Bellvue” • Deletion anomalies = Fred drops all phone numbers: what is his city ?

Relation Decomposition Break the relation into two: Name SSN City Fred 123 -45 -6789

Relation Decomposition Break the relation into two: Name SSN City Fred 123 -45 -6789 Seattle Joe 987 -65 -4321 Westfield SSN Phone. Number 123 -45 -6789 206 -555 -1234 123 -45 -6789 206 -555 -6543 987 -65 -4321 908 -555 -2121 987 -65 -4321 908 -555 -1234

Relational Schema Design Conceptual Model: name Product price Relational Model: plus FD’s Normalization: Eliminates

Relational Schema Design Conceptual Model: name Product price Relational Model: plus FD’s Normalization: Eliminates anomalies Person buys name ssn

Decompositions in General R(A 1, . . . , An) Create two relations R

Decompositions in General R(A 1, . . . , An) Create two relations R 1(B 1, . . . , Bm) and R 2(C 1, . . . , Cp) such that: B 1, . . . , Bm C 1, . . . , Cp = A 1, . . . , An and: R 1 = projection of R on B 1, . . . , Bm R 2 = projection of R on C 1, . . . , Cp

Incorrect Decomposition • Sometimes it is incorrect: Name Price Category Gizmo 19. 99 Gadget

Incorrect Decomposition • Sometimes it is incorrect: Name Price Category Gizmo 19. 99 Gadget One. Click 24. 99 Camera Double. Click 29. 99 Camera Decompose on : Name, Category and Price, Category

Incorrect Decomposition Name Category Price Category Gizmo Gadget 19. 99 Gadget One. Click Camera

Incorrect Decomposition Name Category Price Category Gizmo Gadget 19. 99 Gadget One. Click Camera 24. 99 Camera Double. Click Camera 29. 99 Camera When we put it back: Cannot recover information Name Price Category Gizmo 19. 99 Gadget One. Click 24. 99 Camera One. Click 29. 99 Camera Double. Click 24. 99 Camera Double. Click 29. 99 Camera

Normal Forms First Normal Form = all attributes are atomic Second Normal Form (2

Normal Forms First Normal Form = all attributes are atomic Second Normal Form (2 NF) = old and obsolete Third Normal Form (3 NF) = this lecture Boyce Codd Normal Form (BCNF) = this lecture Others. . .

Boyce-Codd Normal Form A simple condition for removing anomalies from relations: A relation R

Boyce-Codd Normal Form A simple condition for removing anomalies from relations: A relation R is in BCNF if: Whenever there is a nontrivial dependency A 1, . . . , An B in R , {A 1, . . . , An} is a key for R In English (though a bit vague): Whenever a set of attributes of R is determining another attribute, should determine all the attributes of R.

Example Name SSN Phone. Number City Fred 123 -45 -6789 206 -555 -1234 Seattle

Example Name SSN Phone. Number City Fred 123 -45 -6789 206 -555 -1234 Seattle Fred 123 -45 -6789 206 -555 -6543 Seattle Joe 987 -65 -4321 908 -555 -2121 Westfield Joe 987 -65 -4321 908 -555 -1234 Westfield What are the dependencies? SSN Name, City What are the keys? {SSN, Phone. Number} Is it in BCNF?

Decompose it into BCNF Name SSN City Fred 123 -45 -6789 Seattle Joe 987

Decompose it into BCNF Name SSN City Fred 123 -45 -6789 Seattle Joe 987 -65 -4321 Westfield SSN Phone. Number 123 -45 -6789 206 -555 -1234 123 -45 -6789 206 -555 -6543 987 -65 -4321 908 -555 -2121 987 -65 -4321 908 -555 -1234 SSN Name, City

Summary of BCNF Decomposition Find a dependency that violates the BCNF condition: A 1

Summary of BCNF Decomposition Find a dependency that violates the BCNF condition: A 1 , A 2, … An B 1, B 2, … B m Heuristics: choose B 1 , B 2, … Bm“as large as possible” Decompose: Is there a 2 -attribute relation that is not in BCNF ? Others R 1 A’s B’s R 2 Continue until there are no BCNF violations left.

Example Decomposition Person(name, SSN, age, hair. Color, phone. Number) SSN name, age hair. Color

Example Decomposition Person(name, SSN, age, hair. Color, phone. Number) SSN name, age hair. Color Decompose in BCNF (in class): Step 1: find all keys Step 2: now decompose

Other Example • R(A, B, C, D) A • • B, B C Key:

Other Example • R(A, B, C, D) A • • B, B C Key: A, D Violations of BCNF: A B, A C, A BC Pick A BC: split into R 1(A, BC) R 2(A, D) What happens if we pick A B first ?

Correct Decompositions A decomposition is lossless if we can recover: R(A, B, C) Decompose

Correct Decompositions A decomposition is lossless if we can recover: R(A, B, C) Decompose R 1(A, B) R 2(A, C) Recover R’(A, B, C) should be the same as R(A, B, C) R’ is in general larger than R. Must ensure R’ = R

Correct Decompositions • Given R(A, B, C) s. t. A B, the decomposition into

Correct Decompositions • Given R(A, B, C) s. t. A B, the decomposition into R 1(A, B), R 2(A, C) is lossless

3 NF: A Problem with BCNF Unit Company Product FD’s: Unit Company; Company, Product

3 NF: A Problem with BCNF Unit Company Product FD’s: Unit Company; Company, Product Unit So, there is a BCNF violation, and we decompose. Unit Company Unit Product Unit Company No FDs

So What’s the Problem? Unit Company Galaga 99 Bingo UW UW Unit Galaga 99

So What’s the Problem? Unit Company Galaga 99 Bingo UW UW Unit Galaga 99 Bingo Product databases No problem so far. All local FD’s are satisfied. Let’s put all the data back into a single table again: Unit Galaga 99 Bingo Company UW UW Product databases Violates the dependency: company, product -> unit!

Solution: 3 rd Normal Form (3 NF) A simple condition for removing anomalies from

Solution: 3 rd Normal Form (3 NF) A simple condition for removing anomalies from relations: A relation R is in 3 rd normal form if : Whenever there is a nontrivial dependency A 1, A 2, . . . , An B for R , then {A 1, A 2, . . . , An } a super-key for R, or B is part of a key.