The Relational Data Model Why Relational Model Most

  • Slides: 67
Download presentation
The Relational Data Model

The Relational Data Model

Why Relational Model? • Most of the current DBMS are based on the relational

Why Relational Model? • Most of the current DBMS are based on the relational data model: – simplicity – mathematically based: • expressions (queries) can be analyzed by DBMS • transformed to equivalent expressions automatically (query optimization)

Basics • The relational data model is a particular way of structuring data (relations)

Basics • The relational data model is a particular way of structuring data (relations) • Each relation is a two-dimensional table: title year length film type Star Wars 1977 124 color Mighty Ducks 1991 104 color Wayne’s World 1992 95 color The relation Movies

Attributes, Schemas, and Tuples Attributes title year length film. Type 1977 124 color Mighty

Attributes, Schemas, and Tuples Attributes title year length film. Type 1977 124 color Mighty Ducks 1991 104 color Wayne’s World 1992 95 color Tuple Star Wars The relation Movies Schema: Movies(title, year, length, film. Type)

Domains Each attribute is associated to a domain. Attributes title year length film. Type

Domains Each attribute is associated to a domain. Attributes title year length film. Type 1977 124 color Mighty Ducks 1991 104 color Wayne’s World 1992 95 color Tuple Star Wars The relation Movies Schema: Movies(title, year, length, film. Type)

Requirements • Each component of a tuple is atomic (similar to atomic requirement in

Requirements • Each component of a tuple is atomic (similar to atomic requirement in E/R diagram) • Each attribute is associated with a domain

Equivalent representation of a Relation title year length film. Type Star Wars 1977 124

Equivalent representation of a Relation title year length film. Type Star Wars 1977 124 color Mighty Ducks 1991 104 color Wayne’s World 1992 95 color IS THE SAME AS year length title film. Type 1977 124 Star Wars color 1991 104 Mighty Ducks color 1992 95 Wayne’s World color Movies(title, year, length, film. Type) is equivalent to Movies(year, length, title, film. Type)

Relation Instance • A set of tuples for a relation is an instance. NOTE:

Relation Instance • A set of tuples for a relation is an instance. NOTE: An instance is not a schema. • Instance changes when – a new tuple is added – a tuple is deleted – some attributes of a tuple are modified • Relation schema does not often change but could be changed; when it changes lot of things need to be done – what?

Some Exercises acct. No type balance 12345 savings 1200 23456 savings 100 12347 checking

Some Exercises acct. No type balance 12345 savings 1200 23456 savings 100 12347 checking 25 What are the attributes? What are the tuples? Schema? Domain? first. Name last. Name id. No account Robie Banks 000 -111 12345 Lena Hood 111 -222 23456 Lennon John 3 222 -333 12347

Steps in designing a database (Remember? ) • Analysis: – What information needs to

Steps in designing a database (Remember? ) • Analysis: – What information needs to be stored? – What are the relationships between different components of the stored information? – What is the suitable database structure (or schema)? • Design the database structure (using a database design language or notation suitable for expressing design) • Implementation in DBMS once committed to the design

From E/R Diagrams to Relational Designs • Why ? – We often start out

From E/R Diagrams to Relational Designs • Why ? – We often start out with a E/R diagram and then convert to a relational model • How ? Entity sets Relationships might be Relation with the same set of attributes Relation are the keys of the connecting entity sets What about weak entity set and isa relationships? Can it be simplified?

name address year title Stars_in Movies name length Studios Owns film type address Movies

name address year title Stars_in Movies name length Studios Owns film type address Movies Stars Studio Basic instinct 1990 150 Drama Sharon Stone a 1 Universal Studio b Total recall 1989 120 Mystery Arnold Schwarzenegger a 2 Dream World a Missing something ? Basic instinct 1990 Sharon Stone Total recall 1989 Arnold Schwarzenegger Stars_in Attributes, Schemas!!! Basic instinct 1990 Universal Studio Total recall 1989 Universal Studio Owns

Entity sets to relations • Entity set E with attributes a 1, …, an

Entity sets to relations • Entity set E with attributes a 1, …, an which is not a weak entity set is converted to a schema E(a 1, …, an) • Examples: – Movies(title, year, length, film. Type) – Stars(name, address) – Studios(name, address)

Relationships to relations • R is a relationship is converted to a relation with

Relationships to relations • R is a relationship is converted to a relation with – key attributes of the involved entity sets (those connecting to R) – attributes of R • Examples: – Stars_in(title, year, name) – Owns(title, year, name) • What happens if an entity set E is involved in R more than one time? – the key attributes of E have to appeared as many times as E is involved in R; rename the attributes of E each time they are added

Example Stars Movies Contracts Studio of star Producing studio Studios Contracts(title, year, name_producing_studio, name_studio_of_star)

Example Stars Movies Contracts Studio of star Producing studio Studios Contracts(title, year, name_producing_studio, name_studio_of_star)

Combining Relations • So far: rules for converting E/R diagrams to relations • Sometimes:

Combining Relations • So far: rules for converting E/R diagrams to relations • Sometimes: not optimal F A F(A, B) E R C B R(A, C) Because C determines A D E(C, D) ER(C, D, A)

Combining Relations • E is an entity set, R is a many-one relationship from

Combining Relations • E is an entity set, R is a many-one relationship from E to F E and R can be combined into one relation • Attributes of the relation that combines E and R consists of – all attributes of E – the key attributes of F – all attributes of R • This cannot be done if R is a many-many relationship

Weak Entity Sets to Relations • W – weak entity set: three components –

Weak Entity Sets to Relations • W – weak entity set: three components – entity set W – relationship from W to another entity set (single border) – supporting relationship (double border) • the relation corresponds to W have the following attributes: – attributes of W – attributes of other entity sets that help form the key of W • if R is a relationship connecting W then attributes of R must contain the key attributes of W • if R is a supporting relationship from W to another entity set then no relation corresponds to R is needed

Example name Crews number Unit_of Studios address Crews(number, name) & Studios(name, address) Unit_of(number, name.

Example name Crews number Unit_of Studios address Crews(number, name) & Studios(name, address) Unit_of(number, name. Studio) Redundant

Subclasses to Relations • Assumptions about isa-hierarchy: – there is a root entity set

Subclasses to Relations • Assumptions about isa-hierarchy: – there is a root entity set for the hierarchy – the root’s key identifies every entity in the hierarchy – an entity set in the hierarchy might have attributes belonging to the different entity sets in the hierarchy • Several choices: – E/R viewpoint: each entity set in the hierarchy corresponds to a relation – Object-oriented viewpoint: each subtree corresponds to a relation – Use null values: the whole tree corresponds to a relation

Movies isa Murder Mysteries Cartoons isa Children isa E/R: Relations correspond to Movies, Cartoons,

Movies isa Murder Mysteries Cartoons isa Children isa E/R: Relations correspond to Movies, Cartoons, Murder_Mysteries, Children, Story_Books Object-oriented: Relations correspond to the subtrees with the root: e. g. Cartoons isa Story Books Null values: One relation: Movies

E/R style conversion year Stars Voices Cartoons title length film type Movies weapon isa

E/R style conversion year Stars Voices Cartoons title length film type Movies weapon isa Murder Mysteries Movies(title, year, length, film. Type) Murder. Mysteries(title, year, weapon) Cartoons(title, year) No relations for isa

OO conversion year Stars Voices Cartoons title length film type Movies 4 subtrees: Movies

OO conversion year Stars Voices Cartoons title length film type Movies 4 subtrees: Movies & Murder. Mysteries Movies & Cartoons All three weapon isa Murder Mysteries Movies(title, year, length, film. Type) Movies. Murder. Mysteries(title, year, length, film. Type, weapon) Movies. Cartoons(title, year, length, film. Type) Movies. All(title, year, length, film. Type, weapon) No relations for isa

Null values year Stars Voices Cartoons title length film type Movies weapon isa Murder

Null values year Stars Voices Cartoons title length film type Movies weapon isa Murder Mysteries Movies(title, year, length, film. Type, weapon) No relations for isa

Comparison • It depends, again - because each approach has its own advantages and

Comparison • It depends, again - because each approach has its own advantages and disadvantages! • Query answering: – Null values: good for queries related to several entity sets in the hierarchy – E/R: good for queries related to one single entity – OO: good for queries related to a subhierarchy • Number of relations: null – only one, E/R – number of entities, OO – exponential • Space: OO is worse, depending on specific situation: null more or less E/R

Relational Design • So far: – how to get relational designs from E/R diagrams?

Relational Design • So far: – how to get relational designs from E/R diagrams? – different approaches produce different relational designs – each design has advantages and disadvantages • Question: – Can we do relational designs directly from Yes, but much more difficult. applications? – Can we improve the relational designs? If yes, how?

Improving Relational Designs • Using functional dependencies: redesign of relations, remove redundancy • Multivalued

Improving Relational Designs • Using functional dependencies: redesign of relations, remove redundancy • Multivalued dependencies and integrity constraints: create good database schemas

Functional Dependencies (FD) • Definition: A functional dependency on a relation R is a

Functional Dependencies (FD) • Definition: A functional dependency on a relation R is a statement of the form “if two tuples of R agree on attributes A 1, …, An they must also agree on other attributes B 1, …, Bm. ” • Notation: A 1…An B 1 A 1…An B 2 … A 1…An Bm • Shorthand: A 1…An B 1…Bm Left hand side Right hand side A’s B’s t u FD in picture If the A’s of t equal the A’s of u then the B’s of t must equal the B’s of u

Example Movies(title, year, length, film. Type, studio. Name, star. Name) title year length studio.

Example Movies(title, year, length, film. Type, studio. Name, star. Name) title year length studio. Name star. Name film. Type Star Wars 1997 124 Fox Mark Hamill color Star Wars 1997 124 Fox Harrison Ford color Star Wars 1977 124 Fox Carrie Fisher color Mighty Ducks 1991 104 Disney Emilio Estevez color Wayne’s World 1992 95 Paramount Dana Carvey color Wayne’s World 1992 95 Paramount Mike Meyers color Some of the FDs: title year length title year studio. Name title year film. Type title year star. Name ? NO Shorthand: title year length film. Type studio. Name NOTE: shorthand is used to combine FDs with the same left side.

Keys of Relations • Definition: {A 1, …, An} is a key for a

Keys of Relations • Definition: {A 1, …, An} is a key for a relation R if – {A 1, …, An} functionally determines all other attributes of R, and UNIQUENESS – no subset of {A 1, …, An} functionally MINIMAL determines all other attributes of R. • What is the key for Movies(title, year, length, film. Type, studio. Name, star. Name) ?

Example Movies(title, year, length, film. Type, studio. Name, star. Name) title year length studio.

Example Movies(title, year, length, film. Type, studio. Name, star. Name) title year length studio. Name star. Name film. Type Star Wars 1997 124 Fox Mark Hamill color Star Wars 1997 124 Fox Harrison Ford color Star Wars 1977 124 Fox Carrie Fisher color Mighty Ducks 1991 104 Disney Emilio Estevez color Wayne’s World 1992 95 Paramount Dana Carvey color Wayne’s World 1992 95 Paramount Mike Meyers color FDs: title year length film. Type studio. Name Keys? {title, year} NO, cannot functionally determine ‘star. Name’ {title, year, star. Name} YES Notation: underlining attributes to specify the primary key in relation schema.

Superkeys • A set of attributes that contains a key is called a superkey.

Superkeys • A set of attributes that contains a key is called a superkey. – every key is itself a superkey – a superkey needs not be a key (not minimal)

Finding Keys for Relations • Relations are obtained from E/R diagrams • Key for

Finding Keys for Relations • Relations are obtained from E/R diagrams • Key for a relation R can be determined as follows: – R is obtained from an entity set E: a key of E is a key for R – from a binary relationship: • many-many from E to F: a key for R is the union of a key of E and a key of F • many-one from E to F: a key of E is a key for R • one-one from E to F: a key of E or a key of F can be used as a key for R – from a multiway relationship: situation dependent and need to be considered carefully

Claim: Example if a multiway relationship has an arrow to the entity E then

Claim: Example if a multiway relationship has an arrow to the entity E then there exists a key for the relation that excludes the key of E. Stars Movies Contracts Studio of star Producing studio Studios Contracts(title, year, name_producing_studio, name_studio_of_star) What can be a key for Contracts? {title, year, name_producing_studio, name_studio_of_star} ? No: the first four attributes functionally determines the last one!

Rules about Functional Dependencies • FDs are good for database schema design knowledge of

Rules about Functional Dependencies • FDs are good for database schema design knowledge of FDs needed • FDs determine the set of legal instances of a database schema • Rules about FDs: allow us to reason about FDs – checking whether a FD holds – construct new FDs

Example • Given R(A, B, C) with the FDs: A B and A C.

Example • Given R(A, B, C) with the FDs: A B and A C. – No instance of R can contain the two tuples (1, 3, 5) and (1, 3, 4) – No instance of R can contain two tuples with the same A – Does B C hold as well? NO

Rules about FDs • Two set of FDs S and T are equivalent if

Rules about FDs • Two set of FDs S and T are equivalent if the set of relation instances satisfying S is exactly the set of relation instances satisfying T • a set of FDs S follows from a set of FDs T if every relation instance that satisfies all the FDs in T also satisfies all the FDs in S

Rules about FDs A 1…An B 1 A 1…An B 2 … A 1…An

Rules about FDs A 1…An B 1 A 1…An B 2 … A 1…An Bm Combining Splitting A 1…An B 1…Bm

Trivial FDs • Given A 1…An B 1…Bm – trivial: if Bj {A 1…An}

Trivial FDs • Given A 1…An B 1…Bm – trivial: if Bj {A 1…An} for j=1, …, m title year – nontrivial: some Bj does not belong to {A 1…An} – completely nontrivial: none of thetitle. Byear to title length j belongs {A 1…An} title year length film. Type

Closure • The closure of {B 1…Bk} under the set of FDs S, denoted

Closure • The closure of {B 1…Bk} under the set of FDs S, denoted by {B 1…Bk}+, is defined as follows: {B 1…Bk}+ = {B | any relation satisfies S will also satisfies B 1…Bk B}

The meaning of closure • Assume that A 1, …, An B 1, …,

The meaning of closure • Assume that A 1, …, An B 1, …, Bm holds • Then, we know that A 1…An B 1 will hold • More: for every set {C 1, …, Ck} which is a subset of {B 1, …, Bm}, then A 1…An C 1, …, Ck A’s B’s t Cs u Cs If the A’s of t equal the A’s of u then the B’s of t must equal the B’s of u This will mean that the C’s of must equal the C’s of u

Computing the closure • • Given: the set S and {A 1, …, An}

Computing the closure • • Given: the set S and {A 1, …, An} Compute: {A 1, …, An}+- denote this set by X Step 1: X = {A 1, …, An} Step 2: find a FD B 1…Bk B in S such that {B 1, …, Bk} X and B X, then X=X {B} • Step 3: repeat step 2 until nothing more can be added to X, then go to step 4 • Step 4: return X

Example • • • S = {AB C, BC AD, D E, CF B}

Example • • • S = {AB C, BC AD, D E, CF B} Compute {A, B}+ Step 1: X = {A, B} Step 2: X = X {C}={A, B, C} because AB C Step 3 back to step 2: X = X {D} because BC AD • Step 3 back to step 2: X = X {E} because D E • Step 3 back to step 2: nothing more • Step 3 go to step 4: return {A, B, C, D, E}

Correctness of closure algorithm • It computes true functional dependencies – proof: show that

Correctness of closure algorithm • It computes true functional dependencies – proof: show that if B belongs to {A 1, …, An}+ then A 1…An B holds. By induction over the number of steps (n) used in adding an attribute B into the set X • n=0 then B belongs to {A 1, …, An} and so A 1…An B is a trivial functional dependency • n n+1: if B is added to X in the step n+1, then A 1…An Bj for all j by inductive hypothesis; this, together with B 1…Bk B, implies that A 1…An B • It computes all functional dependencies – proof: show that if B does not belong to {A 1, …, An}+ then A 1…An B does not hold. By constructing an instance I of the relation R such that the FD does not hold. A in the closure Others 111 … 111 000 … 000 111 … 111

Simple questions • What is {A 1, …, An}+ if {A 1, …, An}

Simple questions • What is {A 1, …, An}+ if {A 1, …, An} is a key of the relation? • Can {A 1, …, An}+ ={A 1, …, An}? • Does {B 1, …, Bm} {A 1, …, An} imply {B 1, …, Bm}+ {A 1, …, An}+?

Transitive Rules • Given • then A 1, …, An B 1, …, Bm

Transitive Rules • Given • then A 1, …, An B 1, …, Bm C 1, …, Ck A 1, …, An C 1, …, Ck

Closing sets of FDs • Given a set of FDs we can derive some

Closing sets of FDs • Given a set of FDs we can derive some other FDs using the rules about FDs (e. g. combining, splitting, and transitive) • For a relation R, a set of FD is called a basis for R if all other FDs of R can be derived form it. • A basis is minimal if none of its proper subsets is a basis.

Projecting FDs B’s • Given: R S – R with a set of FDs

Projecting FDs B’s • Given: R S – R with a set of FDs F – S (a new relation) is obtained by removing the attributes {B 1, …, Bm} from R • Questions: What are the FDs of S? • Answer: if A 1…An C 1…Ck is a FD of R and none of the Bs appears on the left or right side ({B 1, …, Bm} {A 1, …, An, C 1, …, Ck}= ) is a FD of S

Projecting - Example • Given R(A, B, C, D) with the FDs A B,

Projecting - Example • Given R(A, B, C, D) with the FDs A B, B C, and C D. • Remove the attribute B from R, we obtain a new relation S(A, C, D). • What are the FDs of S? – A C? – A D? – C D? We can compute this by: • Compute all the closure of every subset of {A, C, D} by using the FDs of R that do not contain B.

Homework 3. 5. 1 Consider a relation with schema R(A, B, C, D) and

Homework 3. 5. 1 Consider a relation with schema R(A, B, C, D) and FD’s AB C, C D, and D A. – What are all the nontrivial FD’s that follow from the given FD’s? List only the FDs with one attribute on the right? (5 pt) – What are the keys of R? (5 pt) – What are the superkeys but not keys? (5 pt) 3. 5. 3 Show that the following rule holds: if A 1…An B 1…Bm and C 1…Ck D 1…Dt hold then A 1…An. C 1…Ck B 1…Bm. D 1…Dt also holds. (5 pt)

For those whole like fun: 3. 5. 4 Does the following hold: – if

For those whole like fun: 3. 5. 4 Does the following hold: – if A B then B A – if AB C and A C then B C 3. 5. 8 A set of attributes is closed if X+=X. What are the FDs of a relation R(A, B, C, D) if – all sets of four attributes are closed – the only closed sets are {} and {A, B, C, D} – the closed sets are {}, {A, B, C, D} (note: the cases are considered separate) Stars: try the exercises with stars.

Design of Relational Database Schema title year length studio. Name star. Name film. Type

Design of Relational Database Schema title year length studio. Name star. Name film. Type Star Wars 1997 124 Fox Mark Hamill color Star Wars 1997 124 Fox Harrison Ford color Star Wars 1977 124 Fox Carrie Fisher color Mighty Ducks 1991 104 Disney Emilio Estevez color Wayne’s World 1992 95 Paramount Dana Carvey color Wayne’s World 1992 95 Paramount Mike Meyers color Some observations: • value of studio. Name is the same in several tuples S? E I L A ORM N • value of film. Type is also repeated A E HES T D I AVO E What wrong with it? W CAN • redundancy store the same value unnecessary several time • update anormalies an update might require several changes • deletion anormalies losing information if delete a value

Possible ways to avoid anormalies (Intuition) • The bad way: start again (Oh, no!)

Possible ways to avoid anormalies (Intuition) • The bad way: start again (Oh, no!) • The natural way: try to decompose the given relation into two or more relations that – contain the same information – avoid the anormalies

Example title year length studio. Name star. Name film. Type Star Wars 1997 124

Example title year length studio. Name star. Name film. Type Star Wars 1997 124 Fox Mark Hamill color Star Wars 1997 124 Fox Harrison Ford color Star Wars 1977 124 Fox Carrie Fisher color Mighty Ducks 1991 104 Disney Emilio Estevez color Wayne’s World 1992 95 Paramount Dana Carvey color Wayne’s World 1992 95 Paramount Mike Meyers color title year length studio. Name film. Type Star Wars 1977 124 Fox color Mighty Ducks 1991 104 Disney color Wayne’s World 1992 95 Paramount color title year star. Name Star Wars 1997 Mark Hamill Star Wars 1997 Harrison Ford Star Wars 1977 Carrie Fisher Mighty Ducks 1991 Emilio Estevez Wayne’s World 1992 Dana Carvey Wayne’s World 1992 Mike Meyers Movie. Studio. Star(title, year, length, studio. Name, star. Name, film. Type) is decomposed into 2 relations Movie. Studio(title, year, length, studio. Name, film. Type) and Stars. In(title, year, star. Name)

Decomposition Given a relation R with schema {A 1, …, An}. A decomposition of

Decomposition Given a relation R with schema {A 1, …, An}. A decomposition of R into two relations S and T with schemas {B 1, …, Bm} and {C 1, …, Ck}, respectively, such that 1. {A 1, …, An} = {B 1, …, Bm} {C 1, …, Ck} 2. The tuples in S are the projections onto {B 1, …, Bm} of all the tuples in R. 3. The tuples in T are the projections onto {C 1, …, Ck} of all the tuples in R.

Example – Projections title year length studio. Name star. Name film. Type Star Wars

Example – Projections title year length studio. Name star. Name film. Type Star Wars 1997 124 Fox Mark Hamill color Star Wars 1997 124 Fox Harrison Ford color Star Wars 1977 124 Fox Carrie Fisher color Mighty Ducks 1991 104 Disney Emilio Estevez color Wayne’s World 1992 95 Paramount Dana Carvey color Wayne’s World 1992 95 Paramount Mike Meyers color How do we come up with this decomposition? title year length studio. Name film. Type Star Wars 1977 124 Fox color Mighty Ducks 1991 104 Disney color Wayne’s World 1992 95 Paramount color title year star. Name Star Wars 1997 Mark Hamill Star Wars 1997 Harrison Ford Star Wars 1977 Carrie Fisher Mighty Ducks 1991 Emilio Estevez Wayne’s World 1992 Dana Carvey Wayne’s World 1992 Mike Meyers Movie. Studio. Star(title, year, length, studio. Name, star. Name, film. Type) is decomposed into 2 relations Movie. Studio(title, year, length, studio. Name, film. Type) and Stars. In(title, year, star. Name)

Boyce-Codd Normal Form (BCNF) • BCNF: a relation R is in BCNF iff: whenever

Boyce-Codd Normal Form (BCNF) • BCNF: a relation R is in BCNF iff: whenever there is a nontrivial FD A 1…An B for R, it is the case that {A 1, …, An} is a superkey for R. • Why this definition? Answer: if a relation is in BCNF then there is no anormaly. • Example: Movie. Studio. Star(title, year, length, studio. Name, star. Name, film. Type): not in BCNF Movie. Studio(title, year, length, studio. Name, film. Type): in BCNF Stars. In(title, year, star. Name): in BCNF

Decomposition into BCNF • Suppose that we decompose a relation R into two relations

Decomposition into BCNF • Suppose that we decompose a relation R into two relations S and T which are in BCNF. The requirements for S and T: – S and T is a decomposition of R – it is possible to reconstruct R from S and T • Will every decomposition of R satisfy these two conditions? • What are the FDs of the new relations?

Algorithm • Given a relation R with the attributes {A 1, …, An}. •

Algorithm • Given a relation R with the attributes {A 1, …, An}. • Step 1: For every nontrivial FD B 1…Bm B if {B 1, …, Bm} is a superkey then returns R (no decomposition is needed) • Step 2: Takes a nontrivial FD B 1…Bm B such that {B 1, …, Bm} is a superkey, then decomposes R into two relations S and T with the following schema: – S’s schema: {B 1, …, Bm}+ – T’s schema: {B 1, …, Bm} ({A 1, …, An}{B 1, …, Bm}+) • Repeat Step 1&2 for S and T until no decomposition is needed for every new relation; return the set of new relations as the result

Example The ‘new’ movie relation with the following attributes: {title, year, studio. Name, president,

Example The ‘new’ movie relation with the following attributes: {title, year, studio. Name, president, pres. Address} (we call this set ALL) with the FDs: {title year studio. Name, studio. Name president, president pres. Address} Only one key: {title, year} studio. Name president violated BCNF Step 2: takes studio. Name president, decomposes into – S with the schema {studio. Name}+={studio. Name, president, pres. Address} – T with the schema {studio. Name, title, year}={studio. Name} (ALL {studio. Name}+) Check: {studio. Name, title, year} is in BCNF (the first two FDs) {studio. Name, president, pres. Address} is not in BCNF Continue with the decomposition of S using president pres. Address and we get the following two relation schemas: {president, pres. Address} and {president, studio. Name} both are in BCNF. The final result: {studio. Name, title, year}, {president, pres. Address}, {president, studio. Name}

Recovering information from a decomposition • Suppose that R with the schema {A 1,

Recovering information from a decomposition • Suppose that R with the schema {A 1, …, An} is decomposed into two relations S and T according to the algorithm whose attributes are {B 1, …, Bm}+ and {B 1, …, Bm} ({A 1, …, An}{B 1, …, Bm}+) • The tuples of R can be obtained by joining all possible pairs of S and T where {B 1, …, Bm} have the same values.

Recovering … the rest of the closure the B’s others t’ (S) Join Projection

Recovering … the rest of the closure the B’s others t’ (S) Join Projection t (R) t’’ (T) {B 1, …, Bm}+ {B 1, …, Bm} {A 1, …, An}{B 1, …, Bm}+

Example – Decomposition and Recovering title year length studio. Name star. Name film. Type

Example – Decomposition and Recovering title year length studio. Name star. Name film. Type Star Wars 1997 124 Fox Mark Hamill color Star Wars 1997 124 Fox Harrison Ford color Star Wars 1977 124 Fox Carrie Fisher color Mighty Ducks 1991 104 Disney Emilio Estevez color Wayne’s World 1992 95 Paramount Dana Carvey color Wayne’s World 1992 95 Paramount Mike Meyers color title year length studio. Name film. Type Star Wars 1977 124 Fox color Mighty Ducks 1991 104 Disney color Wayne’s World 1992 95 Paramount color title year star. Name Star Wars 1997 Mark Hamill Star Wars 1997 Harrison Ford Star Wars 1977 Carrie Fisher Mighty Ducks 1991 Emilio Estevez Wayne’s World 1992 Dana Carvey Wayne’s World 1992 Mike Meyers Movie. Studio. Star(title, year, length, studio. Name, star. Name, film. Type) is not in BCNF is decomposed into 2 relations that are in BCNF: Movie. Studio(title, year, length, studio. Name, film. Type) and Stars. In(title, year, star. Name)

Some remarks • • The algorithm will stop and output a set of BCNF

Some remarks • • The algorithm will stop and output a set of BCNF relations. Not every decomposition according to the algorithm is good The FD’s for the new relations are determined by ‘projecting’. If a decomposition is based on FDs (according to the algorithm) then the recovering process will give us exactly the original relation. • If a decomposition is not based on FDs then we might not be able to recover the original relation from the new ones: – Example: R(A, B, C) with A B and we decompose it into S(A, B) and T(B, C): A 1 4 B 2 2 C 3 5 A 1 B 2 C 3 4 2 2 5 A 1 1 4 B 2 2 2 C 3 5 3 4 2 5

Third Normal Form (3 NF) • So far: if a relation is not in

Third Normal Form (3 NF) • So far: if a relation is not in BCNF then anormalies arise. • Given a relation Bookings with the attributes: – title: name of the movie – theater: name of theater where the movie is being shown – city: the city where theater is located (a tuple (m, t, c): represents the fact that movie m is shown at theater t in city c)

Bookings(title, theater, city) • The FDs of the relations: – theater city – title

Bookings(title, theater, city) • The FDs of the relations: – theater city – title city theater • theater city violates the BCNF condition, why? • decomposition yields: {theater, city} and {theater, title} • Consider the relations: Possible relations according to the FDs of each schema theater Guild city Menlo Park Menlo the titl ate e r Gu Ne recovering Violate the FD title city theater Guild title Net city Menlo Park Net Menlo

3 NF • A relaxation of the BCNF condition: a relation R is in

3 NF • A relaxation of the BCNF condition: a relation R is in 3 NF if: whenever there is a nontrivial FD A 1…An B, either {A 1, …, An} is a superkey or B is a member of some key. • Bookings(title, theater, city) is in 3 NF