7 Relational database design 7 1 First normal

7 -Relational database design

7. 1 First normal form An attribute is considered atomic (or simple) if it does not contain any meaningful smaller components. A set of name is an example of non atomic value As another example consider an attribute for product price, prod. Price. A sample value for prod. Price is $21. 03. Of course, one could decompose prod. Price into two attributes where one attribute represents the dollar component (21), and the other attribute represents the cents component (03), but our assumption here is that such a decomposition is not meaningful to the intended application or system that will make use of it. So we would consider prod. Price to be atomic because it cannot be usefully decomposed into meaningful components. A

In the 1 nf all the attributes in a relation must have atomic domains. The values in an atomic domain are indivisible units. We re-arrange the relation (table) as below, to convert it to First Normal Form. Each attribute must contain only a single value from its pre -defined domain

7. 2 pitfalls in relational design • • Repetition of information Inability to represent certain information

7. 3 functional dependencies Functional dependency is a relationship that exists when one attribute uniquely determines another attribute. If R is a relation with attributes X and Y, a functional dependency between the attributes is represented as X->Y, which specifies Y is functionally dependent on X. Here X is a determinant set and Y is a dependent attribute. Each value of X is associated precisely with one Y value. Functional dependency in a database serves as a constraint between two sets of attributes. Defining functional dependency is an important part of relational database design and contributes to aspect normalization.

For example: Suppose we have a student table with attributes: Stu_Id, Stu_Name, Stu_Age. Here Stu_Id attribute uniquely identifies the Stu_Name attribute of student table because if we know the student id we can tell the student name associated with it. This is known as functional dependency and can be written as Stu_Id->Stu_Name or in words we can say Stu_Name is functionally dependent on Stu_Id. The dependency of an attribute on a set of attributes is known as trivial functional dependency if the set of attributes includes that attribute. A ->B is trivial functional dependency if B is a subset of A.

$Consider a table with two columns Student_id and Student_Name. {Student_Id, Student_Name} -> Student_Id$

Consider a table with two columns Student_id and Student_Name. {Student_Id, Student_Name} -> Student_Id is a trivial functional dependency as Student_Id is a subset of {Student_Id, Student_Name}. That makes sense because if we know the values of Student_Id and Student_Name then the value of Student_Id can be uniquely determined. If a functional dependency X->Y holds true where Y is not a subset of X then this dependency is called non trivial Functional dependency. For example: An employee table with three attributes: emp_id, emp_name, emp_address. The following functional dependencies are non-trivial: emp_id -> emp_name (emp_name is not a subset of emp_id) emp_id -> emp_address (emp_address is not a subset of emp_id)

Consider the example, A->c is satisfied. the functional dependency C>A is not satisfied. AB->D is also satisfied

7. 3. 2 Closure of a set of functional dependencies It is not sufficient to consider the given set of functional dependencies. rather we need to consider all the functional dependencies that hold. we shall see that given a set F of functional dependencies, we can prove that certain other functional dependencies hold. wecan say that such functional dependencies are logically implied by F. Example , R(A, B, C, G, H, I) and the set of functional dependencies are A->B A->C CG->H CG->I B->H The functional dependency A->H is logically implied.

Let F be a set of functional dependencies. The closure of F denoted by F+ Armstrong’s axioms Rule 1 Reflexivity If A is a set of attributes and B is a subset of A, then A holds B. { A → B } Rule 2 Augmentation If A hold B and C is a set of attributes, then AC holds BC. {AC → BC} It means that attribute in dependencies does not change the basic dependencies. Rule 3 Transitivity If A holds B and B holds C, then A holds C. If {A → B} and {B → C}, then {A → C} A holds B {A → B} means that A functionally determines B.

Rule 4 Union If A holds B and A holds C, then A holds BC. If{A → B} and {A → C}, then {A → BC} Rule 5 Decomposition If A holds BC and A holds B, then A holds C. If{A → BC} and {A → B}, then {A → C} Rule 6 Pseudo Transitivity If A holds B and BC holds D, then AC holds D. If{A → B} and {BC → D}, then {AC → D} Example R=(A, B, C, G, H, I) and the set of functional dependencies are A->B A->C CG->H CG->I B->H List the functional dependencies using armstrong’s axioms.

A->H (A->B, B->H transitivity rule) CG->HI (CG->H, CG->I union rule) AG->I (A->C, CG->I pseudotransitivity rule) AG->H Example 2) Consider relation E = (P, Q, R, S, T, U) having set of Functional Dependencies (FD). P→Q P→R QR → S Q→T QR → U PR → U Answer: 1. P → T 2. PR → S 3. QR → SU 4. PR → SU

1. P → T In the above FD set, P → Q and Q → T So, Using Transitive Rule: If {A → B} and {B → C}, then {A → C} ∴ If P → Q and Q → T, then P → T. P→T 2. PR → S In the above FD set, P → Q As, QR → S So, Using Pseudo Transitivity Rule: If{A → B} and {BC → D}, then {AC → D} ∴ If P → Q and QR → S, then PR → S 3. QR → SU In above FD set, QR → S and QR → U So, Using Union Rule: If{A → B} and {A → C}, then {A → BC} ∴ If QR → S and QR → U, then QR → SU

4. PR → SU So, Using Pseudo Transitivity Rule: If{A → B} and {BC → D}, then {AC → D} ∴ If PR → S and PR → U, then PR → SU

7. 3. 3 closure of attribute sets To test whether a set of S is super key, we must advise an algorithm for computing the set of functionally determined by S. One way to do this is to compute F+ take all the functional dependencies with S as the left-hand side take the union of the right hand side of all the dependencies but it is expensive since F+ can be large So its better to find closure of set of attributes. Let S be a set of attribute, we find closure of S under F. denote it by S+ The input is set of F functional dependencies and set S of attribute the output is stored in the variable result.

Compute AG+=(A, G, B, C, H, I) Assume a relation R=(a, b, c) with the fd (a->b, b->c)find the closure of a+ A+=(a, b, c) Find the attribute closures of given FDs R(ABCDE) = {AB->C, B ->D, C->E, D->A} find B+ Ans= {B, D, A, C, E}

Finding Candidate key: 1)R(A, B, C, D, E, F, G, H) AB->C A->DE B->F F->GH Candidate key is AB 2)R(A, B, C, D, E, F, G, H) AB->C BD->EF AD->G A->H Candidate key is ABD

3)R(ABCDE) BC->ADE D->B Step 1: ans(C) Make combination with c Step 2: Ac, bc, ce, cd, cae Ans=(BC, CD) 4)R(A, B, C, D, E) AB->cd D->A B->DE //bc->de Step 1: ans(B) Step: 2 Ab, bc, bd, be Ans=(AB, BC, BD)

5) R(w, x, y, z) Z->W Y->XZ WX->Y Step 1=(closure of w, x, y, z) Step 2=(combination with wx, xz, wz Ans(y, wx, xz)

2 nd Normal Form Example R(A, B, C, D) AB->D B->c AB is candidate key Prime attribute: An attribute, which is a part of the candidate-key, is known as a prime attribute. Non prime attribute: An attribute, which is not a part of candidate-key, is said to be a non-prime attribute.

Prime attributes must be dependent on candidate key(Prime attributes) Here In our example d is dependent on both A and B but c is only dependent on B that is called partial dependency To be in 2 nd normal form it must be in 1 st normal form And also there must be not partial dependency. Covert this in 2 nd normal form Table R 1=(A, B, D) Table R 2=(B, C)

Example We see here in Student_Project relation that the prime key attributes are Stu_ID and Proj_ID. According to the rule, non-key attributes, i. e. Stu_Name and Proj_Name must be dependent upon both and not on any of the prime key attribute individually. But we find that Stu_Name can be identified by Stu_ID and Proj_Name can be identified by Proj_ID independently. This is called partial dependency, which is not allowed in Second Normal Form.

3 rd normal form For a relation to be in Third Normal Form, it must be in Second Normal form and the following must satisfy − No non-prime attribute is transitively dependent on prime key attribute. For any non-trivial functional dependency, X → A, then either ◦ X is a superkey or, ◦ A is prime attribute. ◦ ◦ ◦ Example R=(A, B, C, D) AB->C C->D Here AB is candidate key

Here D is dependent on C, both are non prime attributes so this is not partial dependency so table is already in 2 nd Nf Non prime attribute find another non prime attribute that is case of transitive dependency Table R 1(A, B, C) Table R 2(C, D)(Here c is key)

We find that in the above Student_detail relation, Stu_ID is the key and only prime key attribute. We find that City can be identified by Stu_ID as well as Zip itself. Neither Zip is a superkey nor is City a prime attribute. Additionally, Stu_ID → Zip → City, so there exists transitive dependency. To bring this relation into third normal form, we break the relation into two relations as follows −

Boyce-codd Normal Form It is an advance version of 3 NF. BCNF is stricter than 3 NF. A table complies with BCNF if it is in 3 NF and for every functional dependency X->Y, X should be the super key of the table. R(A, B, C) AB->c C->B Candidate key AB, AC Table is already in 2 nd nf because c is dependent on both AB and in case of c->b Both are prime attributes Relation is also in 3 rd nf because all are prime attribute.

Suppose there is a company wherein employees work in more than one department. They store the data like this: Functional dependencies in the table above: emp_id -> emp_nationality emp_dept -> {dept_type, dept_no_of_emp}

$Candidate key: {emp_id, emp_dept} The table is not in BCNF as neither emp_id$

Candidate key: {emp_id, emp_dept} The table is not in BCNF as neither emp_id nor emp_dept alone are keys. To make the table comply with BCNF we can break the table in three tables like this: emp_nationality table:

emp_dept_mapping table: Candidate keys: For first table: emp_id For second table: emp_dept For third table: {emp_id, emp_dept} This is now in BCNF as in both the functional dependencies left side part is a key.

Canonical cover Suppose that we have set of functional dependencies on f on a relation schema when ever a user performs an update on relation , the database system must ensure that the update does not violate any fuctional dependencies that is all the functional dependency in F are satisfied in the new database state. The system must roll back the update if it violate any functional dependency in the set F In canonical cover we find the redudent sets of functional dependency and remove it because their presence or absent does not make any sense to the database.

Ex. R(w, x, y, z) X->W wz->xy Y->wxz If a->b there is 3 cases of redundancy 1)on the left hand side 2)on right hand side 3)entire functional dependency may be redudedant Step 1)apply decomposing rule X->w Wz->x Wz->y Y->w Y->x Y->z

Now right hand side redundancy is not possible just because we have decompose it. Step-2 closure of left side X+=(x, w) Again compute x closure by ignoring 1 st dependency if both are same dependency is redundant x+=(x) Wz+=(w, z, x, y) wz+=(w, z, y, x) Both closure are same so we remove wz->x Compute by ignoring Wz->y Wz+=(w, z) Y+=(y, w, x, z) y+=(x, z, y, w)(remove Y->w) Y+=(z, y); Y+=(yxw);

Finally we get X->w Wz->y Y->x Y->z Check for wz Wz+(w, z, y, x) W+=(w) z+=(Z) Ans is x->w, wz->y, y->zx

Decomposition A relation r can be decomposed into a collection of relation schemes to eliminate some of the anamolies in original relation R. 1)lossless join decomposition: Let r is a relation and has a set of fds ‘f’ over R. The decomposition of R into r 1 and r 2 lossless if natural join is equals to R 2)lossy decomposition: Contain extra tuples.

Dependency preserving decomposition The decomposition of relation r with fd , f into r 1 and r 2 with fds f 1 and f 2 is said to be dependency preserving if F+=f 1 U f 2+ Example R(A, B, C) FD=(A->B, B->C) Decoposition of r r 1(A, C) r 2=(B, C) Step 1 Compute closure of attribute A+=(a, b, c) B+=(b, c)

So functional dependency can be hold in r 1=a->c in r 2=b->c Compute f+ in r 1 A->A A->C C->C AC->AC For r 2 B->B B->C C->C BC->BC Find non trivial functional dependency A->c Both are not equals to (A->B, B->C) so it is not dependency preserving

R(A, B, C) R 1=(A, B) R 2=(B, C) FD=A->B B->C Ans is dependency presreving