Chapter 7 Relational Database Design Database System Concepts
Chapter 7: Relational Database Design Database System Concepts, 5 th Ed. ©Silberschatz, Korth and Sudarshan See www. db-book. com for conditions on re-use
Chapter 7: Relational Database Design n Functional Dependency Theory n Decomposition Using Functional Dependencies n Normal Forms n Boyce-Codd Normal Form n Database-Design Goals Database System Concepts - 5 th Edition, July 28, 2005. 7. 2 ©Silberschatz, Korth and Sudarshan
The Banking Schema n branch = (branch_name, branch_city, assets) n customer = (customer_id, customer_name, customer_street, customer_city) n loan = (loan_number, amount) n account = (account_number, balance) n employee = (employee_id. employee_name, telephone_number, start_date) n dependent_name = (employee_id, dname) n account_branch = (account_number, branch_name) n loan_branch = (loan_number, branch_name) n borrower = (customer_id, loan_number) n depositor = (customer_id, account_number) n cust_banker = (customer_id, employee_id, type) n works_for = (worker_employee_id, manager_employee_id) n payment = (loan_number, payment_date, payment_amount) n savings_account = (account_number, interest_rate) n checking_account = (account_number, overdraft_amount) Database System Concepts - 5 th Edition, July 28, 2005. 7. 3 ©Silberschatz, Korth and Sudarshan
Functional Dependencies n Let R be a relation schema R and R n The functional dependency holds on R if and only if for any legal relations r(R), whenever any two tuples t 1 and t 2 of r agree on the attributes , they also agree on the attributes . That is, t 1[ ] = t 2 [ ] n Example: Consider r(A, B ) with the following instance of r. 1 4 1 5 3 7 n On this instance, A B does NOT hold, but B A does hold. Database System Concepts - 5 th Edition, July 28, 2005. 7. 4 ©Silberschatz, Korth and Sudarshan
Functional Dependencies n Constraints on the set of legal relations. n Require that the value for a certain set of attributes determines uniquely the value for another set of attributes. n A functional dependency is a generalization of the notion of a key. n Ex) n A B C D a 1 b 1 c 1 d 1 a 1 b 2 c 1 d 2 a 2 b 2 c 2 d 2 a 2 b 3 c 2 d 3 a 3 b 3 c 2 d 4 A C C A : not satisfied AB D Trivial functional dependency if it is satisfied by all relations. Ex) A → A, AB → A In general, α → β is trivial if β ⊆ α. Database System Concepts - 5 th Edition, July 28, 2005. 7. 5 ©Silberschatz, Korth and Sudarshan
Functional Dependencies n K is a superkey for relation schema R if and only if K R n K is a candidate key for R if and only if l K R, and l for no K, R n Functional dependencies allow us to express constraints that cannot be expressed using superkeys. Consider the schema: bor_loan = (customer_id, loan_number, amount ). We expect this functional dependency to hold: loan_number amount but would not expect the following to hold: amount customer_id Database System Concepts - 5 th Edition, July 28, 2005. 7. 6 ©Silberschatz, Korth and Sudarshan
Use of Functional Dependencies n We use functional dependencies to: l test relations to see if they are legal under a given set of functional dependencies. 4 l If a relation r is legal under a set F of functional dependencies, we say that r satisfies F. specify constraints on the set of legal relations 4 We say that F holds on R if all legal relations on R satisfy the set of functional dependencies F. Database System Concepts - 5 th Edition, July 28, 2005. 7. 7 ©Silberschatz, Korth and Sudarshan
Functional Dependencies n A functional dependency is trivial if it is satisfied by all instances of a relation l l Example: 4 customer_name, loan_number customer_name 4 customer_name In general, is trivial if Database System Concepts - 5 th Edition, July 28, 2005. 7. 8 ©Silberschatz, Korth and Sudarshan
Closure of a Set of Functional Dependencies n Given a set F set of functional dependencies, there are certain other functional dependencies that are logically implied by F. l For example: If A B and B C, then we can infer that A C n The set of all functional dependencies logically implied by F is the closure of F. n We denote the closure of F by F+. n F+ is a superset of F. Database System Concepts - 5 th Edition, July 28, 2005. 7. 9 ©Silberschatz, Korth and Sudarshan
Functional-Dependency Theory n We now consider the formal theory that tells us which functional dependencies are implied logically by a given set of functional dependencies. Database System Concepts - 5 th Edition, July 28, 2005. 7. 10 ©Silberschatz, Korth and Sudarshan
Closure of a Set of Functional Dependencies n Given a set F set of functional dependencies, there are certain other functional dependencies that are logically implied by F. l For example: If A B and B C, then we can infer that A C n The set of all functional dependencies logically implied by F is the closure of F. n We denote the closure of F by F+. n We can find all of F+ by applying Armstrong’s Axioms: l if , then (reflexivity) l if , then (augmentation) l if , and , then (transitivity) n These rules are l sound (generate only functional dependencies that actually hold) and l complete (generate all functional dependencies that hold). Database System Concepts - 5 th Edition, July 28, 2005. 7. 11 ©Silberschatz, Korth and Sudarshan
Example n R = (A, B, C, G, H, I) F={ A B A C CG H CG I B H} n some members of F+ l A H 4 by l AG I 4 by l transitivity from A B and B H augmenting A C with G, to get AG CG and then transitivity with CG I CG HI 4 by augmenting CG I to infer CG CGI, and augmenting of CG H to infer CGI HI, and then transitivity Database System Concepts - 5 th Edition, July 28, 2005. 7. 12 ©Silberschatz, Korth and Sudarshan
Closure of Functional Dependencies n We can further simplify manual computation of F+ by using the following additional rules. l If holds and holds, then holds (union) l If holds, then holds and holds (decomposition) l If holds and holds, then holds (pseudotransitivity) The above rules can be inferred from Armstrong’s axioms. Database System Concepts - 5 th Edition, July 28, 2005. 7. 13 ©Silberschatz, Korth and Sudarshan
Closure of Functional Dependencies n EX) R = (A, B, C, G, H, I ) F = {A B, A C, CG H, CG I, B H} n Some members of F+: - A H (by transitivity rule) - AG I (by pseudotransitivity rule) - CG HI (by union rule) Database System Concepts - 5 th Edition, July 28, 2005. 7. 14 ©Silberschatz, Korth and Sudarshan
Combine Schemas? n Suppose we combine borrower and loan to get bor_loan = (customer_id, loan_number, amount ) n Result is possible repetition of information (L-100 in example below) n Amount: repetition; Bad!! Database System Concepts - 5 th Edition, July 28, 2005. 7. 15 ©Silberschatz, Korth and Sudarshan
A Combined Schema Without Repetition n Consider combining loan_branch and loan_amt_br = (loan_number, amount, branch_name) n No repetition (as suggested by example below): Good!! Database System Concepts - 5 th Edition, July 28, 2005. 7. 16 ©Silberschatz, Korth and Sudarshan
What About Smaller Schemas? n Suppose we had started with bor_loan. How would we know to split up (decompose) it into borrower and loan? n Write a rule “if there were a schema (loan_number, amount), then loan_number would be a candidate key” n Denote as a functional dependency: loan_number amount n In bor_loan, because loan_number is not a candidate key, the amount of a loan may have to be repeated. This indicates the need to decompose bor_loan. n Not all decompositions are good. Suppose we decompose employee into employee 1 = (employee_id, employee_name) employee 2 = (employee_name, telephone_number, start_date) n The next slide shows how we lose information -- we cannot reconstruct the original employee relation -- and so, this is a lossy decomposition. Database System Concepts - 5 th Edition, July 28, 2005. 7. 17 ©Silberschatz, Korth and Sudarshan
A Lossy Decomposition Database System Concepts - 5 th Edition, July 28, 2005. 7. 18 ©Silberschatz, Korth and Sudarshan
Decomposition n Let R be a relation schema. A set of relation schemas {R 1, R 2, …Rn} is a decomposition of R if R = R 1 R 2 … Rn n Let r be a relation an schema R, and let ri = R (r) for r=1~n. That is, i {r 1, r 2, …rn} is the database that results from decompositing R into {R 1, R 2, …Rn}. n It is always the case that In general, r r 1 r 2 … rn n A relation is legal if it satisfies all rules, or constraints, that we impose an our DB. Database System Concepts - 5 th Edition, July 28, 2005. 7. 19 ©Silberschatz, Korth and Sudarshan
Decomposition n Let C represent a set of constraints on database. A decomposition {R 1, R 2, …Rn} of a relation schema R is a lossless-join decomposition for R if, for all relations r on schema R that are legal under C, r = R 1(r) Database System Concepts - 5 th Edition, July 28, 2005. R 2(r) 7. 20 … Rn(r) ©Silberschatz, Korth and Sudarshan
Lossless-join Decomposition n For the case of R = (R 1, R 2), we require that for all possible relations r on schema R r = R 1 (r ) R 2 (r ) n A decomposition of R into R 1 and R 2 is lossless join if and only if at least one of the following dependencies is in F+: l R 1 R 2 R 1 l R 1 R 2 Database System Concepts - 5 th Edition, July 28, 2005. 7. 21 ©Silberschatz, Korth and Sudarshan
Example n R = (A, B, C) F = {A B, B C) l Can be decomposed in two different ways n R 1 = (A, B), R 2 = (B, C) l Lossless-join decomposition: R 1 R 2 = {B} and B BC l Dependency preserving n R 1 = (A, B), R 2 = (A, C) l Lossless-join decomposition: R 1 R 2 = {A} and A AB l Not dependency preserving (cannot check B C without computing R 1 Database System Concepts - 5 th Edition, July 28, 2005. 7. 22 R 2 ) ©Silberschatz, Korth and Sudarshan
정규화(Normalization) n 정규화(Normalization) - 데이터베이스의 릴레이션이나 튜플을 불필요한 중복 없이, 또한 수정 할 때 의도하지 않았던 불필요한 사항이 추가, 삭제, 변경되는 일이 없 도록 재구성하는 과정을 공식화하는 방법. n 정규형(Normal Form) - Codd's Normalization definitions l 1 NF -> 2 NF -> 3 NF(BCNF(Boyce/Codd) l 4 NF 5 NF(PJ/NF) Database System Concepts - 5 th Edition, July 28, 2005. 7. 23 ©Silberschatz, Korth and Sudarshan
정규형(Normal Form)의 예 Universe of relations ( normalized and unnormalized ) 1 NF relations ( normalized relations ) 2 NF relations 3 NF relations BCNF 4 NF relations PJ/NF ( 5 NF) relations Database System Concepts - 5 th Edition, July 28, 2005. 7. 24 ©Silberschatz, Korth and Sudarshan
Fully Functional Dependency(FFD) n Fully Functional Dependency(FFD) - If and only if Attr. Y is functional dependent on Attr. X and is not functional dependent on any proper subset of X. (그림 3참조) Database System Concepts - 5 th Edition, July 28, 2005. 7. 25 ©Silberschatz, Korth and Sudarshan
Tables Database System Concepts - 5 th Edition, July 28, 2005. 7. 26 ©Silberschatz, Korth and Sudarshan
그림 3: 릴레이션 S, P, SP와 그들의 functional dependences PNAME COLOR S# WEIGHT P# QTY P# CITY S# STATUS SNAME CITY Database System Concepts - 5 th Edition, July 28, 2005. 7. 27 ©Silberschatz, Korth and Sudarshan
제 1, 2, 3 정규형(Normal Forms) n The 1 ST Normal Form(1 ST NF) - if and only if all underlying domains contain atomic values only (any normalized relation) S# STATUS QTY P# CITY 그림 4: FIRST 릴레이션과 Functional Dependences Database System Concepts - 5 th Edition, July 28, 2005. 7. 28 ©Silberschatz, Korth and Sudarshan
1, 2, 3 Normal Forms n 그림 4의 FIRST( S#, STATUS, CITY, P#, QTY ) - Redundancy(중복): 모든 attributes in FIRST → redundancies => update anomalies 발생 n Anomalies의 예 - Insert: (S 5, Athens) insert시 primary key가 존재하지 않음. insert 불가능 - Delete: delete a tuple with (S 3, P 2) → (S 3, 10, Paris) also lost - Update: S 1 move from London to Amsterdam => find every tuples related with S 1 and change London to Amsterdam. → redundancies => update anomalies 발생 Database System Concepts - 5 th Edition, July 28, 2005. 7. 29 ©Silberschatz, Korth and Sudarshan
1, 2, 3 Normal Forms n Solution of the above anomalies - FIRST -> SECOND( S#, STATUS, CITY)와 SP(S#, P#, QTY)로 분리: 그림 5. - Insert: Primary Key S# 통해 (S 5, Athens)의 insert가능 - Delete: (S 3, P 2, 200) 삭제, (S 3, 10, Paris) 는 SECOND내 존재 - Update : (S 1, London->Amsterdam) 가능 => 따라서 그림 5는 : overcomes all the above problems : non-fully functional dependencies를 제거 Database System Concepts - 5 th Edition, July 28, 2005. 7. 30 ©Silberschatz, Korth and Sudarshan
1, 2, 3 Normal Forms STATUS S# CITY a) S# QTY P# b) 그림 5: SECOND, SP릴레이션과 그들의 Functional Dependences Database System Concepts - 5 th Edition, July 28, 2005. 7. 31 ©Silberschatz, Korth and Sudarshan
1, 2, 3 Normal Forms n The 2 ND Normal Form - 2 ND NF : if and only if 1 ST NF, every nonkey attribute is fully dependent on Primary key - SECOND: primary key ( S#)와 SP: primary key (S#, P#)는 모두 2 NF이다. - FIRST : 1 NF, not 2 NF → An equivalent collection of 2 NF relation (by suitable projection) → Reduction processing: by projection - SECOND, SP는 문제가 아직도 발생함 Database System Concepts - 5 th Edition, July 28, 2005. 7. 32 ©Silberschatz, Korth and Sudarshan
1, 2, 3 Normal Forms STATUS S# CITY 그림 6: SECOND릴레이션과 Transitive dependency S#와 STATUS는 FD이지만, S#와 STATUS는 CITY를 통하여 Transitive Dependency이다. - Anomalies in a relation SECOND의 예 (caused by dependence transitive ) → Insert ( null, 50, ROME ): primary Key ? → Delete ( S 5, Athens ) : S#를 Athens에대한 status 정보(30)까지 손실됨 Database System Concepts - 5 th Edition, July 28, 2005. 7. 33 ©Silberschatz, Korth and Sudarshan
1, 2, 3 Normal Forms → Update : Update the status for London from 20 to 30 => inconsistent result 초래 n Solution of anomalies - SECOND → SC(S#, CITY) → CS(CITY, STATUS) - overcome all above problems - eliminate the transitive dependence 그림 7: SC와 CS 릴레이션들 Database System Concepts - 5 th Edition, July 28, 2005. 7. 34 ©Silberschatz, Korth and Sudarshan
1, 2, 3 Normal Forms n The 3 RD Normal Form - 3 RD NF : if and only if 2 NDNF, every nonkey attributes is nontransitively dependent on the primary key. - SC, CS: 3 NF ( S# 와 CITY : primary keys ) SECOND: 3 NF 아님. Database System Concepts - 5 th Edition, July 28, 2005. 7. 35 ©Silberschatz, Korth and Sudarshan
Goal — Devise a Theory for the Following n Decide whether a particular relation R is in “good” form. n In the case that a relation R is not in “good” form, decompose it into a set of relations {R 1, R 2, . . . , Rn} such that l each relation is in good form l the decomposition is a lossless-join decomposition n Our theory is based on: l functional dependencies Database System Concepts - 5 th Edition, July 28, 2005. 7. 36 ©Silberschatz, Korth and Sudarshan
Dependency Preservation n Let F be a set of functional dependency on R and let R 1, R 2, …Rn be a decomposition of R. The restriction of F to Ri is the set Fi of all functional dependency in F+ that include only attributes of Ri n Ask whether testing only the restrictions is sufficient? n Let F = F 1 F 2 … Fn In general, F F However, even if F F , it may be that F+ = F + n We say that a decomposition having the property F+ = F + is a dependency preserving decomposition Database System Concepts - 5 th Edition, July 28, 2005. 7. 37 ©Silberschatz, Korth and Sudarshan
Boyce-Codd Normal Form n A relation schema R is in BCNF if for all F+, at least one of the following holds: is a trivial functional dependency(that is, ) is a superkey for R n Ex) customer(c name, c street, c city) c name c street c city BCNF Branch(b name, b city, assets) b name b city assets Loan info(b name , c name, l no, amount) l no amount b name Database System Concepts - 5 th Edition, July 28, 2005. 7. 38 not BCNF ©Silberschatz, Korth and Sudarshan
Boyce-Codd Normal Form Loan(b name, l no, amount) Borrower(c name, l no) l no amount b name 4 It is a lossless-join decomposition R 1 R 2 = {l-no} b nameamount F l no b namel no amount F+ 4 It is a BCNF l no: candidate key for Loan Database System Concepts - 5 th Edition, July 28, 2005. 7. 39 ©Silberschatz, Korth and Sudarshan
Boyce-Codd Normal Form n Not every BCNF decomposition is dependency preserving n Counter-example: Banker= (branch name, c name, banker name) banker name branch namec name banker name - Banker is not in BCNF(∵banker name is not a superkey) - BCNF decomposition: Banker branch = (banker name, branch name) Customer banker = (c name, banker name) It is lossless-join decomposition Database System Concepts - 5 th Edition, July 28, 2005. 7. 40 ©Silberschatz, Korth and Sudarshan
Boyce-Codd Normal Form But not dependency preserving Fbanker branch = {banker name branch name} Fcutomer banker = F = Fbanker branch Fcutomer banker F = {banker name branch name, branch namec name banker name} ∴ F+ F+ , so not dependency preserving Database System Concepts - 5 th Edition, July 28, 2005. 7. 41 ©Silberschatz, Korth and Sudarshan
Boyce-Codd Normal Form Example schema not in BCNF: bor_loan = ( customer_id, loan_number, amount ) because loan_number amount holds on bor_loan but loan_number is not a superkey Database System Concepts - 5 th Edition, July 28, 2005. 7. 42 ©Silberschatz, Korth and Sudarshan
Decomposing a Schema into BCNF n Suppose we have a schema R and a non-trivial dependency causes a violation of BCNF. We decompose R into: ( U ) • • (R-( - )) n In our example, l = loan_number l = amount and bor_loan is replaced by l ( U ) = ( loan_number, amount ) l ( R - ( - ) ) = ( customer_id, loan_number ) Database System Concepts - 5 th Edition, July 28, 2005. 7. 43 ©Silberschatz, Korth and Sudarshan
Goals of Normalization n Let R be a relation scheme with a set F of functional dependencies. n Decide whether a relation scheme R is in “good” form. n In the case that a relation scheme R is not in “good” form, decompose it into a set of relation scheme {R 1, R 2, . . . , Rn} such that l each relation scheme is in good form l the decomposition is a lossless-join decomposition l Preferably, the decomposition should be dependency preserving. Database System Concepts - 5 th Edition, July 28, 2005. 7. 44 ©Silberschatz, Korth and Sudarshan
Example n R = (A, B, C ) F = {A B B C} Key = {A} n R is not in BCNF (B C but B is not superkey) n Decomposition R 1 = (A, B), R 2 = (B, C) l R 1 and R 2 in BCNF l Lossless-join decomposition l Dependency preserving Database System Concepts - 5 th Edition, July 28, 2005. 7. 45 ©Silberschatz, Korth and Sudarshan
Comparison of BCNF and 3 NF n It is always possible to decompose a relation into a set of relations that are in 3 NF such that: l the decomposition is lossless l the dependencies are preserved n It is always possible to decompose a relation into a set of relations that are in BCNF such that: l the decomposition is lossless l it may not be possible to preserve dependencies. Database System Concepts - 5 th Edition, July 28, 2005. 7. 46 ©Silberschatz, Korth and Sudarshan
Design Goals n Goal for a relational database design is: l BCNF. l Lossless join. l Dependency preservation. n If we cannot achieve this, we accept l 3 NF. l Lossless join. l Dependency preservation. Database System Concepts - 5 th Edition, July 28, 2005. 7. 47 ©Silberschatz, Korth and Sudarshan
End of Chapter 7 Database System Concepts, 5 th Ed. ©Silberschatz, Korth and Sudarshan See www. db-book. com for conditions on re-use
- Slides: 48