Database Management Systems Chapter 3 Data Normalization Jerry
Database Management Systems Chapter 3 Data Normalization Jerry Post Copyright © 1998 1
D A T A B A S E Why Normalization? ² Need standardized data definition ª Advantages of DBMS require careful design ª Define data correctly and the rest is much easier ª It especially makes it easier to expand database later ª Method applies to most models and most DBMS ² Similar to Entity-Relationship ² Similar to Objects (without inheritance and methods) ² Goal: Define tables carefully ª Save space ª Minimize redundancy ª Protect data 2
D A T A B A S E Definitions ² Relational database: A collection of tables. ² Table: A collection of columns (attributes) describing an entity. Individual objects are stored as rows of data in the table. ² Property (attribute): a characteristic or descriptor of a class or entity. ² Every table has a primary key. ª The smallest set of columns that uniquely identifies any row ª Primary keys can span more than one column (concatenated keys) ª We often create a primary key to insure uniqueness (e. g. , Customer. ID, Product#, . . . ) called a surrogate key. Primary key Rows/Objects Employee. ID Taxpayer. ID 12512 888 -22 -5552 15293 222 -55 -3737 22343 293 -87 -4343 29387 837 -36 -2933 Properties Class: Employee Last. Name Cartom Venetiaan Johnson Stenheim First. Name Abdul Roland John Susan Home. Phone (603) 323 -9893 (804) 888 -6667 (703) 222 -9384 (410) 330 -9837 Address 252 South Street 937 Paramaribo Lane 234 Main Street 8934 W. Maple 3
D A T A B A S E Keys ² Primary key ª Every table (object) must have a primary key ª Uniquely identifies a row (one-to-one) ² Concatenated (or composite) key ª Multiple columns needed for primary key ª Identify repeating relationships (1 : M or M : N) ² Key columns are underlined ² First step ª Collect user documents ª Identify possible keys: unique or repeating relationships 4
D A T A B A S E Notation Table columns Table name Customer(Customer. ID, Phone, Name, Address, City, State, Zip. Code) Primary key is underlined Customer. ID 1 2 3 4 5 6 7 8 9 10 Phone 502 -666 -7777 502 -888 -6464 502 -777 -7575 502 -333 -9494 502 -4746 616 -373 -4746 615 -888 -4474 615 -452 -1162 502 -222 -4351 502 -444 -2512 Last. Name First. Name Address City State Zipcode Johnson Martha Smith Jack Washington Elroy Adams Samuel Rabitz Victor Steinmetz Susan Lasater Les Jones Charlie Chavez Juan Rojo Maria Alvaton Bowling Green Smith’s Grove Alvaton Bowling Green Portland Castalian Springs Caneyville Cave City KY KY KY TN TN TN KY KY 125 Main Street 873 Elm Street 95 Easy Street 746 Brown Drive 645 White Avenue 15 Speedway Drive 67 S. Ray Drive 867 Lakeside Drive 673 Industry Blvd. 88 Main Street 42122 42101 42171 42122 42102 37148 37031 42721 42127 5
D A T A B A S E Identifying Key Columns Order. ID 8367 8368 Date 5 -5 -97 5 -6 -97 Customer 6794 9263 Each order has only one customer. So Customer is not part of the key. Order. Items Order. ID 8367 8368 Item 229 253 876 555 229 Quantity 2 4 1 Each order has many items. Each item can appear on many orders. So Order. ID and Item are both part of the key. 6
D A T A B A S E Common Order System Customer 1 Salesperson 1 * Order 1 * * Order. Item 1 * Item Customer(Customer. ID, Name, Address, City, Phone) Salesperson(Employee. ID, Name, Commission, Date. Hired) Order(Order. ID, Order. Date, Customer. ID, Employee. ID) Order. Item(Order. ID, Item. ID, Quantity) Item(Item. ID, Description, List. Price) 7
D A T A B A S E Client Billing Example Client Billing Client(Client. ID, Name, Address, Business. Type) Partner(Partner. ID, Name, Speciality, Office, Phone) Partner. Assignment(Partner. ID, Client. ID, Date. Acquired) Billing(Client. ID, Partner. ID, Date/Time, Item, Description, Hours, Amount. Billed) Each partner can be assigned many clients. Each client can be assigned to many partners. 8
D A T A B A S E Client Billing--Different Rules Client(Client. ID, Name, Address, Business. Type) Partner(Partner. ID, Name, Speciality, Office, Phone) combine Partner. Assignment(Partner. ID, Client. ID, Date. Acquired) Billing(Client. ID, Partner. ID, Date/Time, Item, Description, Hours, Amount. Billed) Each client is assigned to only one partner. Cannot key Partner. ID. Combine Client and Partner. Assignment tables, since they have the same key. 9
D A T A B A S E Client Billing--New Assumptions Billing Client. ID 115 295 115 Partner. ID 963 967 963 Date/Time 8 -4 -97 10: 03 8 -5 -97 11: 15 8 -8 -97 09: 30 Item 967 754 967 Description Stress analysis New Design Stress analysis Hours 2 3 2. 5 Amount. Billed $500 $750 $650 More realistic assumptions for a large firm. Each Partner may work with many clients. Each client may work with many partners. Each partner and client may work together many times. The identifying feature is the date/time of the service. What happens if you do not include Date/Time as a key? 10
D A T A B A S E Sample: Video Database Possible Keys Repeating section 11
D A T A B A S E Initial Objects ² Customers ª Key: Assign a Customer. ID ª Sample Properties © Name © Address © Phone ² Videos ª Key: Assign a Movie. ID ª Sample Properties © Title © Rental. Price © Rating © Description ² Rental. Transaction ª Event/Relationship ª Key: Assign Transaction. ID ª Sample Properties © Customer. ID © Date ² Videos. Rented ª Event/Repeating list ª Keys: Transaction. ID + Movie. ID ª Sample Properties © Video. Copy# 12
D A T A B A S E Initial Form Evaluation Rental. Form(Trans. ID, Rent. Date, Customer. ID, Phone, Name, Address, City, State, Zip. Code, (Video. ID, Copy#, Title, Rent ) ) ² ² ² Collect forms from users Write down properties Find repeating groups (. . . ) Look for potential keys: key Identify computed values Notation makes it easier to identify and solve problems ² Results equivalent to diagrams, but will fit on one or two pages 13
D A T A B A S E Problems with Repeating Sections Rental. Form(Trans. ID, Rent. Date, Customer. ID, Phone, Name, Address, City, State, Zip. Code, (Video. ID, Copy#, Title, Rent ) ) Storing data in this raw form would not work very well. For example, repeating sections will cause problems. Repeating Section Note the duplication of data. Also, what if a customer has not yet checked out a movie--where do we store that customer’s data? Trans. ID 1 1 2 2 2 3 3 3 4 4 Rent. Date 4/18/95 4/30/95 4/18/95 4/18/95 Customer. ID 3 3 7 7 7 8 8 8 3 3 Last. Name Washington Lasater Jones Washington Phone 502 -777 -7575 615 -888 -4474 615 -452 -1162 502 -777 -7575 Address 95 Easy Street 67 S. Ray Drive 867 Lakeside Drive 95 Easy Street Causes duplication Video. ID 1 6 8 2 6 9 15 4 3 8 13 17 Copy# 2 3 1 1 1 1 1 Title 2001: A Space Odyssey Clockwork Orange Hopscotch Apocalypse Now Clockwork Orange Luggage Of The Gods Fabulous Baker Boys Boy And His Dog Blues Brothers Hopscotch Surf Nazis Must Die Witches of Eastwick Rent $1. 50 $2. 00 $1. 50 $2. 00 $2. 50 $2. 00 $1. 50 $2. 00 14
D A T A B A S E Problems with Repeating Sections ² Store repeating data ª Allocate space ª How much? © Can’t be short © Wasted space ² e. g. , How many videos will be rented at one time? ² A better definition eliminates this problem. Name Phone Address City State Zip. Code Customer Rentals Video. ID Copy# Title 1. 6 1 Clockwork Orange 2. 8 2 Hopscotch 3. {Unused Space} 4. 5. Rent 1. 50 Not in First Normal Form 15
D A T A B A S E First Normal Form Rental. Form(Trans. ID, Rent. Date, Customer. ID, Phone, Name, Address, City, State, Zip. Code, (Video. ID, Copy#, Title, Rent ) ) Rental. Form 2(Trans. ID, Rent. Date, Customer. ID, Phone, Name, Address, City, State, Zip. Code) Rental. Line(Trans. ID, Video. ID, Copy#, Title, Rent ) ² Remove repeating sections ª Split into two tables ª Bring key from main and repeating section ² Rental. Line(Trans. ID, Video. ID, Copy#, . . . ) ª Each transaction can have many videos (key Video. ID) ª Each video can be rented on many transactions (key Trans. ID) ª For each Trans. ID and Video. ID, only one Copy# (no key on Copy#) 16
D A T A B A S E Nested Repeating Sections Table (Key 1, . . . (Key 2, . . . (Key 3, . . . ) ) ) Table 1(Key 1, . . . ) Table. A (Key 1, Key 2. . . (Key 3, . . . ) ) Table 2 (Key 1, Key 2. . . ) Table 3 (Key 1, Key 2, Key 3, . . . ) ² Nested: Table (Key 1, aaa. . . (Key 2, bbb. . . (Key 3, ccc. . . ) ) ) ² First Normal Form (1 NF) ª Table 1(Key 1, aaa. . . ) ª Table 2(Key 1, Key 2, bbb. . ) ª Table 3(Key 1, Key 2, Key 3, ccc. . . ) 17
D A T A B A S E First Normal Form Problems (Data) Trans. ID 1 2 3 4 Rent. Date 4/18/95 4/30/95 4/18/95 Cust. ID 3 7 8 3 Phone 502 -777 -7575 615 -888 -4474 615 -452 -1162 502 -777 -7575 Last. Name Washington Lasater Jones Washington First. Name Elroy Les Charlie Elroy Address 95 Easy Street 67 S. Ray Drive 867 Lakeside Drive 95 Easy Street City Smith's Grove Portland Castalian Springs Smith's Grove State KY TN TN KY Zip. Code 42171 37148 37031 42171 ² 1 NF splits repeating groups ² Still have problems ª Replication ª Hidden dependency: ª If a video has not been rented yet, then what is its title? Trans. ID 1 1 2 2 2 3 3 3 4 4 Video. ID 1 6 8 2 6 9 15 4 3 8 13 17 Copy# 2 3 1 1 1 1 1 Title 2001: A Space Odyssey Clockwork Orange Hopscotch Apocalypse Now Clockwork Orange Luggage Of The Gods Fabulous Baker Boys Boy And His Dog Blues Brothers Hopscotch Surf Nazis Must Die Witches of Eastwick Rent $1. 50 $2. 00 $1. 50 $2. 00 $2. 50 $2. 00 $1. 50 $2. 00 18
D A T A B A S E Second Normal Form Definition Depends on both Trans. ID and Video. ID Rental. Line(Trans. ID, Video. ID, Copy#, Title, Rent) Depend only on Video. ID ² Each non-key column must depend on the entire key. ª Only applies to concatenated keys ª Some columns only depend on part of the key ª Split those into a new table. ² Dependence (definition) ª If given a value for the key you always know the value of the property in question, then that property is said to depend on the key. ª If you change part of a key and the questionable property does not change, then the table is not in 2 NF. 19
D A T A B A S E Second Normal Form Example Rental. Line(Trans. ID, Video. ID, Copy#, Title, Rent) Videos. Rented(Trans. ID, Video. ID, Copy#) Videos(Video. ID, Title, Rent) ² Title depends only on Video. ID ª Each Video. ID can have only one title ² Rent depends on Video. ID ª This statement is actually a business rule. ª It might be different at different stores. ª Some stores might charge a different for each video depending on the day (or time). ² Each non-key column depends on the whole key. 20
D A T A B A S E Second Normal Form Example (Data) Videos. Rented(Trans. ID, Video. ID, Copy#) Trans. ID 1 1 2 2 2 3 3 3 4 4 Video. ID 1 6 2 6 8 4 9 15 3 8 13 17 Copy# 2 3 1 1 1 1 1 Videos(Video. ID, Title, Rent) Video. ID 1 2 3 4 5 6 7 8 Title 2001: A Space Odyssey Apocalypse Now Blues Brothers Boy And His Dog Brother From Another Planet Clockwork Orange Gods Must Be Crazy Hopscotch Rent $1. 50 $2. 00 $2. 50 $2. 00 $1. 50 (Unchanged) Rental. Form 2(Trans. ID, Rent. Date, Customer. ID, Phone, Name, Address, City, State, Zip. Code) 21
D A T A B A S E Second Normal Form Problems (Data) Rental. Form 2(Trans. ID, Rent. Date, Customer. ID, Phone, Name, Address, City, State, Zip. Code) Trans. ID 1 2 3 4 Rent. Date 4/18/95 4/30/95 4/18/95 Cust. ID 3 7 8 3 Phone 502 -777 -7575 615 -888 -4474 615 -452 -1162 502 -777 -7575 Last. Name Washington Lasater Jones Washington First. Name Elroy Les Charlie Elroy Address 95 Easy Street 67 S. Ray Drive 867 Lakeside Drive 95 Easy Street City Smith's Grove Portland Castalian Springs Smith's Grove State KY TN TN KY Zip. Code 42171 37148 37031 42171 ² Even in 2 NF, problems remain ª Replication ª Hidden dependency ª If a customer has not rented a video yet, where do we store their personal data? ² Solution: split table. 22
D A T A B A S E Third Normal Form Definition Depend on Trans. ID Rental. Form 2(Trans. ID, Rent. Date, Customer. ID, Phone, Name, Address, City, State, Zip. Code) Depend only on Customer. ID ² Each non-key column must depend on nothing but the key. ª Some columns depend on columns that are not part of the key. ª Split those into a new table. ª Example: Customers name does not change for every transaction. ² Dependence (definition) ª If given a value for the key you always know the value of the property in question, then that property is said to depend on the key. ª If you change the key and the questionable property does not change, then the table is not in 3 NF. 23
D A T A B A S E Third Normal Form Example Rental. Form 2(Trans. ID, Rent. Date, Customer. ID, Phone, Name, Address, City, State, Zip. Code) Rentals(Trans. ID, Rent. Date, Customer. ID ) Customers(Customer. ID, Phone, Name, Address, City, State, Zip. Code ) ² Customer attributes depend only on Customer ID ª Split them into new table (Customer) ª Remember to leave Customer. ID in Rentals table. ª We need to be able to reconnect tables. ² 3 NF is sometimes easier to see if you identify primary objects at the start--then you would recognize that Customer was a separate object. 24
D A T A B A S E Third Normal Form Example Data Rentals(Trans. ID, Rent. Date, Customer. ID ) Trans. ID 1 2 3 4 Rent. Date 4/18/95 4/30/95 4/18/95 Customer. ID 3 7 8 3 Customers(Customer. ID, Phone, Name, Address, City, State, Zip. Code ) Customer. ID 1 2 3 4 5 6 7 8 9 10 Phone 502 -666 -7777 502 -888 -6464 502 -777 -7575 502 -333 -9494 502 -4746 615 -373 -4746 615 -888 -4474 615 -452 -1162 502 -222 -4351 502 -444 -2512 Last. Name First. Name Johnson Martha Smith Jack Washington Adams Samuel Rabitz Victor Steinmetz Susan Lasater Les Jones Charlie Chavez Juan Rojo Maria Address 125 Main Street 873 Elm Street Elroy 746 Brown Drive 645 White Avenue 15 Speedway Drive 67 S. Ray Drive 867 Lakeside Drive 673 Industry Blvd. 88 Main Street City State Zip. Code Alvaton KY 42122 Bowling Green KY 42101 95 Easy Street Smith's Grove KY Alvaton KY 42122 Bowling Green KY 42102 Portland TN 37148 Castalian Springs TN 3703 Caneyville KY 42721 Cave City KY 42127 (Unchanged) Videos. Rented(Trans. ID, Video. ID, Copy#) Videos(Video. ID, Title, Rent) 25
D A T A B A S E Third Normal Form Tables (3 NF) Rentals(Trans. ID, Rent. Date, Customer. ID ) Customers(Customer. ID, Phone, Name, Address, City, State, Zip. Code ) Videos. Rented(Trans. ID, Video. ID, Copy#) Videos(Video. ID, Title, Rent) 26
D A T A B A S E Checking Your Work (Quality Control) ² Look for one-to-many relationships. ª ª Many side should be keyed (underlined). e. g. , Videos. Rented(Trans. ID, Video. ID, . . . ). Check each column and ask if it should be 1 : 1 or 1: M. If add a key, renormalize. ² Verify no repeating sections (1 NF) ² Check 3 NF ª Check each column and ask: ª Does it depend on the whole key and nothing but the key? ² Verify that the tables can be reconnected (joined) to form the original tables (draw lines). ² Each table represents one object. ² Enter sample data--look for replication. 27
D A T A B A S E Boyce-Codd Normal Form (BCNF) ² Hidden dependency ² Example: Employee-Specialty(E#, Specialty, Manager) ª Is in 3 NF now. ² Business rules. ª ª Employee(E#, Manager) Manager(Manager, Specialty) Employee may have many specialties. Each specialty has many managers. Each manager has only one specialty. Employee has only one manager for each specialty. ² Problem is hidden relationship between manager and specialty. Employee(E#, Specialty, Manager) ª Need separate table for manager. ª But then we don’t need to repeat specialty. Manager(Manager, Specialty) ² In real life, probably accept the duplication ble a t p e (specialty listed in both tables). acc 28
D A T A B A S E Fourth Normal Form (Keys) Employee. Tasks(E#, Specialty, Task#) ² Technically, if you keyed every column, any table would be in 3 NF, which does not solve any problems. ² In some cases, there are hidden relationships between key properties. Employee. Specialty(E#, Specialty) ² Example: Specialty(Specialty, Task#) ª Employee. Tasks(E#, Specialty, Task#) ª In 3 NF (BCNF) now. ² Business Rules Employee. Tasks(E#, Specialty, Task#) ª Each employee has many specialties. ª Each specialty has many tasks. ª Tasks always depend on specialty. ² In real life, probably accept duplication, in case the business rules change, e. g. , some employees do not perform all tasks. Specialty(Specialty, Task#) ble a t p e c c a 29
D A T A B A S E No Hidden Dependencies ² The simple normalization rules: ² Each non-key column must depend on the whole key and nothing but the key. ² There must be no hidden dependencies. ² Solution: Split the table. ² Make sure you can rejoin the two pieces to recreate the original data relationships. ² For some hidden dependencies within keys, double-check the business assumption to be sure that it is realistic. Sometimes you are better off with a more flexible assumption. 30
D A T A B A S E Data Rules and Integrity Order ² Simple business rules ª Limits on data ranges © Price > 0 © Salary < 100, 000 © Date. Hired > 1/12/1995 ª Choosing from a set O# 1173 1174 1185 1190 1192 Odate 1 -4 -97 1 -5 -97 1 -8 -97 1 -9 -97 © Gender = M, F, Unknown © Jurisdiction=City, County, State, Federal … No data for this customer yet! ² Referential Integrity ª Foreign key values in one table must exist in the master table. ª Order(O#, Odate, C#, …) ª C# must exist in the customer table. C# 321 938 337 321 776 Customer C# 321 337 938 Name Jones Sanchez Carson Phone 998377388738 - … 31
D A T A B A S E SQL Foreign Key (Oracle) CREATE TABLE Order ( O# NUMBER(5) NOT NULL, Odate DATE, C# NUMBER(5) CONSTRAINT fk_Customer FOREIGN KEY (C#) REFERENCES Customer (C#) ON DELETE CASCADE ) 32
D A T A B A S E Domain-Key Normal Form (DKNF) ² DKNF is ultimate goal: table will always be in 4 NF, etc. ² Drawbacks ª No mechanical method to get to DKNF ª No guarantee a table can be converted to DKNF ² Rules ª Table => one topic ª All business rules explicitly written as domain constraints and key relationships. ª No hidden relationships. Employee(EID, Name, speciality) Business rule: An employee can have many specialties. So example is not in DKNF, since EID is not unique. 33
D A T A B A S E DKNF Examples Employee(EID, Name, Speciality) Business rule: An employee can have many specialties. Example is not in DKNF: EID is not unique. Employee(EID, Name, Speciality) Business rule: An employee has one name. Example is not DKNF: hidden relationship between EID and name. Employee(EID, Name) Employee. Specialty(EID, Speciality) 34
D A T A B A S E DKNF Examples Student(SID, Name, Major, Advisor) Advisor(FID, Name, Office, Discipline) Business rules: A student can have many advisors, but only one for each major. Faculty can only be advisors for their discipline. Not in DKNF: Primary key and hidden rule. Student(SID, Name) Advisors(SID, Major, FID) Faculty(FID, Name, Office, Discipline) DKNF: Foreign key (Major <--> Discipline) makes advisor rule explicit. 35
D A T A B A S E Effect of Business Rules Key business rules: A player can play on only one team. There is one referee per match. 36
D A T A B A S E Business Rules 1 There is one referee per match. A player can play on only one team. Match(Match. ID, Date. Played, Location, Ref. ID) Score(Match. ID, Team. ID, Score) Referee(Ref. ID, Phone, Address) Team(Team. ID, Name, Sponsor) Player(Player. ID, Name, Phone, Do. B, Team. ID) Player. Stats(Match. ID, Player. ID, Points, Penalties) Ref. ID and Team. ID are not keys in the Match and Team tables, because of the one-to-one rules. 37
D A T A B A S E Business Rules 2 There can be several referees per match. A player can play on only several teams (substitute), but only on one team per match. Match(Match. ID, Date. Played, Location, Ref. ID) Score(Match. ID, Team. ID, Score) Referee(Ref. ID, Phone, Address) Team(Team. ID, Name, Sponsor) Player(Player. ID, Name, Phone, Do. B, Team. ID) Player. Stats(Match. ID, Player. ID, Points, Penalties) To handle the many-to-many relationship, we need to make Ref. ID and Team. ID keys. But if you leave them in the same tables, the tables are not in 3 NF. Date. Played does not depend on Ref. ID. Player Name does not depend on Team. ID. 38
D A T A B A S E Business Rules 2: Normalized There can be several referees per match. A player can play on only several teams (substitute), but only on one team per match. Match(Match. ID, Date. Played, Location) Referee. Match(Match. ID, Ref. ID) Score(Match. ID, Team. ID, Score) Referee(Ref. ID, Phone, Address) Team(Team. ID, Name, Sponsor) Player(Player. ID, Name, Phone, Do. B) Player. Stats(Match. ID, Player. ID, Team. ID, Points, Penalties) 39
D A T A B A S E Converting a Class Diagram to Normalized Tables Manager Supplier 1 * Purchase * Order 1 * 1 Employee * * Item Raw Materials Assembled Components Office Supplies subtypes 40
D A T A B A S E One-to-Many Relationships 1 Supplier * Purchase * Order 1 Employee Supplier(SID, Name, Address, City, State, Zip, Phone) Employee(EID, Name, Salary, Address, …) Purchase. Order(POID, Date, SID, EID) The many side becomes a key (underlined). Each PO has one supplier and employee. (Do not key SID or EID) Each supplier can receive many POs. (Key PO) Each employee can place many POs. (Key PO) 41
D A T A B A S E One-to-Many Sample Data Supplier Purchase Order Employee 42
D A T A B A S E Many-to-Many Relationships Purchase. Order(POID, Date, SID, EID) Order 1 * * Item 1 * * POItem(POID, Item. ID, Quantity, Price. Paid) * Purchase Order POItem 1 Item(Item. ID, Description, List. Price) * 1 Item Each POID can have many Items (key/underline Item. ID). Each Item. ID can be on many POIDs (key POID). Need the new intermediate table (POItem) because: You cannot put Item. ID into Purchase. Order because Date, SID, and EID do not depend on the Item. ID. You cannot put POID into Item because Description and List. Price do not depend on POID. 43
D A T A B A S E Many-to-Many Sample Data Purchase Order POItem 44
D A T A B A S E N-ary Associations Employee Name. . . * Component Comp. ID Type Name * Assembly * Product. ID Type Name Assembly Employee. ID Comp. ID Product. ID 45
D A T A B A S E Composition Bicycle Size Model Type … Wheels Crank Bicycle Serial. Number Model. Type Components Wheel. ID Component. ID Crank. ID Category Stem. ID Description … Weight Cost Stem 46
D A T A B A S E Generalization or Subtypes Item Raw Materials Assembled Components Office Supplies Item(Item. ID, Description, List. Price) Raw. Materials(Item. ID, Weight, Strength. Rating) Assembled. Components(Item. ID, Width, Height, Depth) Office. Supplies(Item. ID, Bulk. Quantity, Discount) Add new tables for each subtype. Use the same key as the generic type (Item. ID)--one-to-one relationship. Add the attributes specific to each subtype. 47
D A T A B A S E Item Subtypes Sample Data Raw. Materials Assembled. Components Office. Supplies 48
D A T A B A S E Recursive Relationships Manager * 1 Employee(EID, Name, Salary, Address, Manager) Employee Add a manager column that contains Employee IDs. An employee can have only one manager. (Manager is not a key. ) A manager can supervise many employees. (EID is a key. ) 49
D A T A B A S E Normalization Examples Not done!!! ² Possible topics ª ª ª ª Auto repair Auto sales Department store Hair stylist HRM department Law firm Manufacturing National Park Service Personal stock portfolio Pet shop Restaurant Social club Sports team 50
D A T A B A S E Multiple Views & View Integration ² Collect multiple views ª Documents ª Reports ª Input forms ² Create normalized tables from each view ² Combine the views into one complete model. ² Keep meta-data in a data dictionary ª ª ² Example ª Federal Emergency Management Agency (FEMA). Disaster planning and relief. ª Make business assumptions as necessary, but try to keep them simple. Type of data Size Volume Usage 51
D A T A B A S E The Pet. Store: Sales Form Sales(Sale. ID, Date, Customer. ID, Name, Address, City, State, Zip, Employee. ID, Name, (Animal. ID, Name, Category, Breed, Date. Of. Birth, Gender, Registration, Color, List. Price, Sale. Price), (Item. ID, Description, Category, List. Price, Sale. Price, Quantity)) 52
D A T A B A S E The Pet. Store: Purchase Animals Animal. Order(Order. ID, Order. Date, Receive. Date, Supplier. ID, Name, Contact, Phone, Address, City, State, Zip, Employee. ID, Name, Phone, Date. Hired, (Animal. ID, Name, Category, Breed, Gender, Registration, Cost), Shipping. Cost) 53
D A T A B A S E The Pet. Store: Purchase Merchandise. Order(PONumber, Order. Date, Receive. Date, Supplier. ID, Name, Contact, Phone, Address, City, State, Zip, Employee. ID, Name, Home. Phone, (Item. ID, Description, Category, Price, Quantity. On. Hand), Shipping. Cost) 54
D A T A B A S E Pet. Store Normalization Sale(Sale. ID, Date, Customer. ID, Employee. ID) Sale. Animal(Sale. ID, Animal. ID, Sale. Price) Sale. Merchandise(Sale. ID, Item. ID, Sale. Price, Quantity) Customer(Customer. ID, Name, Address, City, State, Zip) Employee(Employee. ID, Name) Animal(Animal. ID, Name, Category, Breed, Date. Of. Birth, Gender, Registration, Color, List. Price) Merchandise(Item. ID, Description, Category, List. Price) Animal. Order(Order. ID, Order. Date, Receive. Date, Supplier. ID, Emp. ID, Ship. Cost) Animal. Order. Item(Order. ID, Animal. ID, Cost) Supplier(Supplier. ID, Name, Contact, Phone, Address, City, State, Zip) Employee(Employee. ID, Name, Phone, Date. Hired) Animal(Animal. ID, Name, Category, Breed, Gender, Registration, Cost) Merchandise. Order(PONumber, Order. Date, Receive. Date, SID, Emp. ID, Ship. Cost) Merchandise. Order. Item(PONumber, Item. ID, Quantity, Cost) Supplier(Supplier. ID, Name, Contact, Phone, Address, City, State, Zip) Employee(Employee. ID, Name, Phone) Merchandise(Item. ID, Description, Category, Quantity. On. Hand) 55
D A T A B A S E Pet. Store View Integration Sale(Sale. ID, Date, Customer. ID, Employee. ID) Sale. Animal(Sale. ID, Animal. ID, Sale. Price) Sale. Merchandise(Sale. ID, Item. ID, Sale. Price, Quantity) Customer(Customer. ID, Name, Address, City, State, Zip) Employee(Employee. ID, Name, Phone, Date. Hired) Animal(Animal. ID, Name, Category, Breed, Date. Of. Birth, Gender, Registration, Color, List. Price, Cost) Merchandise(Item. ID, Description, Category, List. Price, Quantity. On. Hand) Animal. Order(Order. ID, Order. Date, Receive. Date, Supplier. ID, Emp. ID, Ship. Cost) Animal. Order. Item(Order. ID, Animal. ID, Cost) Supplier(Supplier. ID, Name, Contact, Phone, Address, City, State, Zip) Employee(Employee. ID, Name, Phone, Date. Hired) Animal(Animal. ID, Name, Category, Breed, Gender, Registration, Cost) Merchandise. Order(PONumber, Order. Date, Receive. Date, SID, Emp. ID, Ship. Cost) Merchandise. Order. Item(PONumber, Item. ID, Quantity, Cost) Supplier(Supplier. ID, Name, Contact, Phone, Address, City, State, Zip) Employee(Employee. ID, Name, Phone) Merchandise(Item. ID, Description, Category, Quantity. On. Hand) 56
D A T A B A S E Pet. Store ERD 57
D A T A B A S E Rolling Thunder Integration Example Bicycle Assembly form. The main Employee. ID control is not stored directly, but the value is entered in the assembly column when the employee clicks the column. 58
D A T A B A S E Initial Tables for Bicycle Assembly Bicycle. Assembly( Serial. Number, Model, Construction, Frame. Size, Top. Tube, Chain. Stay, Head. Tube, Seat. Tube, Paint. ID, Paint. Color, Color. Style, Color. List, Custom. Name, Letter. Style, Emp. Frame, Emp. Paint, Build. Date, Ship. Date, (Tube, Tube. Type, Tube. Material, Tube. Description), (Comp. Category, Component. ID, Subst. ID, Prod. Number, Emp. Install, Date. Install, Quantity, QOH) ) Bicycle(Serial. Number, Model, Construction, Frame. Size, Top. Tube, Chain. Stay, Head. Tube, Seat. Tube, Paint. ID, Color. Style, Custom. Name, Letter. Style, Emp. Frame, Emp. Paint, Build. Date, Ship. Date) Paint(Paint. ID, Color. List) Bike. Tubes(Serial. Number, Tube. ID, Quantity) Tube. Material(Tube. ID, Type, Material, Description) Bike. Parts(Serial. Number, Component. ID, Subst. ID, Quantity, Date. Installed, Emp. Installed) Component(Component. ID, Prod. Number, Category, QOH) 59
D A T A B A S E Rolling Thunder: Purchase Order 60
D A T A B A S E RT Purchase Order: Initial Tables Purchase. Order(Purchase. ID, PODate, Employee. ID, First. Name, Last. Name, Manufacturer. ID, Mfg. Name, Address, Phone, Address, City. ID, Current. Balance, Ship. Receive. Date, (Component. ID, Category, Manufacturer. ID, Product. Number, Description, Price. Paid, Quantity, Receive. Quantity, Extended. Value, QOH, Extended. Received), Shipping. Cost, Discount) Purchase. Order(Purchase. ID, PODate, Employee. ID, Manufacturer. ID, Ship. Receive. Date, Shipping. Cost, Discount) Employee(Employee. ID, First. Name, Last. Name) Manufacturer(Manufacturer. ID, Name, Address, Phone, Address, City. ID, Current. Balance) City(City. ID, Name, Zip. Code) Purchase. Item(Purchase. ID, Component. ID, Quantity, Price. Paid, Received. Quantity) Component(Component. ID, Category, Manufacturer. ID, Product. Number, Description, QOH) 61
D A T A B A S E Rolling Thunder: Transactions 62
D A T A B A S E RT Transactions: Initial Tables Manufacturer. Transactions(Manufacturer. ID, Name, Phone, Contact, Balance. Due, (Trans. Date, Employee, Amount, Description) ) Manufacturer(Manufacturer. ID, Name, Phone, Contact, Balance. Due) Manufacturer. Transaction(Manufacturer. ID, Transaction. Date, Employee. ID, Amount, Description) 63
D A T A B A S E Rolling Thunder: Components 64
D A T A B A S E RT Components: Initial Tables Component. Form(Component. ID, Product, Bike. Type, Category, Length, Height, Width, Weight, List. Price, Description, QOH, Manufacturer. ID, Name, Phone, Contact, Address, Zip. Code, City. ID, City, State, Area. Code) Component(Component. ID, Product. Number, Bike. Type, Category, Length, Height, Width, Weight, List. Price, Description, QOH, Manufacturer. ID) Manufacturer(Manufacturer. ID, Name, Phone, Contact, Address, Zip. Code, City. ID) City(City. ID, City, State, Zip. Code, Area. Code) 65
D A T A B A S E RT: Integrating Tables Duplicate Manufacturer tables: PO Mfr(Manufacturer. ID, Name, Address, Phone, Address, City. ID, Current. Balance) Mfg Mfr(Manufacturer. ID, Name, Phone, Contact, Balance. Due) Comp Mfr(Manufacturer. ID, Name, Phone, Contact, Address, Zip. Code, City. ID) Note that each form can lead to duplicate tables. Look for tables with the same keys, but do not expect them to be named exactly alike. Find all of the data and combine it into one table. Manufacturer(Manufacturer. ID, Name, Contact, Address, Phone, Address, City. ID, |Zip. Code, Current. Balance) 66
D A T A B A S E RT Example: Integrated Tables Bicycle(Serial. Number, Model, Construction, Frame. Size, Top. Tube, Chain. Stay, Head. Tube, Seat. Tube, Paint. ID, Color. Style, Custom. Name, Letter. Style, Emp. Frame, Emp. Paint, Build. Date, Ship. Date) Paint(Paint. ID, Color. List) Bike. Tubes(Serial. Number, Tube. ID, Quantity) Tube. Material(Tube. ID, Type, Material, Description) Bike. Parts(Serial. Number, Component. ID, Subst. ID, Quantity, Date. Installed, Emp. Installed) Component(Component. ID, Product. Number, Bike. Type, Category, Length, Height, Width, Weight, List. Price, Description, QOH, Manufacturer. ID) Purchase. Order(Purchase. ID, PODate, Employee. ID, Manufacturer. ID, Ship. Receive. Date, Shipping. Cost, Discount) Purchase. Item(Purchase. ID, Component. ID, Quantity, Price. Paid, Received. Quantity) Employee(Employee. ID, First. Name, Last. Name) Manufacturer(Manufacturer. ID, Name, Contact, Address, Phone, Address, City. ID, Zip. Code, Current. Balance) Manufacturer. Transaction(Manufacturer. ID, Transaction. Date, Employee. ID, Amount, Description, Reference) City(City. ID, City, State, Zip. Code, Area. Code) 67
D A T A B A S E Rolling Thunder Tables 68
D A T A B A S E View Integration (FEMA Example 1) Team Roster Team# Date Formed Leader Home Base Name Response time (days) Address, C, S, Z ID Name Fax Phone Home phone Team Members/Crew Home phone Specialty Do. B SSN Salary Total Salary ² This first form is kept for each team that can be called on to help in emergencies. 69
D A T A B A S E View Integration (FEMA Example 2) Disaster Name Local Agency Political Contact HQ Location Commander Date Reported Problem Description Assigned Problem# Reported By: Verified By: Specialty Sub Prob# Category On-Site Problem Report Severity Specialty Rating Sub. Problem Details Description Action Est. Cost Total Est. Cost ² Major problems are reported to HQ to be prioritized and scheduled for correction. 70
D A T A B A S E View Integration (FEMA Example 3) Location Damage Analysis Location. ID, Address Team Leader Title Latitude, Longitude Cellular Phone Date Evaluated Repair Priority Damage Description Room Damage Descrip. Damage% Item Value $Loss Item Loss Total Estimated Damage Total ² On-site teams examine buildings and file a report on damage at that location. 71
D A T A B A S E View Integration (FEMA Example 3 a) ² Location Analysis(Location. ID, Map. Latitude, Map. Longitude, Date, Address, Damage, Priority. Repair, Leader. Phone, Leader. Title, (Room, Description, Percent. Damage, (Item, Value, Loss))) 72
D A T A B A S E View Integration (FEMA Example 4) Disaster Name Task Completion Report Disaster Rating HQ Phone Problem# Supervisor Date Sub. Problem Team# Team Specialty Completion. Status Comment Expenses Total Expenses ² Teams file task completion reports. If a task is not completed, the percentage accomplished is reported as the completion status. 73
D A T A B A S E View Integration (FEMA Example 4 a) ² Tasks. Completed(Date, Disaster. Name, Disaster. Rating, HQPhone, (Problem#, Supervisor, (Sub. Problem, Team#, Completion. Status, Comments, Expenses)) 74
D A T A B A S E DBMS Table Definition ² Enter Tables ª Columns ª Keys ª Data Types © Text © Memo © Number Ÿ Byte Ÿ Integer, Long Ÿ Single, Double © Date/Time © Currency © Auto. Number (Long) © Yes/No © OLE Object ª Descriptions ² Column Properties ª ª ª ª Format Input Mask Caption Default Validation Rule Validation Text Required & Zero Length Indexed ² Relationships ª ª ª One-to-One One-to-Many Referential Integrity Cascade Update/Delete Define before entering data 75
D A T A B A S E Data Volume ² Estimate the total size of the database. ª Current. ª Future growth. ª Guide for hardware and software purchases. ² For each table. ª Use data types to estimate the number of bytes used for each row. ª Multiply by the estimated number of rows. ² Add the value for each table to get the total size. ² For concatenated keys (and similar tables). ª Order. Items(O#, Item#, Qty) ª Hard to “know” the total number of items ordered. © Start with the total number of orders. © Multiply by the average number of items on a typical order. ² Need to know time frame or how long to keep data. ª Do we store all customer data forever? ª Do we keep all orders in the active database, or do we migrate older ones? 76
D A T A B A S E Data Volume Example Customer(C#, Name, Address, City, State, Zip) Row: 4 + 15 + 20 + 2 + 10 = 76 Order(O#, C#, Odate) Row: 4 + 8 = 16 Order. Item(O#, P#, Quantity, Sale. Price) Row: 4+4 + 8 = 20 ² Business rules ² Customer 76 * 1000 76, 000 ª Three year retention. ² Order 16 * 30, 000 480, 000 ª 1000 customers. ² Order. Item 20 * 150, 000 ª Average 10 orders per customer per 3, 000 year. ² Total 3, 556, 000 ª Average 5 items per order. 77
- Slides: 77