Relational Model Normalization Relational terminology Anomalies and the




























- Slides: 28
Relational Model & Normalization Relational terminology Anomalies and the need for normalization Normal forms Relation synthesis De-normalization
Why the Relational Model? General model DBMS-independent design Widely used in DBMS products But we must deal with anomalies
Relational Terminology
Relation: schema or structure versus instance EMPLOYEE(Name, Age, Sex, Employee. Number)
When is a table a relation? Single value cells - no repeating groups or arrays Each attribute has unique name All values in a column are of same kind Order of columns is not significant No identical rows Order of rows is not significant
Approaches to Relation Design Analysis – start with table structure and normalize (eliminate anomalies) – Entity Relationship Model (3 rd Normal) Synthesis – construct relations from attributes
Basic Concepts Functional Dependency – relationship between or among attributes Key – group of one or more attributes that uniquely identifies a row
Functional Dependency Y is functionally dependent on X if value of X determines value of Y – if we know the value of X, we can obtain (look up, compute, …) the value for Y – determined by user model and business rules
Functional Dependency Example Student. ID determinant Student. Name
Functional Dependency Notation X à(Y, Z) (X, Y) àZ
Keys Single or group attributes Depend on user model Example: why {SID, Activity}? Is there another option?
Functional Dependencies, Keys and Uniqueness Key is always unique Key functionally determines entire row Determinant need not be unique, hence is not necessarily a key Example: Activity à Fee
Reality check Project. ID àEmployee. Name? Project. ID àEmployee. Salary? (Project. ID, Employee. Name) à Employee. Salary? Employee. Name à Employee. Salary? Employee. Salary àProject. ID? Employee. Salary à (Project. ID, Employee. Name)? What is the key?
Normalization Modification Anomalies Referential Integrity Constraint Normal Forms Golden Rule: “A relation should have a single theme; if it has more, break it into more relations. ”
Modification Anomalies What happens when you want to – add a new book? – change the address of a patron? – delete a patron record?
Modification Anomalies Deletion anomaly – deleting one fact about an entity deletes a fact about another entity Insertion anomaly – cannot insert one fact about an entity unless a fact about another entity is also added Update anomaly – changing one fact about an entity requires multiple changes to a table
Referential Integrity Constraint When we split a relation, we must pay attention to the references across the newly formed relations E. g. , a book must exist before it can be checked out: – CHECKOUT [Book. ID] Í BOOK [Book. ID] The DBMS or the applications will have to check/enforce constraints
Classification of relations All relations
Second Normal Form Single attribute key, or all non-key attributes are dependent on the entire key – ACTIVITY(SID, Activity, Fee)
Third Normal Form No transitive dependencies – WORKER(Employee, Dept, Location) – WORKER(Employee, Dept) OFFICE(Dept, Location)
Quick Quiz Determine if the following relations are in 1 NF, 2 NF or 3 NF Rewrite each relation in 3 NF – EMPLOYEE (Emp. ID, Emp. Name, Job. Code) – EMPLOYEE(Emp. ID, Emp. Name, Job. Code, Job. Desc) – EMPLOYEE(Emp. ID, Emp. Name, Project. ID, Hrs. Worked)
Boyce-Codd Normal Form Every determinant is a candidate key – ADVISER(SID, Major, Fname) – STU-ADV(SID, Fname) ADV-SUBJ(Fname, Subject)
Multi-valued Dependency Two or more functionally independent multivalued attributes are dependent on another attribute – EMPLOYEE(Name, Dependent, Project) Data redundancy and modification anomalies 4 NF: BCNF & no multi-valued dependencies – EMPLOYEE(Name, Dependent) – EMPLOYEE(Name, Project)
Domain/Key Normal Form Every constraint on the relation is a logical consequence of the definitions of keys and domains Constraints: rules, functional and multivalued dependencies, anything that can be statically ascertained as true or false Enforcing key and domain restrictions causes all of the constraints to be met
Summary of Normal Forms
De-Normalization Many databases are not normalized or poorly normalized implying bad design We may also want to de-normalize to improve efficiency or ease of use Consider the alternatives: – CUSTOMER(Cust. No, Cust. Name, City, State, Zip) – CUSTOMER(Cust. No, Cust. Name, Zip) CODES(Zip, City, State)
Optimization There may be more than one way to normalize a table – COLLEGE(College. Name, Dean, Asst. Dean) » DEAN(College. Name, Dean) ASSTDEAN(College. Name, Asst. Dean) » COLLEGE (College. Name, Dean, Asst. Dean 1, Asst. Dean 2, Asst. Dean 3) Which is best depends on efficiency considerations
Synthesis