Relational Model Normalization Relational terminology Anomalies and the

  • Slides: 28
Download presentation
Relational Model & Normalization Relational terminology Anomalies and the need for normalization Normal forms

Relational Model & Normalization Relational terminology Anomalies and the need for normalization Normal forms Relation synthesis De-normalization

Why the Relational Model? General model DBMS-independent design Widely used in DBMS products But

Why the Relational Model? General model DBMS-independent design Widely used in DBMS products But we must deal with anomalies

Relational Terminology

Relational Terminology

Relation: schema or structure versus instance EMPLOYEE(Name, Age, Sex, Employee. Number)

Relation: schema or structure versus instance EMPLOYEE(Name, Age, Sex, Employee. Number)

When is a table a relation? Single value cells - no repeating groups or

When is a table a relation? Single value cells - no repeating groups or arrays Each attribute has unique name All values in a column are of same kind Order of columns is not significant No identical rows Order of rows is not significant

Approaches to Relation Design Analysis – start with table structure and normalize (eliminate anomalies)

Approaches to Relation Design Analysis – start with table structure and normalize (eliminate anomalies) – Entity Relationship Model (3 rd Normal) Synthesis – construct relations from attributes

Basic Concepts Functional Dependency – relationship between or among attributes Key – group of

Basic Concepts Functional Dependency – relationship between or among attributes Key – group of one or more attributes that uniquely identifies a row

Functional Dependency Y is functionally dependent on X if value of X determines value

Functional Dependency Y is functionally dependent on X if value of X determines value of Y – if we know the value of X, we can obtain (look up, compute, …) the value for Y – determined by user model and business rules

Functional Dependency Example Student. ID determinant Student. Name

Functional Dependency Example Student. ID determinant Student. Name

Functional Dependency Notation X à(Y, Z) (X, Y) àZ

Functional Dependency Notation X à(Y, Z) (X, Y) àZ

Keys Single or group attributes Depend on user model Example: why {SID, Activity}? Is

Keys Single or group attributes Depend on user model Example: why {SID, Activity}? Is there another option?

Functional Dependencies, Keys and Uniqueness Key is always unique Key functionally determines entire row

Functional Dependencies, Keys and Uniqueness Key is always unique Key functionally determines entire row Determinant need not be unique, hence is not necessarily a key Example: Activity à Fee

Reality check Project. ID àEmployee. Name? Project. ID àEmployee. Salary? (Project. ID, Employee. Name)

Reality check Project. ID àEmployee. Name? Project. ID àEmployee. Salary? (Project. ID, Employee. Name) à Employee. Salary? Employee. Name à Employee. Salary? Employee. Salary àProject. ID? Employee. Salary à (Project. ID, Employee. Name)? What is the key?

Normalization Modification Anomalies Referential Integrity Constraint Normal Forms Golden Rule: “A relation should have

Normalization Modification Anomalies Referential Integrity Constraint Normal Forms Golden Rule: “A relation should have a single theme; if it has more, break it into more relations. ”

Modification Anomalies What happens when you want to – add a new book? –

Modification Anomalies What happens when you want to – add a new book? – change the address of a patron? – delete a patron record?

Modification Anomalies Deletion anomaly – deleting one fact about an entity deletes a fact

Modification Anomalies Deletion anomaly – deleting one fact about an entity deletes a fact about another entity Insertion anomaly – cannot insert one fact about an entity unless a fact about another entity is also added Update anomaly – changing one fact about an entity requires multiple changes to a table

Referential Integrity Constraint When we split a relation, we must pay attention to the

Referential Integrity Constraint When we split a relation, we must pay attention to the references across the newly formed relations E. g. , a book must exist before it can be checked out: – CHECKOUT [Book. ID] Í BOOK [Book. ID] The DBMS or the applications will have to check/enforce constraints

Classification of relations All relations

Classification of relations All relations

Second Normal Form Single attribute key, or all non-key attributes are dependent on the

Second Normal Form Single attribute key, or all non-key attributes are dependent on the entire key – ACTIVITY(SID, Activity, Fee)

Third Normal Form No transitive dependencies – WORKER(Employee, Dept, Location) – WORKER(Employee, Dept) OFFICE(Dept,

Third Normal Form No transitive dependencies – WORKER(Employee, Dept, Location) – WORKER(Employee, Dept) OFFICE(Dept, Location)

Quick Quiz Determine if the following relations are in 1 NF, 2 NF or

Quick Quiz Determine if the following relations are in 1 NF, 2 NF or 3 NF Rewrite each relation in 3 NF – EMPLOYEE (Emp. ID, Emp. Name, Job. Code) – EMPLOYEE(Emp. ID, Emp. Name, Job. Code, Job. Desc) – EMPLOYEE(Emp. ID, Emp. Name, Project. ID, Hrs. Worked)

Boyce-Codd Normal Form Every determinant is a candidate key – ADVISER(SID, Major, Fname) –

Boyce-Codd Normal Form Every determinant is a candidate key – ADVISER(SID, Major, Fname) – STU-ADV(SID, Fname) ADV-SUBJ(Fname, Subject)

Multi-valued Dependency Two or more functionally independent multivalued attributes are dependent on another attribute

Multi-valued Dependency Two or more functionally independent multivalued attributes are dependent on another attribute – EMPLOYEE(Name, Dependent, Project) Data redundancy and modification anomalies 4 NF: BCNF & no multi-valued dependencies – EMPLOYEE(Name, Dependent) – EMPLOYEE(Name, Project)

Domain/Key Normal Form Every constraint on the relation is a logical consequence of the

Domain/Key Normal Form Every constraint on the relation is a logical consequence of the definitions of keys and domains Constraints: rules, functional and multivalued dependencies, anything that can be statically ascertained as true or false Enforcing key and domain restrictions causes all of the constraints to be met

Summary of Normal Forms

Summary of Normal Forms

De-Normalization Many databases are not normalized or poorly normalized implying bad design We may

De-Normalization Many databases are not normalized or poorly normalized implying bad design We may also want to de-normalize to improve efficiency or ease of use Consider the alternatives: – CUSTOMER(Cust. No, Cust. Name, City, State, Zip) – CUSTOMER(Cust. No, Cust. Name, Zip) CODES(Zip, City, State)

Optimization There may be more than one way to normalize a table – COLLEGE(College.

Optimization There may be more than one way to normalize a table – COLLEGE(College. Name, Dean, Asst. Dean) » DEAN(College. Name, Dean) ASSTDEAN(College. Name, Asst. Dean) » COLLEGE (College. Name, Dean, Asst. Dean 1, Asst. Dean 2, Asst. Dean 3) Which is best depends on efficiency considerations

Synthesis

Synthesis