BTM 382 Database Management Chapter 5 Advanced Data

BTM 382 Database Management Chapter 5: Advanced Data Modeling Chitu Okoli Associate Professor in Business Technology Management John Molson School of Business, Concordia University, Montréal

Structure of BTM 382 Database Management § § § Week 1: Introduction and overview § ch 1: Introduction Weeks 2 -6: Database design § ch 3: Relational model § ch 4: ER modeling § ch 6: Normalization § ERD modeling exercise § ch 5: Advanced data modeling Week 7: Midterm exam Weeks 8 -10: Database programming § ch 7: Intro to SQL § ch 8: Advanced SQL § SQL exercises Weeks 11 -13: Database management § ch 2, 14: Data models § ch 13: Business intelligence and data warehousing § ch 9, 15, 16: Selected managerial topics

Review of Chapter 5: Advanced Database Modeling § How can the Extended Entity Relationship (EER) model help model entity clusters and IS-A relationships? § How do you select a good (not just legal) primary key for a table? § On which side of a relationship should you place a foreign key?

The Extended Entity Relationship Model: Entity clusters

Entity cluster § “Virtual” or “abstract” entity type used to represent multiple entities and relationships in ERD § Two primary purposes: § When you have not yet figured out the details of part of the ERD § Full ERD will be developed later § To simplify the ERD or save space § Details of the entity clusters can be displayed on separate pages § Note: When entity clusters are used, foreign keys cannot be precisely specified

The Extended Entity Relationship Model: Handling IS-A relationships

What is an IS-A relationship? § Sometimes, one kind of entity is a type of another entity modeled in the same database § Examples: § A person is a specific type of person: an Employee is an Accountant, a Doctor is a Specialist, a Student is an Undergraduate § A product is a specific type of product: a Bicycle is a type of Vehicle, a Violin is a type of Musical. Instrument, Manufacturing. Product and Maintenance. Supply are types of Product

Two ways to handle IS-A relationships § Example: Some employees are pilots, some are mechanics, some are accountants, and some are none of those specialized types § Solution 1: Leave them all in one entity, EMPLOYEE § Disadvantage: Lots of nulls § Not a crime, but can make SQL programming awkward § Solution 2: Create a supertype entity EMPLOYEE and subtype entities PILOT, MECHANIC and ACCOUNTANT

Specialization hierarchy: Supertype and subtypes § Entity supertype § Generic entity type related to one or more entity subtypes § Contains common characteristics § Entity subtype § Contains unique characteristics of each entity subtype § Specialization hierarchy § Depicts relationships between supertypes and subtypes

Specialization hierarchy: other elements § Inheritance § Subtypes are implemented with 1: 1 relationship with the supertype § Supertype’s PK becomes the subtype’s PK/FK § Subtypes inherit all attributes and relationships from supertype § Subtype discriminator § One or more attributes that determines which subtype or subtypes the supertype instance belongs to registers

Disjoint and Overlapping Constraints: One value only, or multiple values allowed? § Disjoint subtypes § Mutually exclusive: entity belongs to one and only one subtype § Implementation: one attribute only (usually one character) § Overlapping subtypes § Multiple subtype values allowed § Implementation: multiple binary (true/false) attributes

Completeness Constraint: Subtypes optional or required? § Partial completeness (subtype optional) § Some supertype occurrences are not members of any subtype § Implementation: Null values permitted § Total completeness (subtype required) § Every supertype occurrence must be member of at least one subtype § Implementation: No nulls permitted

All the specialization hierarchy constraints

SQL implementation of EERD constraints § Disjoint constraint with partial completeness § EMP_TYPE VARCHAR 2(1) CHECK (EMP_TYPE IS NULL OR EMP_TYPE IN ('P', 'M', 'A')) § Disjoint constraint with total completeness § STU_TYPE VARCHAR 2(1) NOT NULL CHECK (STU_TYPE IN ('U', 'G')) § Overlapping constraints § Overlapping constraint with partial completeness § EMP_IS_ADM VARCHAR 2(1) NOT NULL CHECK (EMP_IS_ADM IN ('Y', 'N')), EMP_IS_PROF VARCHAR 2(1) NOT NULL CHECK (EMP_IS_PROF IN ('Y', 'N')) § Overlapping constraint with total completeness § P_IS_EMP VARCHAR 2(1) NOT NULL CHECK (P_IS_EMP IN ('Y', 'N')), P_IS_STU VARCHAR 2(1) NOT NULL CHECK (P_IS_STU IN ('Y', 'N')), CONSTRAINT ENFORCE_TOTAL_OVERLAPPING_PERSON CHECK (P_IS_EMP = 'Y' OR P_IS_STU = 'Y')

Entity Integrity: Selecting Primary Keys

Natural keys vs. Surrogate keys § Natural key is a real-world identifier used to uniquely identify real-world objects § Familiar to end users and forms part of their day-to-day business vocabulary § E. g. e-mail address, SIN, username, etc. § Surrogate key is an attribute that the designer invents only for the purpose of serving as a primary key § E. g. Customer. ID with values 1, 2, 3, 4, 5, 6, …, ∞

Ideal characteristics of primary keys § When you consider all these points, then surrogate keys are generally superior to natural keys

When to Use Composite Primary Keys § Composite primary keys are useful in two cases: § As identifiers of composite entities (for implementing M: N relationships) § As identifiers of weak entities § Because they are indexed, they automatically provide benefit of ensuring that there cannot be duplicate values § Although they use two attributes instead of one, in SQL programming, you usually need both attributes anyway

Design cases • • Placement of foreign keys Historical data Fan traps Avoiding redundant relationships

Case #1: On which side of a relationship should you place a foreign key? § M: M relationship: Always turn it into two 1: M relationships § 1: M relationship: Always place the foreign key on the many side § 1: 1 relationship: § (1, 1): (0, 1): Foreign key goes on (0, 1) side § (1, 0): (0, 1): Foreign key goes on whichever side will cause fewer nulls § (1, 1): This relationship almost never exists in the real world. You probably made a mistake. Verify your cardinalities, or merge the two tables. § Exception: this relationship is sometimes used for database performance optimization

Summary of Chapter 5: Advanced Database Modeling § The Extended Entity Relationship (EER) model gives useful notation for modeling IS-A relationships § Although multiple candidate keys might legally be elected as the primary key, some can do the job better than others § Surrogate keys are often the best choice § The foreign key always goes on the many side of a 1: M relationship § For 1: 1 relationships, it’s a bit complicated


Sources § Most of the slides are adapted from Database Systems: Design, Implementation and Management by Carlos Coronel and Steven Morris. 11 th edition (2015) published by Cengage Learning. ISBN 13: 978 -1 -285 -19614 -5 § Other sources are noted on the slides themselves
- Slides: 23