Database Management Systems EntityRelationship Model Database Design Why

Database Management Systems Entity-Relationship Model

Database Design • Why do we need it? – Agree on structure of the database before deciding on a particular implementation. • Consider issues such as: – What entities to model – How entities are related – What constraints exist in the domain – How to achieve good designs

Database Design Formalisms 1. Object Definition Language (ODL): – Closer in spirit to object-oriented models 2. Entity/Relationship model (E/R): – More relational in nature. • Both can be translated (semi-automatically) to relational schemas • ODL to OO-schema: direct transformation (C++ or Smalltalk based system).

Purpose of E/R Model • The E/R model allows us to sketch the design of a database informally. • Designs are pictures called entityrelationship diagrams. • Fairly mechanical ways to convert E/R diagrams to real implementations like relational databases exist. 4

Entity / Relationship Diagrams Entities Product Attributes address Relationships between entities buys

name category name price makes Company Product stockprice buys employs Person address name ssn

University Example • • • A college contains many departments Each department can offer any number of courses Many instructors can work in a department An instructor can work only in one department For each department there is a Head An instructor can be head of only one department Each instructor can take any number of courses A course can be taken by only one instructor A student can enroll for any number of courses Each course can have any number of students

Modeling • A database can be modeled as: – a collection of entities, – relationship among entities. • An entity is an object that exists and is distinguishable from other objects. – Example: person, company, event, plant • Entities have attributes – Example: people have names and addresses • An entity set is a set of entities of the same type that share the same properties. – Example: set of all persons, companies, trees, holidays

Entity Sets instructor and student instructor_ID instructor_name student-ID student_name

ER Case Study Banks Database § Each bank has a unique name. § Each branch has a number, name, address (number, street, city), and set of phones. § Customer includes their name, set of address (P. O. Box, city, zip code, country), set of phones, and social security number. § Accounts have numbers, types (e. g. saving, checking) and balance. Other branches might use the same designation for accounts. So to name an account uniquely, we need to give both the branch number to which this account belongs to and the account number. § Not all bank customers must own accounts and a customer may have at most 5 accounts in the bank. § An account must have only one customer. § A customer may have many accounts in different branches.

Relationship Sets • A relationship is an association among several entities Example: 44553 (Ahmed) advisor 22222 (Hassan) student entityrelationship set instructor entity • A relationship set is a mathematical relation among n 2 entities, each taken from entity sets {(e 1, e 2, … en) | e 1 E 1, e 2 E 2, …, en En} where (e 1, e 2, …, en) is a relationship – Example: (44553, 22222) advisor

Relationship Set advisor

Relationship Sets (Cont. ) • An attribute can also be property of a relationship set. • For instance, the advisor relationship set between entity sets instructor and student may have the attribute date which tracks when the student started being associated with the advisor

What is a Relation ? • A mathematical definition: – if A, B are sets, then a relation R is a subset of A x B 1 a • A={1, 2, 3}, B={a, b, c, d}, R = {(1, a), (1, c), (3, b)} A= b 2 c 3 B= - makes is a subset of Product x Company: Product makes Company d

Relationships • Relationships indicate a meaningful connection between two entity types • Relationships may have attributes, but they cannot have key attributes. • Identifying relationships connect a weak entity type to some other entity type – indicates where the weak entity gets a key to complete its own partial key Works. O n Relationship Dependent Of Identifying Relationship

Multiplicity of E/R Relations • one-one: a b c d 1 2 3 • many-one 1 2 3 • many-many 1 2 3 a b c d

Mapping Cardinalities One to one One to many Note: Some elements in A and B may not be mapped to any elements in the other set

Mapping Cardinalities Many to one Many to many Note: Some elements in A and B may not be mapped to any elements in the other set

One-to-One Relationship • one-to-one relationship between an instructor and a student – an instructor is associated with at most one student via advisor – and a student is associated with at most one instructor via advisor

One-to-Many Relationship • one-to-many relationship between an instructor and a student – an instructor is associated with several (including 0) students via advisor – a student is associated with at most one instructor via advisor,

Many-to-Many Relationship • An instructor is associated with several (possibly 0) students via advisor • A student is associated with several (possibly 0) instructors via advisor

Alternative Notation for Cardinality Limits n Cardinality limits can also express participation constraints

Multi-way Relationships How do we model a purchase relationship between buyers, products and stores? Product Purchase Person Can still model as a mathematical set (how ? ) Store

Relational Roles • It is sometimes convenient to name an entity’s role in a relationship. – particularly useful in recursive relationships – removes ambiguity in direction of relationship supervisor Supervisio n EMPLOYE E supervise e

Roles in Relationships What if we need an entity set twice in one relationship? Product Purchase buyer salesperson Person Store

Data Modeling Case Study • The following is description by a pharmacy owner: • “Ahmed Hassan catches a cold and what he suspects is a flu virus. He makes an appointment with his family doctor who confirm his diagnosis. The doctor prescribes an antibiotic and nasal decongestant tablets. • Ahmed leaves the doctor's office and drives to his local drug store. The pharmacist packages the medication and types the labels for pill bottles. • The label includes information about customer, the doctor who prescribe the drug, the drug (e. g. , Penicillin), when to take it, and how often, the content of the pill (250 mg), the number of refills, expiration date, and the date of purchase. " • Please develop a data model for the entities and 26 relationships within the context of pharmacy.

Attributes • An entity is represented by a set of attributes, that is descriptive properties possessed by all members of an entity set. – Example: instructor = (ID, name, street, city, salary ) course= (course_id, title, credits) • Domain – the set of permitted values for each attribute • Attribute types: – Simple and composite attributes. – Single-valued and multivalued attributes • Example: multivalued attribute: phone_numbers – Derived attributes • Can be computed from other attributes

Sets and Derived Attributes • Multivalued attributes double lined oval – multivalued = set valued – that there may be more than one value for the attribute. • Derived attributes dashed line ovals – the attribute is computed from other data Locations Multivalued Attribute Num. Employe es Derived Attribute

Composite Attributes • Composite attributes tree – composed of other attributes. – used for a set of related attributes, when the set is not a conceptual entity – the composite doesn’t have identity … it doesn’t have a key Address City Stree State t Composite Attribute Zip. Cod e

Composite Attributes


Attributes on Relationships date Product Purchase Person Store

Converting Multi-way Relationships to Binary date Product. Of Product Store. Of Store Buyer. Of Person Purchase

ER Case Study Television Series Database A Television network wishes to create a database to keep track of its TV series. A television series has one or more episode. Television series identified by name and season number, and includes their production company name and Num_of_Episodes ( i. e. total number of episodes in a specific season of a series ). Episode of a specific season of a series is identified by episode number and has a title and a length. No episode can exist without a corresponding television series. Also each episode has only one writer. A writer is identified by name, and also has birth date and a literary agency that represents him or her. An actor appears as a performer in a television series or a guest star on an episode. An actor is identified by name and also has a nationality and birth date. An actor plays a particular character in a television series or episode.

Keys in E/R Diagrams • Every entity must have a key name price Product category

Attributes and Keys • Key attributes must be unique for each entity • Keys are used to identify particular entities • Partial keys are only partially unique – used for weak entity types Age SSN Date Attribute Key Attribute Partial Key Attribute

Keys • A super key of an entity set is a set of one or more attributes whose values uniquely determine each entity. • A candidate key of an entity set is a minimal super key – ID is candidate key of instructor – course_id is candidate key of course • Although several candidate keys may exist, one of the candidate keys is selected to be the primary key.

Keys in E/R Diagrams name Underline: category price No formal way to specify multiple keys in E/R diagrams Product Person address name ssn

Entity With Composite, Multivalued, and Derived Attributes

E-R Diagrams n Rectangles represent entity sets. n Diamonds represent relationship sets. n Attributes listed inside entity rectangle n Underline indicates primary key attributes

name category name price makes Company Product stockprice buys employs Person address name ssn

Relationship Sets with Attributes

Entity Set to Relation name category price Product(name, category, price) name category price gizmo gadgets $19. 99

Classroom Design Exercise Imagine we are creating a database for a dorm, which includes a cooperative kitchen. • We want to record certain information about each resident. What? • Not all residents belong to the kitchen coop. Those that do interact in various ways: 1. They take turns at various jobs: preparer, cleanup, buyer (for supplies). No one should have two jobs on one day. 2. They may or may not be vegetarian. Each meal must have at least one vegetarian entreé. 3. They pay fees to the coop. • For each meal, there is a menu. Each menu item requires certain ingredients, which must be on hand.

Constraints in E/R Diagrams Finding constraints is part of the modeling process. Commonly used constraints: Keys: social security number uniquely identifies a person. Single-value constraints: a person can have only one father. Referential integrity constraints: if you work for a company, it must exist in the database. Other constraints: peoples’ ages are between 0 and 150.

Other Constraints <100 Product makes What does this mean ? Company

Entity Types • Entity types boxes Weak entity type double box EMPLOYEE DEPENDENT Entity Weak Entity

Weak Entity Sets Entity sets are weak when their key comes from other classes to which they are related. affiliation Team sport number University name

Weak Entity Sets Sometimes an E. S. E ’s key comes not (completely) from its own attributes, but from the keys of one or more E. S. ’s to which E is linked by a supporting many-one relationship. • Called a weak E. S. • Represented by putting double rectangle around E and a double diamond around each supporting relationship. • Many-one-ness of supporting relationship (includes 1 -1) essential. – With many-many, we wouldn't know which entity provided the key value. • “Exactly one” also essential, or else we might not be able to extract key attributes by following the supporting relationship.

Example: Logins (Email Addresses) Login name = user name + host name, e. g. , ark@soe. ucsc. edu. • A “login” entity corresponds to a user name on a particular host, but the passwd table doesn’t record the host, just the user name, e. g. , ark. • Key for a login = the user name at the host (which is unique for that host only) + the IP address of the host (which is unique globally). name Logins name @ Hosts

Example Schema DEPENDENT Dependent Of Age EID Nam e EMPLOYEE Works. O n Phone PROJECT Start. Dat e Nam e Budge t

Design Principles Setting: client has (possibly vague) idea of what he/she wants. You must design a database that represents these thoughts and only these thoughts. Avoid redundancy = saying the same thing more than once. • Wastes space and encourages inconsistency. Example Good: name product name Manf. By addr Manfs

Example Bad: repeats manufacturer address for each beer they manufacture. name manf product Manf addr Bad: manufacturer’s name said twice. name manf product name Manf. By addr Manfs

Entity Sets Vs. Attributes You may be unsure which concepts are worthy of being entity sets, and which are handled more simply as attributes. • Especially tricky for the class design project, since there is a temptation to create needless entity sets to make project “larger. ” name Wrong: Cars Right: Manf. By name manf Cars Manfs

Intuitive Rule for E. S. Vs. Attribute Make an entity set only if it either: 1. Is more than a name of something; i. e. , it has nonkey attributes or relationships with a number of different entity sets, or 2. Is the “many” in a many-one relationship.

Don't Overuse Weak E. S. • • • There is a tendency to feel that no E. S. has its entities uniquely determined without following some relationships. However, in practice, we almost always create unique ID's to compensate: social-security numbers, VIN's, etc. The only times weak E. S. 's seem necessary are when: a) We can't easily create such ID's; e. g. , no one is going to accept a “species ID” as part of the standard nomenclature (species is a weak E. S. supported by membership in a genus). b) There is no global authority to create them, e. g. , crews and studios.

Notation Summary




Equivalent Schema defined in UML

Design an ER schema for the following enterprise:


University Example • • • A college contains many departments Each department can offer any number of courses Many instructors can work in a department An instructor can work only in one department For each department there is a Head An instructor can be head of only one department Each instructor can take any number of courses A course can be taken by only one instructor A student can enroll for any number of courses Each course can have any number of students

Steps in ER Modeling Step 1: Identify the Entities • DEPARTMENT • STUDENT • COURSE • INSTRUCTOR

Steps in ER Modeling Step 2: Find the relationships • One course is enrolled by multiple students and one student enrolls for multiple courses, hence the cardinality between course and student is Many to Many. • The department offers many courses and each course belongs to only one department, hence the cardinality between department and course is One to Many. • One department has multiple instructors and one instructor belongs to one and only one department , hence the cardinality between department and instructor is one to Many. • Each department there is a “Head of department” and one instructor is “Head of department “, hence the cardinality is one to one. • One course is taught by only one instructor, but the instructor teaches many courses, hence the cardinality between course and instructor is many to one.

Steps in ER Modeling Step 3: Identify the key attributes • Deptname is the key attribute for the Entity “Department”, as it identifies the Department uniquely. • Course# (Course. Id) is the key attribute for “Course” Entity. • Student# (Student Number) is the key attribute for “Student” Entity. • Instructor Name is the key attribute for “Instructor” Entity. Step 4: Identify other relevant attributes • • For the department entity, the relevant attribute is location For course entity, course name, duration, prerequisite For instructor entity, room#, telephone# For student entity, student name, date of birth

Steps in ER Modeling • Step 5: Draw complete E-R diagram with all attributes including Primary Key

E-R Diagram for a University Enterprise

Case Study 2 • Design a DB representing cities, counties, and states in the US: – For states, record the name, population, and state capital (a city). – For counties, record the name, the population, and the located state. – For cities, record the name, the population, the located state and the located county. • Uniqueness assumptions: – Names of states are unique. – Names of counties are unique within a state (e. g. , 26 states have Washington Counties). – Cities are unique only within a state (e. g. , there are 24 Springfields among the 50 states). – Some counties and cities have the same name, even within a state (e. g. , Los Angeles). – All cities are located within a single county 70

Design 1: bad Co. Popu. cities Co. name Popu. Located states name Ci. Popu. Ci. name capita l Problem: County Population is repeated for each city. 71

Design 2: good Co. Popu. Co. name counties Popu. Located name states Belongs-to cities Ci. Popu. capitals Ci. name 72

Case Study 3 • Design a DB consistent with the following facts. – Trains are either local trains or express trains, but never both. – A train has a unique number and an engineer. – Stations are either express stops or local stops, but never both. – A station has a unique name and an address. – All local trains stop at all stations. – Express trains stop only at express stations. – For each train and each station the train stops at, there is a time. 73

Design 1: bad number type time name addr trains engineer Stops. At stations type Problem: does not capture the constraints that express trains only stop only at express stations and local trains stop at all local stations 74

Design 2: good Lname local stations Lnumber Enumber engineer express trains engineer local trains time Stops. At 2 time Stops. At 1 express stations Ename address

Case Study 4 (Pine Valley Furniture Company • 1. The company sells a number of different furniture products. • These products are grouped into several product lines. • The identifier for a product is Product_ID, while the identifier for a product line is Product_Line_ID. • Referring to the customer invoice, we identify the following additional attributes for product: Product_Description, Product_Finish, and Unit_Price. • Another attribute for product line is Product_Line_Name. • A product line may group any number of products, but must group at least one product. • Each product must belong to exactly one product line.

Case Study 4 • 2. Customers submit orders for products The identifier for an order is Order-ID, and another attribute is Order_Date. • A customer may submit any number of orders, but need not submit any orders. • Each order is submitted by exactly one customer. • The identifier for a customer is Customer_ID. • Other attributes include Customer_Name and Customer_Address.

Case Study 4 • 3. A given customer order must request at least one product. • Any product sold by Pine Valley Furniture may not be requested on any order, or may be requested on one or more orders. • An attribute associated with each order and product is Quantity, which is the number of units requested.

Case Study 4 • 4. Pine Valley Furniture has established sales territories for its customers. • Each customer does business in one or more of these sales territories. • The identifier for a sales territory is Territory_ID. • A sales territory may have any number of customers, or may not have any customers doing business.

Case Study 4 • 5. Pine Valley Furniture Company has several salespersons. The identifier for a salesperson is Salesperson_ID. • Other attributes include Salesperson_Name, Salesperson_Telephone, and Salesperson_Fax. • A salesperson serves exactly one sales territory. • Each sales territory is served by one or more salespersons.

Case Study 4 • 6. Each product is assembled from one or more raw materials. • The identifier for the raw material entity is Material_ID. • Other attributes include Unit_of_Measure and Unit_Price. • Each raw material may be assembled into one or more products.

Case Study 4 • 7. Raw materials are supplied by vendors. • The identifier for a vendor is Vendor_ID. • Other attributes include Vendor_Name and Vendor_Address. • Each raw material can be supplied by one or more vendors. • A vendor may supply any number of raw materials, or may not supply any raw materials to Pine Valley Furniture. • An attribute of the relationship between vendor and raw material is Unit_Price

Case Study 4 • 8. Pine Valley Furniture has established a number of work centers. • The identifier for a work center is Work_Center_ID. • Another attribute is Location. Each product is produced in one or more work centers. • A work center may be used to produce any number of products, or may not be used to produce any products.

Case Study 4 • 9. The company has over 100 employees. • The identifier for employee is Employee_ID. • Other attributes are Employee_Name, Employee. Address, and Skill. • An employee may have more than one skill. And Each skill can be mastered by many employees or none.

Case Study 4 • 10: Each employee works in one or more work centers. • A work center must have at least one employee working in that center, but may have any number of employees.

Case Study 4 • 11. Each employee has exactly one supervisor. • An employee who is a supervisor may supervise any number of employees, but not all employees are supervisors.

Library Case Study • When a library first receives a book from a publisher it is sent, together with the accompanying delivery note, to the library desk. Here the delivery note is checked against a file of books ordered. • If no order can be found to match the note, a letter of enquiry is sent to the publishers. If a matching order is found, a catalogue note is prepared from the details on the validated delivery note. • The catalogue note, together with the book, is sent to the registration department. The validated delivery note is sent to the accounts department where it is stored. • On receipt of an invoice from the publisher, the accounts department checks its store of delivery notes. If the corresponding delivery note is found then an instruction to pay the publishers is made, and subsequently a cheque is sent. If no corresponding delivery note is found, the invoice

Conference centre booking system A conference centre takes bookings from clients who wish to hold courses or conferences at the centre. When clients make bookings they specify how many people are included in the booking, and of these, how many will be resident during the booking, and how many will require catered or non-catered accommodation at the centre. The centre contains a number of facilities which may be required by clients making bookings as follows: A. There are 400 bedrooms for clients who will be resident during the Course or conference. B. A maximum of 250 catered people can be handled at any one time. C. Six main lecture theatres providing seating for 200 people. D. Twenty seminar rooms each able to accommodate 25 people. E. Video conference facilities. The video conference facilities consist of four separate video conference networks. Each video conference network has a large screen based in one of the main lecture theatres, along with 3 satellite screens each of which is based

Shipping company example • The London and Ireland Shipping Company PLC (LISC) was founded in 1852 and owns a fleet of cargo ships. The company had historically run passenger liners, but recent policy decisions involved the sale of all passenger-carrying vessels. The company currently has 14 vessels, including one oil tanker and one tugboat operating out of Liverpool. Most of the vessels are registered in Liberia for tax reasons. • Each ship has one or more holds divided into spaces. The holds are defined by steel bulkheads and the spaces are defined by shelf racks or other physical dividers. Sister ships, built by the same shipbuilders and to the same designs have similar names, such as Pride of Ireland, Queen of Ireland, Song of Ireland Warrior of Ireland. Sister ships also have identical cargo storage facilities. • LISC issues contracts to agents for one or more manifests (lists of cargo items to be shipped). LISC's charges for cargo carried are based on the number of spaces the cargo requires for storage. The types of cargo typically carried by LISC include grain, coal and ores (carried only in ships equipped with bulk cargo holds). They also transport sacked grain, heavy cases, containers (which may be carried on deck), pallets and so on.

Shipping company example • Cargo items may take up less than one space in a hold, or one or more spaces, depending on the size of the item. A space may therefore contain several small cargo items. • The ships owned by LISC are kept as busy and as full as possible, in order to maximise the profits that each vessel makes and minimise running & operating costs. LISC's ships ply most of the seas of the world, but tend to operate mainly in the Mediterranean, the North and Mid Atlantic and the Indian Ocean. Different ships require different crew complements. • LISC intends to create a computer based information system that will be able to perform the following tasks: • • record the voyages of each ship with the start and end ports. • • record the cargo held by a ship on each voyage • • keep records of their employees and the ships they are assigned to • • producing invoices for agents and customers • • keep a record of customers' payments on invoices • • analyse the efficiency of use of cargo space and of percentage wasted cargo space for ships voyages

Film Club Case Study: Film Club UK is a company that owns or leases a number of small cinemas in the UK. They have commissioned a database designer to design a database solution to enable them to maintain details about their cinemas and the films that they show. Note that it is possible to have two cinemas in the same location with the same name (there used to be two Odeons in Newcastle). It is also possible to have different films with the same title (for example, different versions of a Shakespeare play). Films are scheduled for one or more showings at a cinema within a ‘season’. Season details are to be notified in advance of the dates and times of showings, takings, etc. to be notified later. Any one film may have more than one season at any one
- Slides: 91