Lecture 2 th October 5 2000 Conceptual Modeling

Lecture #2 th October 5 , 2000 Conceptual Modeling • Administration: – HW 1 available – Details on projects – Exam date – XML comment

Building an Application with a Database System • Requirements modeling (conceptual, pictures) – Decide what entities should be part of the application and how they should be linked. • Schema design and implementation – Decide on a set of tables, attributes. – Define the tables in the database system. – Populate database (insert tuples). • Write application programs using the DBMS – way easier now that the data management is taken care of.

Database Design • Why do we need it? – Agree on structure of the database before deciding on a particular implementation. • Consider issues such as: – What entities to model – How entities are related – What constraints exist in the domain – How to achieve good designs

Database Design Formalisms • Object Definition Language (ODL): – Closer in spirit to object-oriented models • Entity/Relationship model (E/R): – More relational in nature. • Both can be translated (semi-automatically) to relational schemas (with varying amount of pain). • ODL to OO-schema: direct transformation (C++ or Smalltalk based system).

Outline • • • ODL (rather briefly) E/R diagrams Some high-level design principles Modeling constraints Introduction to the relational model From E/R & ODL to relations

Object Definition Language • Is part of ODMG, which also gave us OQL. • Resembles C++ (and Smalltalk). • Basic design paradigm in ODL: – Model objects and their properties. • For abstraction purposes: – Group objects into classes. • What qualifies as a good class? – Objects should have common properties.

ODL Class Declarations Interface <name> { attributes: <type> <name>; relationships <range type> <name>; methods } Method example: float gpa(in: Student) raises (no. Grades) Arbitrary function can compute the value of gpa, based on a student object given as input.

ODL Example category price name Product Company Person name address ssn stockprice

ODL Declarations Interface Product { attribute string name; attribute float price; attribute enum Categories {electronics, communications, sports …} category } Interface Company { attribute string name; attribute float stockprice; } Interface Person { attribute integer ssn; attribute string name; attribute Struct Address {string street, string city} address; }

ODL Example Extended category price name Product made. By buys name Company Person works. For name address ssn stockprice

ODL Declarations, Extended Interface Product { attribute string name; attribute float price; attribute enum Categories {electronics, communications, sports …} category; relationship <Company> made. By; } Interface Person { attribute integer ssn; attribute string name; attribute Struct Address {string street, string city} address; relationship set <Product> buys; relationship set <Company> works. For; }

ODL Example, Extended Again category price name Product buys name makes employs Person made. By Company works. For name address ssn stockprice

ODL Declarations, Extended Again Interface Company { attribute string name; attribute float stockprice; relationship set <Product> makes inverse Product: : made. By; relationship set <Person> employs inverse Person: : works. For; }

Types in ODL Basic types: Atomic types (e. g. , string, integer, …) Interface types (e. g. , Person, Product, Company) Constructors: Set: (1, 5, 6) Bag: (1, 1, 5, 6, 6 ) List: (1, 5, 6, 1, 6 ) Array: Integer[17] Struct: {string street, string city, integer zipcode}

Allowable Types in ODL For attributes: start with atomic or struct, and apply a collection type. OK: string, set of integer, bag of Address. Not OK: Product, set of integer. For relationships: start with interface type and apply a collection type. OK: Product, set of Product, list of Person. Not OK: struct {pname Product, cname Company} set of bag of Product integer

Entity / Relationship Diagrams Objects Classes entities entity sets Attributes are like in ODL. Relationships: like in ODL except Product address buys - not associated with classes (I. e. , first class citizens) - not necessarily binary

name category name price makes Company Product stockprice buys employs Person address name ssn

Multi-way Relationships How do we model a purchase relationship between buyers, products and stores? Product Purchase Person Store

Roles in Relationships What if we need an entity set twice in one relationship? Product Purchase buyer salesperson Person Store

Roles in Relationships Product Note the multiplicity of the relationships: we cannot express all possibilities Purchase buyer salesperson Person Store

Attributes on Relationships date Product Purchase Person Store

Design Principles What’s wrong? Product Country Moral: be faithful! Purchase President Person

What’s Wrong? date Product Purchase Store Moral: don’t talk too much. weather person

What’s Wrong? date Dates Product Purchase Moral: don’t complicate life more than it already is. Person Store

Do we really need 3 -way relationships? Product. Of Product Store. Of Store Buyer. Of Person Purchase Moral: Find a nice way to say things.

Modeling Subclasses The world is not flat! Some objects in a class may have properties not shared by other members: Products Software products Educational products So --- we define subclasses (in ODL and in E/R).

Subclasses in ODL Interface Software. Product: Product{ attribute Set<string> platform; attribute Set<integer> required. Memory; } Interface Educational. Product: Product{ attribute Struct Interval {integer begin, integer end} age. Group; attribute string topic } The two classes also inherit all the properties of Product.

Subclasses in E/R Diagrams name category price Product isa Software Product platforms isa Educational Product Age Group

Multiple Inheritance Platforms required memory Product age. Group topic Software Product Educational Product Educ-software Product Educational-method

How do we resolve conflicts? Platforms required memory Product age. Group topic Software Product Rating (ASA) Rating? Rating (ATA) Educational Product Educ-software Product Educational-method

In ODL: Every object belongs to a single class name category price In E/R: An entity may be spread out in multiple sets. Product isa Software Product platforms isa Educational Product Age Group

Modeling Constraints Extracting constraints is what modeling is all about. But how do we express them? Examples: Keys: social security number uniquely identifies a person. Single-value constraints: a person can have only one father. Referential integrity constraints: if you work for a company, it must exist in the database. Domain constraints: peoples’ ages are between 0 and 150. Why are these constraints useful in the implementation?

Keys A set of attributes that uniquely identify an object or entity: Person: social security number name + address + age Perfect keys are often hard to find, so organizations usually invent something. An object may have multiple keys: employee number, social-security number

Keys in ODL Interface Person (key ssn) { properties… } Defining multiple keys: (key ssn employe. ID (name address age))

Keys in E/R Diagrams name category price No formal way to specify multiple keys in E/R diagrams Product Person address name ssn

Single Value Constraints An entity (or object) may have at most one value for a given attribute or relationship. Person: name, social-security number Company: stock price How do we do this in ODL? In E/R, every attribute has at most one value. Arrows tell us about multiplicity of relations. If we have a single-valued constraint, we can either: 1. Require that the value exist (see referential integrity shortly) 2. Allow null values.

Referential Integrity Constraints A relationship has one value and the value must exist. Example: Product made. By Company: company must exist. How do we enforce referential integrity constraints? (otherwise, we get dangling pointers) - forbid to delete a reference object, or - delete the objects that reference an object we’re deleting. In E/R diagrams: Product makes Company

Weak Entity Sets Entity sets are weak when their key attributes come from other classes to which they are related. This happens if: - part-of hierarchies - splitting n-ary relations to binary. affiliation Team sport number University name

The Relational Data Model Database Model (ODL, E/R) ODL definitions Diagrams (E/R) Relational Schema Tables: row names: attributes rows: tuples Physical storage Complex file organization and index structures.

Terminology Attribute names tuples Name Price Category Manufacturer gizmo $19. 99 gadgets Gizmo. Works Power gizmo $29. 99 gadgets Gizmo. Works Single. Touch $149. 99 photography Canon Multi. Touch $203. 99 household Hitachi What can’t you say in the relational model?

More Terminology Every attribute has an atomic type. Relation Schema: relation name + attribute names + attribute types Relation instance: a set of tuples. Only one copy of any tuple! Database Schema: a set of relation schemas. Database instance: a relation instance for every relation in the schema.

More on Tuples Formally, a mapping from attribute names to (correctly typed) values: name price category manufacturer gizmo $19. 99 gadgets Gizmo. Works Sometimes we refer to a tuple by itself: (note order of attributes) (gizmo, $19. 99, gadgets, Gizmo. Works) or Product (gizmo, $19. 99, gadgets, Gizmo. Works).

Updates The database maintains a current database state. Updates to the data: 1) add a tuple 2) delete a tuple 3) modify an attribute in a tuple Updates to the data happen very frequently. Updates to the schema: relatively rare. Rather painful. Why?

From ODL to Relational Schema Start simple: a class definition has only single valued attributes Interface product{ float price; string name; Enum {telephony, gadgets, books} category} Class becomes a relation, and every attribute becomes a relation attribute: Product Name Price Category Gizmo $19. 99 gadgets

Adding Non atomic Attributes Price is a record: {string currency, float amount} Product Name Currency Amount Category Gizmo US$ 19. 99 gadgets Power Gizmo US$ 29. 99 gadgets

Set Attributes One option: have a tuple for every value in the set: Name Fred Joe Disadvantages? SSN 123 -321 -99 909 -438 -44 Phone Number (201) (206) (908) (212) 555 -1234 572 -4312 464 -0028 555 -4000

Modeling Collection Types The problem becomes even more significant if a class has several attributes that are set types? Question: how bad is the redundancy for n set type attributes, each with possibly up to m values? Questions: How can we model bags? Lists? Fixed length arrays?

Modeling Relationships Interface Product { attribute string name; attribute float price; relationship <Company> made. By; } Interface Company { attribute string name; attribute float stock-price; attribute string address; } How do we incorporate the relationship made. By into the schema?

Option #1 Name Price made-by-name made-by-stock-price made-by-address Gizmo $19. 99 gizmo. Works What’s wrong? 0. 0001$ Montezuma

Hint Interface Product { attribute string name; attribute float price; relationship <Company> made. By; } Interface Company { attribute string name; attribute float stock-price; attribute string address; relationship set <Product> makes; }

Better Solution Product relation: Name Gizmo (assume: name is a key for company) Price $19. 99 made-by-name gizmo. Works Company relation: Name Stock Price Address Gizmo $0. 00001 Montezuma

Additional Issues 1. What if there is no key? 2. What if the relationship is multi-valued? 3. How do we represent a relationship and its inverse?

From E/R Diagrams to Relational Schema Easier than ODL - relationships are already independent entities - only atomic types exist in the E/R model. Entity sets relations Relationships relations Special care for weak entity sets.

name category name price makes Company Product Stock price buys employs Person address name ssn

Entity Sets to Relations name category price Product: Name Category gizmo gadgets Price $19. 99

Relationships to Relations name Start Year category makes name Company Product Stock price Relation Makes (watch out for attribute name conflicts) Product-name gizmo Product-Category Company-name gadgets gizmo. Works Starting-year 1963

Handling Weak Entity Sets affiliation Team sport number University name Relation Team: Sport mud wrestling Number 15 University-name Montezuma State U. - need all the attributes that contribute to the key of Team - don’t need a separate relation for Affiliation.

Modeling Subclass Structure Platforms required memory Product age. Group topic Software Product Educational Product Educ-software Product Educational-method

Option #1: the ODL Approach 4 tables: each object can only belong to a single class Product(name, price, category, manufacturer) Educational. Product( name, price, category, manufacturer, age. Group, topic) Software. Product( name, price, category, manufacturer, platforms, required. Memory) Educational. Software. Product( name, price, category, manufacturer, age. Group, topic, platforms, required. Memory)

Option #2: the E/R Approach Product(name, price, category, manufacturer) Educational. Product( name, age. Group, topic) Software. Product( name, platforms, required. Memory) No need for a relation Educational. Software. Product Unless, it has a specialized attribute: Educational. Software. Product(name, educational-method)

Option #3: The Null Value Approach Have one table: Product ( name, price, manufacturer, age-group, topic, platforms, required-memory, educational-method) Some values in the table will be NULL, meaning that the attribute not make sense for the specific product. How many more meanings will NULL have? ?
- Slides: 61