ITEC 3220 A Using and Designing Database Systems
ITEC 3220 A Using and Designing Database Systems Instructor: Gordon Turpin Course Website: http: //cse. yorku. ca/~gordon/itec 322 0 S 07 Office: CSEB 3020
Course Objective • Examine databases, trends in database management systems and their application in a wide range of organizational areas • Provide an overview of database processing, both historical and discussion of recent trends in database management • Provide the student with exposure to a range of tools, including a relational DBMS as well as an object-oriented DBMS 2
Textbooks • A bundle consisting of – Database Systems Design, Implementation, & Management, Seventh Edition - Peter Rob & Carlos Coronel – A Guide to SQL, Seventh Edition Philip J. Pratt 3
Marking Scheme • Final exam (closed book) - 50% Midterm (closed book) - 35% Assignments (3 assignments) - 15% • Lecture notes will be made available at through the course website. 4
Tentative Schedule • Week 1 Database concepts and the relational database model (Chapter 1, 2 & 3) • Week 2 Entity relationship model (Chapter 4) • Week 3 Normalization (Chapter 5) • Week 4 SQL (Chapter 7 & A guide to SQL) • Week 5 SQL + lab (Chapter 7 & A guide to SQL) • Week 6 Advanced SQL + lab (Chapter 8 & A guide to SQL) 5
Schedule (Cont’d) • Week 7 Midterm • Week 8 Database design & case study (Chapter 9) • Week 9 Transaction management and concurrent control (Chapter 10) • Week 10 Transaction management and concurrent control (Cont’d) and data warehousing (chapter 10 &13) • Week 11 Objected-Oriented database (Appendix G) • Week 12 TBA and review for final exam 6
Introduction Database Systems and Data Models
Basic Definition • Data: raw facts – Constitute building blocks of information • Information: is produced by processing data and reveals meaning of data – Good, timely, relevant information key to decision making – Good decision making key to organizational survival • Database: shared, integrated computer structure housing: – End user data – Metadata 8
An Example • Converting data to information 9
An Example (Cont’d) • Metadata 10
What is a Database Management System (DBMS) • A collection of programs that manages the database structure and controls access to the data stored in the database – Possible to share data among multiple applications or users • Example: bank and its ATM machines – Makes data management more efficient and effective • End users have better access to more and better-managed data 11
DBMS Manages Interaction 12
File and File System • Terminology – Data • Raw Facts – Field • Group of characters with specific meaning – Record • Logically connected fields that describe a person, place, or thing – File • Collection of related records 13
Example 14
Disadvantages of File Processing • Data Dependence – Change in file’s data characteristics requires modification of data access programs – Lengthy development time – Excessive program maintenance • Structural Dependence – Change in file structure requires modification of related programs 15
Example 16
Disadvantages of File Processing (Cont’d) • Data Redundancy – – Different and conflicting versions of same data Results of uncontrolled data redundancy • Data anomalies – Modification – Insertion – Deletion • Data inconsistency – Lack of data integrity 17
Solution: Database Approach • Database consists of logically related data stored in a single repository • Advantages of database approach – Structural and data independence – Minimal data redundancy • Reduces inconsistency, data anomalies – Improves data sharing and data quality – Stores data structures, relationships, and access paths 18
Database vs. File Systems 19
Database System Environment • Hardware: all the system's physical devices • Software – Operating system software – DBMS software – Application programs and utility software • People • Procedures • Data 20
Database Models • Collection of logical constructs used to represent data structure and relationships within the database – Conceptual models: logical nature of data representation – Implementation models: emphasis on how the data are represented in the database 21
Database Models: Historic Overview • • Flat files - 1960 s - 1980 s Hierarchical – 1970 s - 1990 s Network – 1970 s - 1990 s Relational – 1980 s - present Object-oriented – 1990 s - present Object-relational – 1990 s - present Data warehousing – 1980 s - present Web-enabled – 1990 s - present 22
Hierarchical Database Model • Logically represented by an upside down tree – Each parent can have many children – Each child has only one parent 23
Hierarchical Database Model (Cont’d) • Advantages – – Conceptual simplicity Database security and integrity Data independence Efficiency • Disadvantages – – – Complex implementation Difficult to manage and lack of standards Lacks structural independence Application programming and use complexity Implementation limitations 24
Network Database Model • Each record can have multiple parents – Composed of sets – Each set has owner record and member record – Member may have several owners 25
Network Database Model (Cont’d) • Advantages – – – Conceptual simplicity Handles more relationship types Data access flexibility Promotes database integrity Data independence Conformance to standards • Disadvantages – System complexity – Lack of structural independence 26
Relational Database Model • Perceived by user as a collection of tables for data storage • Tables are a series of row/column intersections • Tables related by sharing common entity characteristic(s) 27
Relational Database Model (Cont’d) 28
Relational Database Model (Cont’d) • Schema for the table – Graphical representation AGENT _CODE AGENT_ LNAME FNAME AGENT_ INITIAL AGENT_A REACODE AGENT_ PHONE – Text description AGENT(AGENT_CODE, AGENT_LNAME, AGENT_FNAME, AGENT_INITIAL, AGENT_AREACODE, AGETN_PHONE) 29
Relational Database Model (Cont’d) • Advantages – Structural independence – Improved conceptual simplicity – Easier database design, implementation, management, and use – Ad hoc query capability with SQL – Powerful database management system • Disadvantages – Substantial hardware and system software overhead – Poor design and implementation is made easy – May promote “islands of information” problems 30
Object-Oriented Database Model • Objects or abstractions of real-world entities are stored – Attributes describe properties – Collection of similar objects is a class • Methods represent real world actions of classes • Classes are organized in a class hierarchy – Inheritance is ability of object to inherit attributes and methods of classes above it 31
OO Data Model • Advantages – Adds semantic content – Visual presentation includes semantic content – Database integrity – Both structural and data independence • Disadvantages – – Lack of OODM Complex navigational data access Steep learning curve High system overhead slows transactions 32
Costs and Risks of the Database Approach • Up-front costs: – Installation Management Cost and Complexity – Conversion Costs • Ongoing Costs – Requires New, Specialized Personnel – Need for Explicit Backup and Recovery • Organizational Conflict 33
Review • Basic concepts: data, information, database, DBMS, file, conceptual model, implementation model, etc • Why database and its importance, cost and risk • Different database models – definition – advantage – disadvantage 34
Chapter 3 The Relational Database Model
In this chapter, you will learn: • Basic components of the relational database model – Entities and their attributes – Relationships among entities • Relational algebra • Relationship in relational database • Data redundancy 36
Basic Definition • Entities and Attributes – Entity is a person, place, event, or thing about which data is collected – Attributes are characteristics of the entity • Tables – Holds related entities or entity set – Also called relations – Comprised of rows and columns 37
Table Characteristics • • • Two-dimensional structure with rows and columns Rows (tuples) represent single entity Columns represent attributes Row/column intersection represents single value Tables must have an attribute to uniquely identify each row • Column values all have same data format • Each column has range of values called attribute domain • Order of the rows and columns is immaterial to the DBMS 38
Example Tables 39
Terminology for Relational Database Table. Oriented Set-oriented Record. Oriented Table Relation Record type Row Tuple Record Column Attribute Field 40
Key • Consists of one or more attributes that determine other attributes • Primary key (PK) is an attribute (or a combination of attributes) that uniquely identifies any given entity (row) • Key’s role is based on determination – If you know the value of attribute A, you can look up (determine) the value of attribute B 41
Keys (Cont’d) • Composite key – Composed of more than one attribute • Key attribute – Any attribute that is part of a key • Superkey – Any key that uniquely identifies each entity • Candidate key – A superkey without redundancies 42
Keys (Cont’d) • Foreign key (FK) – An attribute whose values match primary key values in the related table • Referential integrity – FK contains a value that refers to an existing valid tuple (row) in another relation • Secondary key – Key used strictly for data retrieval purposes 43
Simple Relational Database 44
Controlled Redundancy • Makes the relational database work • Tables within the database share common attributes that enable us to link tables together • Multiple occurrences of values in a table are not redundant when they are required to make the relationship work • Redundancy is unnecessary duplication of data 45
Integrity Rules 46
Integrity Rules (cont’d) 47
Exercises Table name: TRUCK Table name: BASE Table name: TYPE 48
Exercises (Cont’d) • For each table, identify the primary key and the foreign keys. • Do the tables exhibit entity integrity? Explain • So the tables exhibit referential integrity? Explain • Identify the TRUCK table’s candidate key (s). • For each table, identify a super key and a secondary key 49
- Slides: 49