Unit 1 Introduction to DBMS Database Management Systems
Unit 1 Introduction to DBMS (Database Management Systems) Application program DBMS End-user 1 -1
Outline of Unit 1 1. 1 Information Systems 1. 2 An Overview of a Database System 1. 3 An Architecture for a Database System 1. 4 Database Technology Trends Wei-Pang Yang, Information Management, NDHU 2
1. 1 Information Systems 1 -3
Stages of Information System § Stage 0: Manual Information System • Records • Files • Index Cards § Stage 1: Sequential Information Systems • Tapes • Files • slow, non-interactive, redundancy, . . § Stage 2: File Based Information Systems • Disk (direct access) • application program has its own file • data redundancy data dependence § Stage 3: DBMS based Information Systems • Generalized data management software • Transaction processing Wei-Pang Yang, Information Management, NDHU 4
File Based Information Systems § Conventional Data Processing techniques: Enterprise: Application System A Application System B Application Program A Application Program B File System A Customer File System B Invoice File System N Inventory … Application System N Wei-Pang Yang, Information Management, NDHU Application Program N 5
File Based Information Systems (cont. ) Customer No. Customer Name Customer Addr. Social Security ID Inventory Part No. Part Description Unit Price Supplier Quantities Remain Quantities Ordered Wei-Pang Yang, Information Management, NDHU Invoice Customer No. Customer Name Part No. Quantities Unit Price Parts Part No. Part Description Supplier Quantities Ordered Customer Name Unit Price 6
User A 1 User A 2 User B 1 User B 2 User B 3 Host Language + DSL Host Language + DSL C, Pascal DSL (Data Sub Language) e. g. SQL 1 2 3 External View @ # & External schema A External/conceptual mapping A Conceptual schema < External schema B External View B External/conceptual mapping B Conceptual View Database management system dictionary (DBMS) e. g. system catalog Conceptual/internal mapping DBA (Build and maintain schemas and mappings) Storage structure definition (Internal schema) Wei-Pang Yang, Information Management, NDHU 1 2 Stored database (Internal View) 3 # . . . 100 & @ 7
DBMS based Information Systems: Basic Approach - Integration § (1) Integration of Information • Description of the integrated view of data is the "Conceptual Schema" of the database DBMS Application program Wei-Pang Yang, Information Management, NDHU End-user 8
DBMS based Information Systems: Basic Approach – Simple views and High level language § (2) Provide simple views (External Schema) and high level language (e. g. SQL) for users to manipulate (handle) data • High level language: e. g. SQL (Structured Query Language) <e. g. >: SELECT SNAME FROM S WHERE S#= 'S 4'; • Description of user's view of data is the "external schema" or "subschema" or "view". • High-level languages (Query Language): SQL (1) Data Definition Language: define format (2) Data Manipulation Language: retrieve, insert, delete, update • Emphasize: EASE OF USE !! Wei-Pang Yang, Information Management, NDHU S S# name 9
DBMS based Information Systems: Basic Approach - Storage/Access Method § (3) Efficient Storage/Access Techniques: • implemented once rather than duplicated in all application programs. User: query in SQL Language Processor DBMS Access Methods Calls Access Method (B+ tree, Dynamic Hashing) I/O calls Wei-Pang Yang, Information Management, NDHU 10
DBMS based Information Systems: Basic Approach - Transaction Management § (4) Provide Transaction Management: • Concurrency Control • Recovery • Security • . : Wei-Pang Yang, Information Management, NDHU 11
Example: A Simple Query Processing Query in SQL: SELECT CUSTOMER. NAME FROM CUSTOMER, INVOICE WHERE REGION = 'N. Y. ' AND AMOUNT > 10000 AND CUTOMER. C#=INVOICE. C DBMS Language Processor Internal Form : ( (S SP) Optimizer Operator : SCAN C using region index, create C SCAN I using amount index, create I SORT C? and I? on C# JOIN C? and I? on C# EXTRACT name field Language Processor Operator Processor Calls to Access Method: OPEN SCAN on C with region index GET next tuple. . . Calls to file system: GET 10 th to 25 th bytes from block #6 of file #5 Access Method e. g. B-tree; Index; Hashing Access Method File System 12 Wei-Pang Yang, Information Management, NDHU database
1. 2 An Overview of a Database System 1 -13
An Example q The Wine Cellar Database: Cellar: Bin 2 3 6 12 16 43 50 51 q Wine Chardonnay Jo. Riesling Cab. Sauvignon Pinot Noir Producer Year Bottle Ready Buena Vista 83 1 85 Louis Martini 81 5 84 Chappellet 82 4 85 Buena Vista 82 1 83 Sattui 82 1 83 Robt. Mondavi 77 12 87 Mirassou 77 3 85 Ch. St. Jean 78 2 86 Comments Thanksgiving Late Harvest Very dry Harvest Retrieval: • DML (Data Manipulation Language): SELECT Wine, Bin, Producer FROM Cellar WHERE Ready = 85; • Result: Wine Bin Chardonnay 2 Chardonnay 6 Pinot Noir 50 Wei-Pang Yang, Information Management, NDHU Producer Buena Vista Chappellet Mirassou 14
An Example (cont. ) q Deletion: • DML: DELETE FROM Cellar WHERE Ready < 86; • Result: Bin Wine 43 Cab. Sauvignon 51 Pinot Noir q Producer Robt. Mondavi Ch. St. Jean Year Bottle Ready Comments 77 78 12 2 87 86 Insertion: • DML: INSERT INTO Cellar VALUES (53, 'Pinot Noir', 'Franciscan', 79, 1, 86, 'for Joan'); • Result: Bin Wine 43 Cab. Sauvignon 51 Pinot Noir 53 Pinot Noir Wei-Pang Yang, Information Management, NDHU Producer Robt. Mondavi Ch. St. Jean Franciscan Year Bottle Ready 77 78 79 12 2 1 87 86 86 Comments for Joan 15
An Example (cont. ) q Update • DML: UPDATE Cellar SET Bottles = 4 WHERE Bin = 51; • Result: Bin Wine 43 Cab. Sauvignon 51 Pinot Noir 53 Pinot Noir Wei-Pang Yang, Information Management, NDHU Producer Robt. Mondavi Ch. St. Jean Franciscan Year 77 78 79 Bottle Ready Comments 12 4 1 87 86 86 for Joan 16
What is a Database System? DBMS Application program End-user Wei-Pang Yang, Information Management, NDHU 17
What is a Database System? (cont. ) q Major components of a database system: • • Data: integrated and shared. Hardware: disk, CPU, Main Memory, . . . Software: DBMS Users: 1. Application programmers 2. End users 3. Database administrator (DBA) • Defining external schema • Defining conceptual schema • Defining internal schema • Liaison with users • Defining security and integrity checks • Defining backup and recovery procedures • Monitoring performance and changing requirements Wei-Pang Yang, Information Management, NDHU 18
Why Database ? q Redundancy can be reduced q Inconsistency can be avoided q The data can be shared q Standards can be enforced q Security restrictions can be applied q Integrity can be maintained q Provision of data independence objective ! Wei-Pang Yang, Information Management, NDHU 19
Data Independence q q Application Program Data Structure Immunity of application to change in storage structure and access strategy. Wei-Pang Yang, Information Management, NDHU 20
Data Dependence vs. Data Independence q S Data Dependent S# e. g. SELECT CITY S 1 S 2 : Sn FROM S WHERE ITEM = 'X'; • Linked list: S Top TOP s 1 sn s 2 if item = TOP��. item then. . • Tree: . if item < root. data then root : = root�. left. . • Array: if A[I] = item then. . . • Storage structure changed program changed Wei-Pang Yang, Information Management, NDHU 21
1. 3 An Architecture for a Database System 1 -22
User A 1 User A 2 User B 1 User B 2 User B 3 Host Language + DSL Host Language + DSL (Data Sub Language) e. g. SQL, QUEL External View @ #A & External schema A External/conceptual mapping A Conceptual schema < External View B External/conceptual mapping B Conceptual @ View#A & Database management system (DBMS) Conceptual/internal mapping DBA (Build and maintain schemas and mappings) External schema B Storage structure definition (Internal schema) Stored database (Internal View) Wei-Pang Yang, Information Management, NDHU # & @ 23
An Example of the Three Levels q Internal level: STORED_EMP PREFIX EMP# DEPT# PAY q length = 18 TYPE = BYTE(6), OFFSET = 0, INDEX = EMPX TYPE = BYTE(6), OFFSET=0, TYPE = BYTE(4), OFFSET = 12 TYPE = FULLWORD, OFFSET = 16 Conceptual level: EMPLOYEE_NUMBER DEPARTMENT_NUMBER SALARY q CHARACTER(6) CHARACTER(4) NUMERIC(5) External level: • ( PL /I ) DCL 1 EMP 2 EMP# 2 SAL CHAR(6) FIXED BIN(31) • (COBOL) 01 EMPC 02 EMPNO PIC X(4) 02 DEPTNO PIC X(4) Wei-Pang Yang, Information Management, NDHU 24
Functions of the DBMS q Data Definition Language (DDL) q Data Manipulation Language (DML) q Data Security and Integrity q Data Recovery and Concurrency q Data Dictionary q Performance Wei-Pang Yang, Information Management, NDHU 25
1. 4 Database Technology Trends 1960 s to Mid-1970 s to Mid-1980 s Late 1980 s Future Merging data models, knowledge representation, and programming languages Network Hierarchical Relational Semantic Object-oriented Logic Database Hardware Mainframes Minis PCs Faster PCs Workstations Database machines User Interface None Forms Query languages - SQL, QUEL Graphics Menus Query-by-forms Natural language Speech input Program Interface Procedural 4 GL Logic programming Integrated database and programming language Presentation and display processing Reports Processing data Business graphics Image output Knowledge processing Generalized display managers Distributed knowledge processing Data Model Wei-Pang Yang, Information Management, NDHU Embedded query language Report generators Information and transaction processing Parallel processing Optical memories 26
Distributed Databases Database q Distributed System Distributed database is a database that is not stored in its entirety at a single physical location, but rather is spread across a network of computer. < e. g. > Hsinchu computers Taipei Kaoshiung communication links Taichung Wei-Pang Yang, Information Management, NDHU 27
Distributed Databases (cont. ) q q q Advantages: § efficiency of local processing § data sharing Disadvantages: § communication overhead § implementation difficulties Reference: S. Ceri and G. Pelagatti "Distributed Databases: principles and systems" Wei-Pang Yang, Information Management, NDHU 28
Multi-Database/Heterogeneous Database Multidatabase IMS Hierarchical Model • semantic inconsistency INGRES Relational Model . . . ORION Object. Oriented Model • data incompleteness • global schema Wei-Pang Yang, Information Management, NDHU 29
DB + AI Query Database AI DBMS Language Processor Query Optimizer Knowledge Base Operator Processor Access Method File Manager Distributed DB design Knowledge Base Wei-Pang Yang, Information Management, NDHU Logical DB design Knowledge Base 30
KBMS Database Logic § A Combined Model : Logic Programming + Relational DB § Three layers : User Program Knowledge management program IDB relational interface Query : ? : - ancestor (taro, Y) ? : - grandfather (? , c) IDB: ancestor(X, Y): - parent(X, Y) ancestor(X, Y) : - parent(X, Z), ancestor(Z, Y) parent(X, Y): -edb(father(X, Y)) parent(X, Y): -edb(mother(X, Y)) grandfather(X, Z): - father(X, Y) ^ father(Y, Z) father Relational DB management program EDB Wei-Pang Yang, Information Management, NDHU EDB: mother father son A B X Y. . . B . C 31
Database Object-Oriented OODB q A typical Document : MEMO [Woelk 86, SIGMOD] MCC To: W. Kim From: D. Woolk Date: September 18, 1992 Subject: Workstations can be heard In the computer center of National Chiao-Tung University, there a lot of workstations. There are HP RS serials, SUNs, Apollo, and so on. The students in NCTU learn to use workstation since they are freshmen. The configuration of the workstations follows: Workstation . �speaker voice message associated . . Workstation } text } } graphics Database Server In the course �introduction to Computer Science? students do their homework's on workstations. image Wei-Pang Yang, Information Management, NDHU 32
Use of a Database Management System in Design and Application Database Manufacturing Design Release Control Detailed Design Fabrication Assembly Info Analysis Models Preliminarydesign INTERFACE DBMS Database Management System Graphic Interface Test / Inspection Language Interface APPLICATION Design Analysis Design Verification Manufacturing Planning Wei-Pang Yang, Information Management, NDHU Evaluation Production Control Synthesis Release Project Management 33
Fuzzy Database Fuzzy Set Theory § Fuzzy Query <e. g. > SELECT FROM WHERE STUDENT. NAME STUDENT SEX = M� AND HEIGH = TALLER AND WEIGH = SLIMMER STUDENT: NAME SEX HEIGHT WEIGHT IQ Mary Linda. . . F F. . . <e. g. > Wei-Pang Yang, Information Management, NDHU 158 165. . . SELECT FROM WHERE 55 55. . . High Medium. . . STUDENT. NAME STUDENT IQ >= 130 34
More? AI DB 1993 VR DB ? DB 1997 1999 2001 ? 1995 DB ? 2004 ? ? ? SARS ? DB Mobile/video ? DNA/Bio. Info. DB WWW DB 911/ Anthrax ? 2003 ? Wei-Pang Yang, Information Management, NDHU DB 2006 ? ? 35
- Slides: 35