Chapter 1 Introduction to DBMS Concepts IAM 6133
Chapter 1 Introduction to DBMS Concepts IAM 6133 Advanced Database Technology
Topic 1 Introduction to DBMS Concepts (Database Management Systems) Application program DBMS End-user 1 -2
Outline of Unit 1 1. 1 Information Systems 1. 2 An Overview of a Database System 1. 3 An Architecture for a Database System 1. 4 Database Technology Trends 1 -3
1. 1 Information Systems 1 -4
Stages of Information System – Stage 0: Manual Information System • Records • Files • Index Cards – Stage 1: Sequential Information Systems • Tapes • Files • slow, non-interactive, redundancy, . . – Stage 2: File Based Information Systems • Disk (direct access) • application program has its own file data dependence • data redundancy – Stage 3: DBMS based Information Systems • Generalized data management software • Transaction processing 1 -5
File Based Information Systems – Conventional Data Processing techniques: Enterprise: Application System A Application System B Application Program A File System A Application Program B Customer File System B Invoice File System N Inventory … Application System N Application Program N 1 -6
File Based Information Systems (cont. ) Customer Invoice Customer No. Customer Name Part No. Quantities Unit Price Customer No. Customer Name Customer Addr. Social Security ID Inventory Parts Part No. Part Description Unit Price Supplier Quantities Remain Quantities Ordered Part No. Part Description Supplier Quantities Ordered Customer Name Unit Price 1 -7
User A 1 User A 2 User B 1 User B 2 User B 3 Host Language + DSL Host Language + DSL C, Pascal DSL (Data Sub Language) e. g. SQL 1 3 2 External View @ # & External schema A External/conceptual mapping A Conceptual schema < External schema B External View B External/conceptual mapping B Database management system dictionary (DBMS) e. g. system Conceptual View catalog Conceptual/internal mapping DBA (Build and maintain schemas and mappings) Storage structure definition (Internal schema) 1 2 Stored database (Internal View) 1 -8 3 # . . . 100 & @
DBMS based Information Systems: Basic Approach - Integration – (1) Integration of Information • Description of the integrated view of data is the "Conceptual Schema" of the database DBMS Application program End-user 1 -9
DBMS based Information Systems: Basic Approach – Simple views and High level language – (2) Provide simple views (External Schema) and high level language (e. g. SQL) for users to manipulate (handle) data • High level language: e. g. SQL (Structured Query Language) <e. g. >: SELECT SNAME FROM S WHERE S#= 'S 4'; • Description of user's view of data is the "external schema" or "subschema" or "view". • High-level languages (Query Language): SQL (1) Data Definition Language: define format (2) Data Manipulation Language: retrieve, insert, delete, update • Emphasize: EASE OF USE !! 1 -10 S S# name
DBMS based Information Systems: Basic Approach - Storage/Access Method – (3) Efficient Storage/Access Techniques: • implemented once rather than duplicated in all application programs. User: query in SQL Language Processor DBMS Access Methods Calls Access Method (B+ tree, Dynamic Hashing) I/O calls 1 -11
DBMS based Information Systems: Basic Approach - Transaction Management – (4) Provide Transaction Management: • Concurrency Control • Recovery • Security • . : 1 -12
Example: A Simple Query Processing Query in SQL: SELECT CUSTOMER. NAME FROM CUSTOMER, INVOICE WHERE REGION = 'N. Y. ' AND AMOUNT > 10000 AND CUTOMER. C#=INVOICE. C DBMS Language Processor Internal Form : ( (S SP) Optimizer Language Processor Operator : SCAN C using region index, create C SCAN I using amount index, create I SORT C? and I? on C# JOIN C? and I? on C# EXTRACT name field Operator Processor Calls to Access Method: OPEN SCAN on C with region index GET next tuple. . . Access Method e. g. B-tree; Index; Hashing Calls to file system: GET 10 th to 25 th bytes from block #6 of file #5 File System 1 -13 database Access Method
1. 2 An Overview of a Database System 1 -14
An Example • The Wine Cellar Database: Cellar: Bin Wine 2 3 6 12 16 43 50 51 Producer Year Bottle Ready Chardonnay Buena Vista 83 Chardonnay Louis Martini Chardonnay Chappellet Jo. Riesling Buena Vista Jo. Riesling Sattui Cab. Sauvignon Robt. Mondavi Pinot Noir Mirassou Pinot Noir Ch. St. Jean 1 81 82 82 82 77 77 78 85 5 4 1 1 12 3 2 • Retrieval: • DML (Data Manipulation Language): SELECT Wine, Bin, Producer FROM Cellar WHERE Ready = 85; • Result: Wine Bin Chardonnay 2 Chardonnay 6 Pinot Noir 50 Producer Buena Vista Chappellet Mirassou 1 -15 84 85 83 83 87 85 86 Comments Thanksgiving Late Harvest Very dry Harvest
An Example (cont. ) • Deletion: • DML: DELETE FROM Cellar WHERE Ready < 86; • Result: Bin Wine 43 Cab. Sauvignon 51 Pinot Noir Producer Year Bottle Ready Comments Robt. Mondavi 77 Ch. St. Jean 78 12 2 87 86 • Insertion: • DML: INSERT INTO Cellar VALUES (53, 'Pinot Noir', 'Franciscan', 79, 1, 86, 'for Joan'); • Result: Bin Wine 43 Cab. Sauvignon 51 Pinot Noir 53 Pinot Noir Producer Year Bottle Ready Robt. Mondavi 77 Ch. St. Jean 78 Franciscan 79 1 -16 12 2 1 87 86 86 Comments for Joan
An Example (cont. ) • Update • DML: UPDATE Cellar SET Bottles = 4 WHERE Bin = 51; • Result: Bin Wine 43 Cab. Sauvignon 51 Pinot Noir 53 Pinot Noir Producer Year Bottle Ready Comments Robt. Mondavi 77 Ch. St. Jean 78 Franciscan 79 12 1 -17 4 1 87 86 86 for Joan
What is a Database System? DBMS Application program End-user 1 -18
What is a Database System? (cont. ) • Major components of a database system: • • Data: integrated and shared. Hardware: disk, CPU, Main Memory, . . . Software: DBMS Users: 1. Application programmers 2. End users 3. Database administrator (DBA) » Defining external schema » Defining conceptual schema » Defining internal schema » Liaison with users » Defining security and integrity checks » Defining backup and recovery procedures » Monitoring performance and changing requirements 1 -19
Why Database ? • • Redundancy can be reduced Inconsistency can be avoided The data can be shared Standards can be enforced Security restrictions can be applied Integrity can be maintained Provision of data independence objective ! 1 -20
Data Independence • Application Program Data Structure • Immunity of application to change in storage structure and access strategy. 1 -21
Data Dependence vs. Data Independence S • Data Dependent S# S 1 S 2 : Sn e. g. SELECT CITY FROM S WHERE ITEM = 'X'; • Linked list: TOP S Top s 1 sn s 2 if item = TOP��. item then. . • Tree: . if item < root. data then root : = root�. left. . • Array: if A[I] = item then. . . • Storage structure changed program changed 1 -22
1. 3 An Architecture for a Database System 1 -23
User A 1 User A 2 User B 1 User B 2 User B 3 Host Language + DSL Host Language + DSL (Data Sub Language) e. g. SQL, QUEL External View @ #A & External schema A External schema B External/conceptual mapping A Conceptual schema < External/conceptual mapping B Conceptual @ View #A & Database management system (DBMS) Conceptual/internal mapping DBA (Build and maintain schemas and mappings) External View B Storage structure definition (Internal schema) Stored database (Internal View) 1 -24 # & @
An Example of the Three Levels • Internal level: STORED_EMP length = 18 PREFIX TYPE = BYTE(6), OFFSET = 0, INDEX = EMPX EMP# TYPE = BYTE(6), OFFSET=0, DEPT# TYPE = BYTE(4), OFFSET = 12 PAY TYPE = FULLWORD, OFFSET = 16 • Conceptual level: EMPLOYEE_NUMBER DEPARTMENT_NUMBER SALARY CHARACTER(6) CHARACTER(4) NUMERIC(5) • External level: – ( PL /I ) DCL 1 EMP 2 EMP# CHAR(6) 2 SAL FIXED BIN(31) – (COBOL) 01 EMPC 02 EMPNO PIC X(4) 02 DEPTNO PIC X(4) 1 -25
Functions of the DBMS • • • Data Definition Language (DDL) Data Manipulation Language (DML) Data Security and Integrity Data Recovery and Concurrency Data Dictionary Performance 1 -26
1. 4 Database Technology Trends 1960 s to Mid-1970 s to Mid-1980 s Late 1980 s Future Merging data models, knowledge representation, and programming languages Network Hierarchical Relational Semantic Object-oriented Logic Database Hardware Mainframes Minis PCs Faster PCs Workstations Database machines User Interface None Forms Query languages - SQL, QUEL Graphics Menus Query-by-forms Natural language Speech input Program Interface Procedural 4 GL Logic programming Integrated database and programming language Presentation and display processing Reports Processing data Business graphics Image output Knowledge processing Generalized display managers Distributed knowledge processing Data Model Embedded query language Report generators Information and transaction processing 1 -27 Parallel processing Optical memories
Distributed Databases Database Distributed System • Distributed database is a database that is not stored in its entirety at a single physical location, but rather is spread across a < e. g. > Hsinchu network of computers Taipei Kaoshiung communication links Taichung 1 -28
Distributed Databases (cont. ) • Advantages: – efficiency of local processing – data sharing • Disadvantages: – communication overhead – implementation difficulties • Reference: S. Ceri and G. Pelagatti "Distributed Databases: principles and systems" 1 -29
Multi-Database/Heterogeneous Database Multidatabase IMS Hierarchical Model INGRES Relational Model • semantic inconsistency • data incompleteness • global schema 1 -30 . . . ORION Object. Oriented Model
DB + AI Query Database DBMS Language Processor Query Optimizer Knowledge Base Operator Processor Access Method File Manager Logical DB design Distributed DB design Knowledge Base 1 -31 AI
KBMS Database Logic – A Combined Model : Logic Programming + Relational DB Query : ? : - ancestor (taro, Y) ? : - grandfather (? , c) User Program – Three layers : Knowledge management program IDB relational interface IDB: ancestor(X, Y): - parent(X, Y) ancestor(X, Y) : - parent(X, Z), ancestor(Z, Y) parent(X, Y): -edb(father(X, Y)) parent(X, Y): -edb(mother(X, Y)) grandfather(X, Z): - father(X, Y) ^ father(Y, Z) father Relational DB management program EDB 1 -32 EDB: mother father son A B X Y. . . B C .
Database Object-Oriented OODB • A typical Document : MEMO [Woelk 86, SIGMOD] MCC To: W. Kim From: D. Woolk Date: September 18, 1992 Subject: Workstations can be heard In the computer center of National Chiao-Tung University, there a lot of workstations. There are HP RS serials, SUNs, Apollo, and so on. The students in NCTU learn to use workstation since they are freshmen. The configuration of the workstations follows: Workstation . �speaker voice message associated . . Workstation } text } } graphics Database Server In the course �introduction to Computer Science? students do their homework's on workstations. image 1 -33
Use of a Database Management System in Design and Application Database Detailed Design Manufacturing Design Release Control Fabrication Assembly Info Analysis Models DBMS Database Management System Preliminarydesign INTERFACE Graphic Interface Test / Inspection Language Interface APPLICATION Design Analysis Design Verification Manufacturing Planning 1 -34 Evaluation Production Control Synthesis Release Project Management
Fuzzy Database – Fuzzy Query <e. g. > SELECT FROM WHERE STUDENT. NAME STUDENT SEX = M� AND HEIGH = TALLER AND WEIGH = SLIMMER STUDENT: NAME SEX HEIGHT WEIGHT IQ Mary Linda. . . <e. g. > F F. . . 158 165. . . SELECT FROM WHERE 55 55. . . High Medium. . . STUDENT. NAME STUDENT IQ >= 130 1 -35 Fuzzy Set Theory
More? AI DB 1993 2001 ? VR DB DB 1997 ? 2003 DNA/Bio. Info. ? DB WWW 2004 ? 2006 ? 1 -36 ? ? ? SARS ? DB Mobile/video DB 911/ Anthrax DB ? 1995 1999 DB ? ?
- Slides: 36