DCT 2033 DATABASE MANAGEMENT SYSTEM Chapter 1 File

DCT 2033 DATABASE MANAGEMENT SYSTEM Chapter 1 File Systems and Databases

Objective � The difference between data and information � What a database is, about different types of databases, and why they are valuable assets for decision making. � Why database design is important � How modern database evolves from files and file systems � About flaws in file system data management � How a database system differs from a file system and how a DBMS functions within the database system. � Why data models are important � About the basic data-modeling building blocks � What business rules are and how they affect database design. � How the major data models evolved, and their advantages and disadvantages. � How data models can be classified by level of abstraction 2

Examples of Database Applications � Purchases from the supermarket � Purchases using your credit card � Booking a holiday at the travel agents � Using the local library � Taking out insurance � Renting a video � Using the Internet � Studying at university 3

Data vs. Information �Data: �Raw facts; building blocks of information �Unprocessed information �Information: �Data processed to reveal meaning �Accurate, relevant, and timely information is key to good decision making �Good decision making is the key to survival in a global environment 4

Transforming Raw Data into Information 5

Transforming Raw Data into Information (continued) 6

Transforming Raw Data into Information (continued) 7

Transforming Raw Data into Information (continued) 8

Introducing the Database and the DBMS �Database—shared, integrated computer structure that stores: �End user data (raw facts) �Metadata (data about data) 9

Introducing the Database and the DBMS (continued) �DBMS (database management system): �Collection of programs that manages database structure and controls access to data �Possible to share data among multiple applications or users �Makes data management more efficient and effective 10

Importance of DBMS • It helps make data management more efficient and effective. • Its query language allows quick answers to ad hoc queries. • It provides end users better access to more and better-managed data. • It promotes an integrated view of organization’s operations -- “big picture. ” • It reduces the probability of inconsistent data. • End users have better access to more and better-managed data � Promotes integrated view of organization’s operations � Probability of data inconsistency is greatly reduced � Possible to produce quick answers to ad hoc queries 11

The DBMS Manages the Interaction Between the End User and the Database 12

Database Systems • 4 Types of Database Systems: 1) Number of Users �Single-user � Desktop • Centralized database �Multiuser � Workgroup � Enterprise 2) Scope �Desktop �Workgroup �Enterprise 13 3) Location database • Distributed 4) Use • Transactional (Production) • Decision support • Data warehouse

Types of Databases �Single-user: �Supports only one user at a time �Desktop: �Single-user database running on a personal computer �Multi-user: �Supports multiple users at the same time 14

Types of Databases (continued) �Workgroup: �Multi-user database that supports a small group of users or a single department �Enterprise: �Multi-user database that supports a large group of users or an entire organization 15

Types of Databases (continued) Can be classified by location: �Centralized: �Supports data located at a single site �Distributed: �Supports data distributed across several sites 16

Types of Databases (continued) Can be classified by use: �Transactional (or production): �Supports a company’s day-to-day operations �Data warehouse: �Stores data used to generate information required to make tactical or strategic decisions �Often used to store historical data �Structure is quite different 17

Why Database Design is Important �Defines the database’s expected use �Different approach needed for different types of databases �Avoid redundant data �A well-designed database facilitates data management and becomes a valuable information generator. �Poorly designed database generates errors leads to bad decisions can lead to failure of organization 18

Historical Roots: Files and File Systems Manual File systems: � Collection of file folders kept in file cabinet � Organization within folders based on data’s expected use (ideally logically related) � System adequate for small amounts of data with few reporting requirements � Finding and using data in growing collections of file folders became time-consuming and cumbersome � Collection of application programs that perform services for the end users (e. g. reports). � Each program defines and manages its own data. 19

Limitations of File-Based Approach �Separation and isolation of data �Each program maintains its own set of data. �Users of one program may be unaware of potentially useful data held by other programs. �Duplication of data �Same data is held by different programs. �Wasted space and potentially different values and/or different formats for the same item. 20

Limitations of File-Based Approach �Time-consuming, high-level activity �As number of files expands, system administration becomes difficult �Making changes in existing file structure is difficult �File structure changes require modifications in all programs that use data in that file 21

Limitations of File-Based Approach �Modifications are likely to produce errors, requiring additional time to “debug” the program �Security features hard to program and therefore often omitted �Data dependence �Incompatibility of files 22

Files and File Systems 23

Database Systems �Problems inherent in file systems make using a database system desirable �File system �Many separate and unrelated files �Database �Logically related data stored in a single logical data repository 24

Database Systems 25

Database System 26

Basic File Terminology 27

The Database System Environment • Database system is composed of 5 main parts: 1. Hardware § § Computer Peripherals 4. Procedures • 2. Software § § § Operating system software DBMS software Application programs and utility software 3. People • • 28 • Systems administrators Database administrators (DBAs) Database designers Systems analysts and programmers End users Instructions and rules that govern the design and use of the database system 5. Data § Collection of facts stored in the database

29

Database Systems § The complexity of database systems depends on various organizational factors: § Organization’s size § Organization’s function § Organization’s corporate culture § Organizational activities and environment § Database solutions must be cost effective AND strategically effective. 30

DBMS Functions � DBMS performs functions that guarantee integrity and consistency of data : 1. Data dictionary management defines data elements and their relationships � any changes made in a database structure automatically recorded in the data dictionary. � 2. Data storage management � 31 stores data and related data entry forms, report definitions, data validation rules, procedural code etc.

DBMS Functions (continued) 3. Data transformation and presentation � 4. Security management � 32 translates logical requests into commands to physically locate and retrieve the requested data enforces user security and data privacy within database

DBMS Functions (continued) 5. Multiuser access control �uses sophisticated algorithms to ensure multiple users can access the database concurrently without compromising the integrity of the database 6. Backup and recovery management �provides backup and data recovery procedures 7. Data integrity management �promotes and enforces integrity rules 33

DBMS Functions (continued) 8. Database access languages and application programming interfaces �provide data access through a query language �DBMS query languages contains 2 component: data definition languages & data manipulation languages 9. Database communication interfaces �allow database to accept end-user requests via multiple, different network environments 34

DBMS Functions (continued) 35

DBMS Functions (continued) 36

Advantages of DBMS 1. 2. 3. 4. 5. 6. 7. 8. 37 Control of the redundancy Data consistency Sharing of data Improved data intergrity Improved security Enforcement of standard More information from the same amount of data Economy scale

Advantages of DBMS 9. Balance of conflicting requirements 10. Improved data accessibility and sponsiveness 11. Increase productivity 12. Improved backup and recovery services 38

Disadvantages of DBMS 1. 2. 3. 4. 5. 6. 39 Complexity, size Cost of DBMS Additional hardware costs Cost of conversion Performance Higher impact of a failure

Summary �Data are raw facts. Information is the result of 40 processing data to reveal its meaning. �To implement and manage a database, use a DBMS. �Database design defines the database structure. �A well-designed database facilitates data management and generates accurate and valuable information. �A poorly designed database can lead to bad decision making, and bad decision making can lead to the failure of an organization.

Summary (continued) �Databases were preceded by file systems. �Limitations of file system data management: �requires extensive programming �system administration complex and difficult �making changes to existing structures is difficult �security features are likely to be inadequate �independent files tend to contain redundant data �DBMS’s were developed to address file systems’ inherent weaknesses 41

Data Models � Definition: is the relatively simple representation, usually graphical, of real-world data structures. � Function: understand the complexities of the real-world environment. � Two components of data models: 1) Structure - refers to the way the system structure data or, the way the users feel the data is structured. 2) Operation - facilities given to the user of the DBMS to manipulate data within the database. 42

Data Model Basic Building Blocks � Basic building blocks of data model: entity, attributes, relationship � Entity is anything about which data are to be collected and stored(person, places, things, or event for which data is collected) � Attribute is a characteristic/ property of an entity � Relationship describes an association among (two or more) entities �One-to-many (1: M) relationship �Many-to-many (M: N or M: M) relationship 43 �One-to-one (1: 1) relationship

Data Model Basic Building Blocks � Three Types of Relationships: �One-to-many relationships (1: M) �A painter paints many different paintings, but each one of them is painted by only that painter. � PAINTER (1) paints PAINTING (M) �Many-to-many relationships (M: N) �An employee might learn many job skills, and each job skill might be learned by many employees. � EMPLOYEE (M) learns SKILL (N) �One-to-one relationships (1: 1) �Each store is managed by a single employee and each store manager (employee) only manages a single store. � EMPLOYEE (1) manages STORE (1) 44

The Evolution of Data Models � FOUR models/ categories of Implementation Data Models �Hierarchical �Network �Relational �Object oriented 45

46

Hierarchical Model § Basic Structure § Collection of records logically organized to conform to the upside-down tree (hierarchical) structure. § The top layer is perceived as the parent of the segment directly beneath it. § The segments below other segments are the children of the segment above them. § A tree structure is represented as a hierarchical path on the computer’s storage media. 47

The Hierarchical Model § Characteristics �Basic concepts form the basis for subsequent database development �Limitations lead to a different way of looking at database design �Basic concepts show up in current data models �Best understood by examining manufacturing process 48

The Hierarchical Model § Advantages Ø Conceptual simplicity Ø Database security Ø Data independence Ø Database integrity Ø Efficiency dealing with a large database 49

The Hierarchical Model § Disadvantages Ø Complex implementation Ø Difficult to manage Ø Lacks structural independence Ø Applications programming and use complexity Ø Implementation limitations Ø Lack of standards 50

Network Database Models § Basic Structure �Set -- A relationship is called a set. Each set is composed of at least two record types: an owner (parent) record and a member (child) record. �A set is represents a 1: M relationship between the owner and the member. 51

52

The Network Data Model § Advantages Ø Conceptual simplicity Ø Handles more relationship types Ø Data access flexibility Ø Promotes database integrity Ø Data independence Ø Conformance to standards § Disadvantages Ø System complexity Ø Lack of structural independence 53

Relational Database Models �Basic Structure �RDBMS allows operations in a human logical environment. �The relational database is perceived as a collection of tables. �Each table consists of a series of row/column intersections. �Tables (or relations) are related to each other by sharing a common entity characteristic. �The relationship type is often shown in a relational schema. �A table yields complete data and structural independence. 54

55

�Advantages �Structural independence �Improved conceptual simplicity �Easier database design, implementation, management & use �Ad hoc query capability (SQL) �Powerful database management system �Disadvantages �Substantial hardware and system software overhead �Possibility of poor design and implementation �Potential “islands of information” problems 56

Entity-Relationship Data Models � It is one of the most widely accepted graphical data modeling tools. �It graphically represents data as entities and their relationships in a database structure. �It complements the relational data model concepts. 57

Basic Structure of ER Model �E-R models are normally represented in an entity relationship diagram (ERD). �An entity is represented by a rectangle. �Each entity is described by a set of attributes. An attribute describes a particular characteristics of the entity. �A relationship is represented by a diamond connected to the related entities. 58

59

60

�Advantages of the ER model �Exceptional conceptual simplicity �Visual representation �Effective communication tool �Integrated with the relational database model �Disadvantages of the ER model �Limited constraint representation �Limited relationship representation �No data manipulation language �Loss of information content 61

Data Redundancy �Data redundancy results in data inconsistency �Different and conflicting versions of the same data appear in different places �Errors more likely to occur when complex entries are made in several different files and/or recur frequently in one or more files �Data anomalies develop when required changes in redundant data are not made successfully 62

Data Redundancy Types of data anomalies: �Update anomalies �Occur when changes must be made to existing records �Insertion anomalies �Occur when entering new records �Deletion anomalies �Occur when deleting records 63

Advantages of Database Processing 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 64 Economy pf scale Getting more information from same amount of data Sharing of data Balancing conflicting requirements Enforcement of standards Controlled redundancy Consistency Integrity Security Flexibility and responsiveness Increased programmer productivity Improved program maintenance Data independence

Disadvantages of Database Processing 1. 2. 3. 4. 5. 6. 65 Size Complexity Cost of DBMSs Additional hardware requirements Higher impact of a failure Recovery more difficult

Recap…. � � � 66 Data, information, metadata, database, DBMS function. Entities, attributes, relationship Files, records, fields Data models(relational model, network model, hierarchical model, object-oriented model) Advantages and disadvantages of database processing
- Slides: 66