12 Chapter 12 Distributed Database Management Systems Database
12 Chapter 12 Distributed Database Management Systems Database Systems: Design, Implementation, and Management, Seventh Edition, Rob and Coronel Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel
12 In this chapter, you will learn: • What a distributed database management system (DDBMS) is and what its components are • How database implementation is affected by different levels of data and process distribution • How transactions are managed in a distributed database environment • How database design is affected by the distributed database environment Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 2
The Evolution of Distributed Database Management Systems 12 • Distributed database management system (DDBMS) – Governs storage and processing of logically related data over interconnected computer systems in which both data and processing functions are distributed among several sites • Centralized database required that corporate data be stored in a single central site • Dynamic business environment and centralized database’s shortcomings spawned a demand for applications based on data access from different sources at multiple locations (PDAs for example) Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 3
Centralized database management system Request DBMS Reply 12 Application issues a data request to the DBMS Data Read Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 4
Centralized database management problems 12 • Performance degradation due to a growing number of remote locations • High costs (mainframe) • Reliability problems (single point of failure syndrome) • Scalability problems - single location • Organizational rigidity – no flexibility and agility Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 5
Distributed Processing and Distributed Databases 12 • Distributed processing – Database’s logical processing is shared among two or more physically independent sites – Connected through a network – For example, the data input/output (I/O), data selection, and data validation might be performed on one computer, and a report based on that data might be created on another computer (see figure 12. 2) – Distributed processing does not require a distributed database • Distributed database – Stores logically related database over two or more physically independent sites – Database composed of database fragments – Distributed database requires distributed processing (each database fragment is managed by its own local database process) Database Systems, 8 th. Design, Edition Systems: Implementation, & Management, 7 th Edition, Rob & Coronel 66
12 Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 7
12 Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 8
Tue 1 -4 DDBMS Advantages 12 • Advantages include: – Data are located near “greatest demand” site – Faster data access – Faster data processing – Growth facilitation: New sites can be added to the network without affecting the operations of other sites. – Improved communications: Because local sites are smaller and located closer to customers – Reduced operating costs: Add workstation not mainframe – User-friendly interface: Easy training – Less danger of a single-point failure – Processor independence: end user is able to access any available copy of the data, and an end user’s request is processed by any processor at the data location. Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 9
DDBMS Disadvantages 12 • Disadvantages include: – Complexity of management and control – Security – Lack of standards – No compatibility – Increased storage requirements: Multiple copies of data are required at different sites – Increased training cost Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 10
DDBMS Components 12 • Must include (at least) the following components: – Computer workstations – Network hardware ( gateways, routers, network bridges, switches, hubs ) and software – Communications media (cables, microwave, fiber optics, satellite) – Transaction processor (also known as application processor, transaction manager) • Software component found in each computer that requests data (receives and processes the application’s data requests (remote and local)) – Data processor or data manager • Software component residing on each computer that stores and retrieves data located at the site • May be a centralized DBMS Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 11
DDBMS Components (continued) Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 12 12
12 Levels of Data and Process Distribution • Current systems classified by how process distribution and data distribution supported Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 13
Single-Site Processing, Single-Site Data (SPSD) 12 • All processing is done on single CPU or host computer (mainframe, midrange, or PC) • All data are stored on host computer’s local disk • Processing cannot be done on end user’s side of system. several processes to run concurrently on a host computer accessing a single DP • Typical of most mainframe and midrange computer DBMSs • DBMS is located on host computer, which is accessed by dumb terminals connected to it Database Systems, 8 th. Design, Edition Systems: Implementation, & Management, 7 th Edition, Rob & Coronel 1414
Single-Site Processing, Single-Site Data(SPSD) 12 TP and the DP are embedded within the DBMS Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 1515
Multiple-Site Processing, Single-Site Data (MPSD) 12 • Multiple processes run on different computers sharing single data repository • The end user sees the file server as just another hard disk. Because only the data storage input/output (I/O) is handled by the file server’s computer. • All record- and file-locking activities are done at the end-user location. All data selection, search, and update functions take place at the workstation, thus requiring that entire files. • travel through the network for processing at the workstation • MPSD scenario requires network file server running conventional applications that are accessed through LAN • Many multi-user accounting applications, running under personal computer network, fit such a description Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 16
12 SELECT * FROM CUSTOMER WHERE CUS_BALANCE > 1000; All 10, 000 CUSTOMER rows must travel through the network to be evaluated at site A, even if 50 of them have balances greater than $1, 000 Client/server architecture is similar to that of the network file server except that all database processing is done at the server site, thus reducing network traffic. Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 17
Multiple-Site Processing, Multiple-Site Data (MPMD) 12 • Fully distributed database management system with support for multiple data processors and transaction processors at multiple sites • Classified as either homogeneous or heterogeneous • Homogeneous DDBMSs – Integrate only one type of centralized DBMS over a network Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 18
Multiple-Site Processing, Multiple-Site Data (MPMD) (continued) 12 • Homogeneous DDBMSs integrate only one type of centralized DBMS over a network • Heterogeneous DDBMSs – Integrate different types of centralized DBMSs over a network • Fully heterogeneous DDBMS – Support different DBMSs that may even support different data models (relational, hierarchical, or network) running under different computer systems, such as mainframes and microcomputers Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 19
12 Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 20
12 Distributed Requests and Distributed Transactions • Remote request: single SQL statement accesses data from single remote database • Remote transaction: accesses data at single remote site • Distributed request: single SQL statement references data at several DP sites • Distributed transaction: requests data from several different remote sites on network Database Systems, 8 th. Design, Edition Systems: Implementation, & Management, 7 th Edition, Rob & Coronel 2121
12 Remote request Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 22
12 Remote transaction Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 23
12 Distributed request Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 24
12 Distributed Requests and Distributed Transactions (continued) Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 25
12 Distributed Transactions Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 26
12 Data Fragmentation • Breaks single object ( Db or table) into two or more segments or fragments • Each fragment can be stored at any site over computer network • Information about data fragmentation is stored in distributed data catalog (DDC), from which it is accessed by TP to process user requests Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 27
Data Fragmentation (continued) 12 • Strategies (based at table level) – Horizontal fragmentation • Division of a relation into subsets (fragments) of tuples (rows) • Each fragment represents the equivalent of a SELECT statement, with the WHERE clause on a single attribute. – Vertical fragmentation • Division of a relation into attribute (column) subsets • This is the equivalent of the PROJECT statement in SQL. – Mixed fragmentation • Combination of horizontal and vertical strategies • A table may be divided into several horizontal subsets (rows), each one having a subset of the attributes (columns). Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 28
Data Fragmentation (continued) Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 12 29
Data Fragmentation (continued) 12 Company’s corporate management requires information about its customers in all three states, but company locations in each state (TN, FL, and GA) require data regarding local customers only. Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 30
Data Fragmentation (continued) 12 Each horizontal fragment may have a different number of rows, but each fragment must have the same attributes. Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 31
Data Fragmentation (continued) 12 Suppose the company is divided into two departments: the service department and the collections department. Each department is located in a separate building, and each has an interest in only a few of the CUSTOMER table’s attributes. Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 32
Data Fragmentation (continued) 12 Each vertical fragment must have the same number of rows, but the inclusion of the different attributes depends on the key column (CUS_NUM) Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 33
Data Fragmentation (continued) 12 Company’s structure requires that the CUSTOMER data be fragmented horizontally to accommodate the various company locations; within the locations, the data must be fragmented vertically to accommodate the two departments (service and collection). Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 34
Data Fragmentation (continued) Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 12 35
Sun 14 -7 Data Replication 12 • Storage of data copies at multiple sites served by computer network • Fragment copies can be stored at several sites to serve specific information requirements – Can enhance data availability and response time – Can help to reduce communication and total query costs Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 36
12 Data Replication (continued) Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 37
Thu 11 -12 Data Replication (continued) 12 • Replication scenarios – Fully replicated database • Stores multiple copies of each database fragment at multiple sites • Can be impractical due to amount of overhead – Partially replicated database • Stores multiple copies of some database fragments at multiple sites • Most DDBMSs are able to handle the partially replicated database well – Unreplicated database • Stores each database fragment at single site • No duplicate database fragments Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 38
- Slides: 38