Replication

Replication is keeping copies of the same data at different sites.
  • Pro - increases availability (and safety) of the data
  • Con - increases difficulty of data modifications (consistency)
Replication can be either complete (entire database at each site) or partial (each site contains only some of the database). Partial replication requires partitioning to determine what data goes where.

Partitioning
  • Horizontal partitioning - partitioning a table by records. Certain rows are kept at each site. For instance, the CSE server holds the records of students in the major, and the MSU server holds the records of undeclared students. All sites share the same schema, but not the same rows.
  • Vertical partitioning - partitioning a table by column (decomposition). For instance, the CSE server holds the capstone data, and the MSU server holds tuition data.
  • Hybrid partitioning - a combination of horizontal and vertical partitioning.
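The two partitioning styles can be sketched in a few lines of Python. This is only an illustration: the `students` rows, the column names, and both helper functions are hypothetical and do not come from the slides.

```python
# Hypothetical student table; the rows and column names are assumptions.
students = [
    {"id": 1, "major": "CSE", "capstone": "Robotics", "tuition": 12000},
    {"id": 2, "major": None, "capstone": None, "tuition": 11000},
    {"id": 3, "major": "CSE", "capstone": "Compilers", "tuition": 12500},
]


def horizontal_partition(rows, predicate):
    """Split rows into (matching, non-matching); both halves keep every column."""
    matching = [r for r in rows if predicate(r)]
    rest = [r for r in rows if not predicate(r)]
    return matching, rest


def vertical_partition(rows, key, columns):
    """Project every row onto the key plus the chosen columns."""
    return [{c: r[c] for c in (key, *columns)} for r in rows]


# Horizontal: declared majors go to the CSE server, undeclared students to MSU.
cse_rows, msu_rows = horizontal_partition(students, lambda r: r["major"] == "CSE")

# Vertical: CSE keeps capstone data, MSU keeps tuition data; "id" links them.
cse_cols = vertical_partition(students, "id", ["capstone"])
msu_cols = vertical_partition(students, "id", ["tuition"])
```

Keeping the key column in every vertical fragment is what lets the original rows be rebuilt later by joining the fragments on `id`.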

Types of Data Distributions
  • Complete Replication - each site has the complete database.
  • Partitioned - each site has a fragment of the database, and each fragment exists at only one site.
  • Partial Replication - each site has a fragment of the database, and a fragment may exist in multiple copies across the sites.
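One way to make the three definitions concrete is to classify a placement map, that is, a record of which fragments each site holds. The function name, the map layout, and the fragment labels below are assumptions made for this sketch, and it assumes every fragment is stored at one site or more.

```python
def classify_distribution(placement, all_fragments):
    """Classify a placement map (site name -> set of fragments held there)."""
    held_sets = list(placement.values())

    # Every site holds every fragment.
    if all(held == all_fragments for held in held_sets):
        return "complete replication"

    # Count how many sites hold each fragment.
    copies = {f: sum(f in held for held in held_sets) for f in all_fragments}
    if all(n == 1 for n in copies.values()):
        return "partitioned"

    # Otherwise some fragment is duplicated, but not everything is everywhere.
    return "partial replication"


fragments = {"f1", "f2"}
print(classify_distribution({"cse": {"f1", "f2"}, "msu": {"f1", "f2"}}, fragments))
print(classify_distribution({"cse": {"f1"}, "msu": {"f2"}}, fragments))
print(classify_distribution({"cse": {"f1", "f2"}, "msu": {"f2"}}, fragments))
```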

Difficulty of Query Processing
  • Complete Replication - each site has the complete database, so query processing is easy.
  • Partitioned - more difficult: information about what data is at each site is needed to handle queries on data not present at the local site.
  • Partial Replication - same as partitioned.
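The extra bookkeeping that partitioned data requires can be pictured as a catalog lookup. The sketch below is a simplified assumption of how such routing might work; the `catalog` contents, the site names, and `route_query` are hypothetical.

```python
# Hypothetical catalog: fragment name -> sites that store it.
catalog = {
    "cse_students": ["cse"],
    "undeclared_students": ["msu"],
}


def route_query(fragment, local_site, catalog):
    """Return the site that should answer a query on the given fragment."""
    sites = catalog[fragment]
    if local_site in sites:
        return local_site  # the data is local, so no network hop is needed
    return sites[0]        # otherwise forward to a site that holds the data


print(route_query("cse_students", "cse", catalog))
print(route_query("undeclared_students", "cse", catalog))
```

Under complete replication the catalog is unnecessary, because every fragment is always available at the local site.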

Difficulty of Concurrency Control
  • Complete Replication - simultaneous reads can be allowed at each replica, but simultaneous writes can be permitted at only one replica, and any write needs to be propagated to all locations. Moderately difficult.
  • Partitioned - because the data isn't duplicated, this is as easy as in a centralized database.
  • Partial Replication - same as complete replication, but with the added difficulty of network communication.
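The read-anywhere, write-at-one-replica rule can be illustrated with a toy primary-copy scheme. This is one possible arrangement, not necessarily the one the slides have in mind; the `ReplicatedValue` class is hypothetical, and it ignores locking, network delays, and site failures.

```python
class ReplicatedValue:
    """Toy primary-copy replication: reads anywhere, writes only at the primary."""

    def __init__(self, sites, primary, initial=None):
        self.copies = {site: initial for site in sites}
        self.primary = primary

    def read(self, site):
        # Any replica may serve a read.
        return self.copies[site]

    def write(self, site, value):
        # Writes are accepted at a single replica only.
        if site != self.primary:
            raise PermissionError(f"writes must go through {self.primary}")
        # Propagate the write to every copy before returning.
        for replica in self.copies:
            self.copies[replica] = value


grade = ReplicatedValue(["cse", "msu"], primary="cse", initial=0)
grade.write("cse", 5)
print(grade.read("msu"))
```

Here propagation is synchronous, so a read at any site sees the new value as soon as `write` returns. A real system would also have to decide what to do when a replica is unreachable during propagation.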

Difficulty of Maintaining Reliability
  • Complete Replication - multiple copies increase availability because the failure of any one site won't interrupt service. Very low difficulty.
  • Partitioned - each datum is unique, meaning every site is a potential point of failure. Extremely difficult to maintain reliability.
  • Partial Replication - same as complete replication, but with the added difficulty of network communication.
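A rough calculation shows why extra copies help. The sketch assumes each site is up independently with the same probability, which is a simplification introduced here and not something the slides state; `availability` is a hypothetical helper.

```python
def availability(site_up_probability, copies):
    """Probability that at least one of `copies` independent sites is up."""
    all_down = (1 - site_up_probability) ** copies
    return 1 - all_down


# Assumed figure: each site is up 90% of the time.
single_copy = availability(0.9, 1)    # partitioned: one copy of each datum
three_copies = availability(0.9, 3)   # replicated at three sites
print(single_copy, three_copies)
```

With these assumed numbers a datum stored once is reachable about 90% of the time, while three copies raise that to roughly 99.9%.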

Which Data Distribution is best for large data?
  1. Complete Replication
  2. Partitioned
  3. Partial Replication
  4. Depends