Distributed Concurrency Control
Motivation
• World-wide telephone system, world-wide computer network, world-wide database system
• Collaborative projects – the project has a database composed of the smaller local databases of each researcher
• A travel company organizing vacations – it consults local subcontractors (local companies), which list prices and quality ratings for hotels, restaurants, and fares
• A library service – people looking for articles query two or more libraries
Types of distributed systems
• Homogeneous federation – servers participating in the federation are logically part of a single system; they all run the same suite of protocols, and they may even be under the control of a "master site"
• Homogeneous federation is characterized by distribution transparency
• Heterogeneous federation – servers participating in the federation are autonomous and heterogeneous; they may run different protocols, and there is no "master site"
Types of transactions and schedules
• Local transactions
• Global transactions
Concurrency Control in Homogeneous Federations
Preliminaries
• Let the federation consist of n sites, and let T = {T1, . . . , Tm} be a set of global transactions
• Let s1, . . . , sn be local schedules
• Let D = ∪i Di, where Di is the local database at site i
• We assume no replication (each replica is treated as a separate data item)
• A global schedule for T and s1, . . . , sn is a schedule s for T such that its local projection equals the local schedule at each site, i.e. Πi(s) = si for all i, 1 ≤ i ≤ n
Preliminaries
• Πi(s) denotes the projection of the schedule s onto site i
• We call the projection of a transaction T onto site i a subtransaction of T (denoted Ti); it comprises all steps of T at site i
• Global transactions formally have to have Commit operations at all sites at which they are active
• Conflict serializability – a global [local] schedule s is globally [locally] conflict serializable if there exists a serial schedule over the global [local] (sub-)transactions that is conflict equivalent to s
Example 1
• Consider a federation of two sites, where D1 = {x} and D2 = {y}. Then s1 = r1(x) w2(x) and s2 = w1(y) r2(y) are local schedules, and s = r1(x) w1(y) w2(x) c1 r2(y) c2 is a global schedule
• Π1(s) = s1 and Π2(s) = s2
• Another form of the same schedule:
  server 1: r1(x)        w2(x)
  server 2:        w1(y)        r2(y)
Example 2
• Consider a federation of two sites, where D1 = {x} and D2 = {y}. Assume the following schedule:
  server 1: r1(x) w2(x)
  server 2: r2(y) w1(y)
• The schedule is not globally conflict serializable, since the global conflict serialization graph has a cycle: T1 → T2 at site 1 and T2 → T1 at site 2
Global conflict serializability
Theorem: Let s be a global schedule with local schedules s1, s2, . . . , sn involving a set T of transactions such that each si, 1 ≤ i ≤ n, is conflict serializable. Then the following holds: s is globally conflict serializable iff there exists a total order '<' on T that is consistent with the local serialization orders of the transactions (proof)
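As a quick illustration of the theorem's condition (not part of the original slides), the sketch below checks whether a total order consistent with all local serialization orders exists, by topologically sorting the union of the local orders; all names are illustrative:

```python
# Given the local serialization order of the global transactions at each
# site (one list per site), a consistent total order exists iff the union
# of these orders is acyclic.

def consistent_total_order(local_orders):
    """Return a total order consistent with every local order, or None."""
    nodes = {t for order in local_orders for t in order}
    succ = {t: set() for t in nodes}
    for order in local_orders:
        for a, b in zip(order, order[1:]):  # a serialized before b locally
            succ[a].add(b)
    indeg = {t: 0 for t in nodes}
    for a in succ:
        for b in succ[a]:
            indeg[b] += 1
    ready = [t for t in nodes if indeg[t] == 0]
    total = []
    while ready:                            # Kahn's topological sort
        t = ready.pop()
        total.append(t)
        for b in succ[t]:
            indeg[b] -= 1
            if indeg[b] == 0:
                ready.append(b)
    return total if len(total) == len(nodes) else None  # None: cycle

# Example 2 above: site 1 serializes T1 < T2, site 2 serializes T2 < T1.
print(consistent_total_order([["T1", "T2"], ["T2", "T1"]]))  # None
print(consistent_total_order([["T1", "T2"], ["T1", "T2"]]))  # ['T1', 'T2']
```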
Concurrency Control Algorithms
• Distributed 2PL locking algorithms
• Distributed T/O algorithms
• Distributed optimistic algorithms
Distributed 2PL locking algorithms
• The main problem is how to determine that a transaction has reached its 'lock point'
• Primary site 2PL – lock management is done exclusively at a distinguished site, the primary site (a sketch follows below)
• Distributed 2PL – when a server wants to start the unlocking phase of a transaction, it communicates with all other servers regarding the lock point of that transaction
• Strong 2PL – all locks acquired on behalf of a transaction are held until the transaction wants to commit (2PC)
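A toy sketch of primary-site 2PL under the assumptions above: because every lock and unlock request passes through the one distinguished site, that site alone can observe each transaction's lock point, just as in the centralized case. Exclusive locks only, no wait queue; purely illustrative:

```python
class PrimarySiteLockManager:
    def __init__(self):
        self.locks = {}          # item -> transaction holding the lock
        self.shrinking = set()   # transactions past their lock point

    def lock(self, txn, item):
        if txn in self.shrinking:
            raise RuntimeError(f"{txn} violates 2PL: lock after unlock")
        if self.locks.get(item, txn) != txn:
            return False         # held by someone else: wait or abort
        self.locks[item] = txn
        return True

    def unlock(self, txn, item):
        self.shrinking.add(txn)  # lock point passed, visible at this site
        if self.locks.get(item) == txn:
            del self.locks[item]

primary = PrimarySiteLockManager()
print(primary.lock("T1", "x"))    # True  -- granted
print(primary.lock("T2", "x"))    # False -- T2 must wait
primary.unlock("T1", "x")
print(primary.lock("T2", "x"))    # True  -- granted after T1's unlock
```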
Distributed T/O algorithms
• Assume that each local site (scheduler) executes its private T/O protocol for synchronizing accesses to its portion of the database:
  server 1: r1(x) w2(x)
  server 2: r2(y) w1(y)
• If timestamps were assigned as in the centralized case, each of the two servers would assign the value 1 to the first transaction it sees locally – T1 at server 1 and T2 at server 2 – which would lead to a globally incorrect result
Distributed T/O algorithms
• We have to find a way to assign globally unique timestamps to transactions at all sites:
  – Centralized approach – a particular server is responsible for generating and distributing timestamps
  – Distributed approach – each server generates a unique local timestamp using a clock or counter, extended by the site identifier: TS(T1) = (1, 1), TS(T2) = (1, 2)
  server 1: r1(x) w2(x)
  server 2: r2(y) w1(y)
Distributed T/O algorithms
• Lamport clock – used to solve the more general problem of fixing the notion of logical time in an asynchronous network
  – Sites communicate through messages
  – Logical time is a pair (c, i), where c is a nonnegative integer and i is the number of the site (process)
  – The clock variable gets increased by 1 at every transaction operation; the logical time of the operation is defined as the value of the clock immediately after the operation; on receiving a message, a site advances its clock past the sender's timestamp
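A minimal sketch of a Lamport clock along the lines above, assuming a pair (counter, site id) as the logical time; ties on the counter are broken by the site identifier, which makes every timestamp globally unique:

```python
class LamportClock:
    def __init__(self, site_id):
        self.c = 0
        self.site_id = site_id

    def tick(self):
        """Local event (e.g. a transaction operation): advance the clock."""
        self.c += 1
        return (self.c, self.site_id)

    def send(self):
        """Timestamp attached to an outgoing message."""
        return self.tick()

    def receive(self, msg_time):
        """On receipt, move the clock past the sender's timestamp."""
        self.c = max(self.c, msg_time[0]) + 1
        return (self.c, self.site_id)

# Two servers; server 2 learns of server 1's operation via a message, so
# any later operation at server 2 gets a strictly larger timestamp.
s1, s2 = LamportClock(1), LamportClock(2)
t = s1.send()            # (1, 1)
print(s2.receive(t))     # (2, 2) -- ordered after (1, 1)
```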
Distributed optimistic algorithms
• Under the optimistic approach, every transaction is processed in three phases (read, validation, write)
• Problem: how to ensure that validation comes to the same result at every site where a global transaction has been active
• Not implemented
Distributed Deadlock Detection
• Problem: a global deadlock cannot be detected by local means only (each server keeps its WFG locally)
[Figure: a global wait-for cycle spanning three sites – T1 (site 1) waits for T2 (site 2), T2 waits for T3 (site 3), and T3 waits for T1; local edges are "wait for lock", cross-site edges are "wait for message"]
Distributed Deadlock Detection
• Centralized detection – a centralized monitor collects local WFGs
  – performance
  – false deadlocks
• Timeout approach
• Distributed approaches:
  – Edge chasing
  – Path pushing
Distributed Deadlock Detection
• Edge chasing – each transaction that becomes blocked in a wait relationship sends its identifier in a special message, called a probe, to the blocking transaction. If a transaction receives a probe, it forwards it to all transactions by which it is itself blocked. If the probe comes back to the transaction that initiated it, this transaction knows that it participates in a cycle and hence in a deadlock
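A small, centralized simulation of edge chasing (illustrative only): the dictionary below stands in for the distributed WFG, and each probe forwarding would be a message between servers in a real system:

```python
def detect_deadlock(wait_for, initiator):
    """Forward a probe along wait-for edges; the initiator is deadlocked
    iff its probe comes back to it."""
    seen = set()
    frontier = [initiator]
    while frontier:
        t = frontier.pop()
        for blocker in wait_for.get(t, ()):   # forward probe to each blocker
            if blocker == initiator:
                return True                   # probe returned: cycle found
            if blocker not in seen:
                seen.add(blocker)
                frontier.append(blocker)
    return False

# The three-site example above: T1 waits for T2, T2 for T3, T3 for T1.
wfg = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}
print(detect_deadlock(wfg, "T1"))   # True: global deadlock
```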
Distributed Deadlock Detection
• Path pushing – entire paths are circulated between servers instead of single transaction identifiers. The basic algorithm is as follows (a sketch follows below):
  1. Each server that has a wait-for path from transaction Ti to transaction Tj, such that Ti has an incoming waits-for message edge and Tj has an outgoing waits-for message edge, sends that path to the server along the outgoing edge, provided the identifier of Ti is smaller than that of Tj
  2. Upon receiving a path, the server concatenates it with the local paths that already exist and forwards the result along its outgoing edges again. If there exists a cycle among n servers, at least one of them will detect that cycle in at most n such rounds
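A rough, centralized simulation of path pushing under simplifying assumptions: each server holds its local wait-for paths, and a path is pushed only when the first transaction's identifier is smaller than the last one's, as in step 1. The destination lookup below is a shortcut for following the outgoing cross-server edge:

```python
def path_pushing(local_paths, max_rounds):
    """local_paths: {server: set of paths, each a tuple of txn ids}.
    Returns True once some server assembles a cyclic path."""
    for _ in range(max_rounds):
        inbox = {srv: set() for srv in local_paths}
        for srv, paths in local_paths.items():
            for p in paths:
                if p[0] < p[-1]:            # step 1: push only if id(Ti) < id(Tj)
                    for dst, dpaths in local_paths.items():
                        if dst != srv and any(q[0] == p[-1] for q in dpaths):
                            inbox[dst].add(p)
        progressed = False
        for srv, received in inbox.items():
            for p in received:
                for q in list(local_paths[srv]):
                    if q[0] == p[-1]:       # step 2: concatenate paths
                        merged = p + q[1:]
                        if len(set(merged)) < len(merged):
                            return True     # repeated transaction: cycle
                        if merged not in local_paths[srv]:
                            local_paths[srv].add(merged)
                            progressed = True
        if not progressed:
            return False
    return False

# The example below: site 1 holds T1->T2, site 2 holds T2->T3, site 3
# holds T3->T1 (transactions encoded as integer ids).
paths = {1: {(1, 2)}, 2: {(2, 3)}, 3: {(3, 1)}}
print(path_pushing(paths, max_rounds=3))    # True: the cycle is detected
```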
Distributed Deadlock Detection
• Consider the deadlock example: site 1 pushes the path T1 → T2 to site 2; site 2 concatenates it with T2 → T3 and pushes T1 → T2 → T3 to site 3; site 3 knows T3 → T1 locally, concatenates, and detects the global deadlock
Concurrency Control in Heterogeneous Federations
Preliminaries
• A heterogeneous distributed database system (HDDBS) integrates pre-existing external data sources (EDSs) to support global applications that access more than one external data source
• HDDBS vs. LDBS
• Local autonomy and heterogeneity of local data sources:
  – Design autonomy
  – Communication autonomy
  – Execution autonomy
• Local autonomy reflects the fact that local data sources were designed and implemented independently and were totally unaware of the integration process
Preliminaries
• Design autonomy: it refers to the capability of a database system to choose its own data model and implementation procedures
• Communication autonomy: it refers to the capability of a database system to decide what other systems it will communicate with and what information it will exchange with them
• Execution autonomy: it refers to the capability of a database system to decide how and when to execute requests received from other systems
Difficulties
• Actions of a transaction may be executed in different EDSs, one of which may use locks to guarantee serializability while another uses timestamps
• Guaranteeing the properties of transactions may restrict local autonomy; e.g. to guarantee atomicity, the participating EDSs must execute some type of commit protocol
• EDSs may not provide the functionality necessary to implement the required global coordination protocols. With respect to the commit protocol, it is necessary for an EDS to become prepared, guaranteeing that the local actions of a transaction can be completed; existing EDSs may not allow a transaction to enter this state
HDDBS model
[Architecture: global transactions enter the Global Transaction Manager (GTM), which forwards their subtransactions to the Local Transaction Managers (LTMs) of External Data Sources EDS 1 and EDS 2; local transactions enter each LTM directly]
Basic notation
• An HDDBS consists of a set D of external data sources and a set 𝒯 of transactions
• D = {D1, D2, . . . , Dn}, where Di is the i-th external data source
• 𝒯 = T ∪ T1 ∪ T2 ∪ . . . ∪ Tn
• T – the set of global transactions
• Ti – the set of local transactions that access Di only
Example
• Given a federation of two servers: D1 = {a, b}, D2 = {c, d, e}, D = {a, b, c, d, e}
• Local transactions: T1 = r(a) w(b), T2 = w(d) r(e)
• Global transactions: T3 = w(a) r(d), T4 = w(b) r(c) w(e)
• Local schedules:
  s1: r1(a) w3(a) c3 w1(b) c1 w4(b) c4
  s2: r4(c) w2(d) r3(d) c3 r2(e) c2 w4(e) c4
Global schedule
Let the heterogeneous federation consist of n sites, let T1, . . . , Tn be the sets of local transactions at sites 1, . . . , n, and let T be a set of global transactions. Finally, let s1, s2, . . . , sn be the local schedules. A (heterogeneous) global schedule (for s1, . . . , sn) is a schedule s for 𝒯 such that its local projection equals the local schedule at each site, i.e. Πi(s) = si for all i, 1 ≤ i ≤ n
Correctness of schedules
• Given a federation of two servers: D1 = {a}, D2 = {b, c}
• Given two global transactions T1 and T2 and a local transaction T3:
  T1 = r(a) w(b)   T2 = w(a) r(c)   T3 = r(b) w(c)
• Assume the following local schedules (note the indirect conflict at server 2):
  server 1: r1(a) w2(a)
  server 2: r3(b) w1(b) r2(c) w3(c)
• Transactions T1 and T2 are executed strictly serially at both sites, yet the global schedule is not globally serializable: the local transaction T3 creates an indirect conflict T2 → T3 → T1 at server 2, contradicting T1 → T2 at server 1
Global serializability
• In a heterogeneous federation the GTM has no direct control over local schedules; the best it can do is to control the serialization order of the global transactions by carefully controlling the order in which operations are sent to the local systems for execution and in which they get acknowledged
• Indirect conflict: Ti and Tk are in indirect conflict in si if there exists a sequence T1', . . . , Tr' of transactions in si such that Ti is in direct conflict in si with T1', Tj' is in direct conflict in si with Tj+1' for 1 ≤ j ≤ r−1, and Tr' is in direct conflict in si with Tk
• Conflict equivalence: two schedules contain the same operations and the same direct and indirect conflicts
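To make the notion of indirect conflict concrete, the sketch below (illustrative, not from the slides) extracts direct conflicts from one local schedule and follows chains of them; applied to server 2 of the previous example, it exposes the indirect conflict between the global transactions T2 and T1 through the local transaction T3:

```python
def direct_conflicts(schedule):
    """Yield ordered pairs (Ti, Tj): a step of Ti precedes a conflicting
    step of Tj on the same item (at least one of the two is a write)."""
    pairs = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                pairs.add((t1, t2))
    return pairs

def conflicts_reachable(schedule, ti, tk):
    """True if Ti conflicts with Tk directly or through a chain."""
    edges = direct_conflicts(schedule)
    frontier, seen = [ti], set()
    while frontier:
        t = frontier.pop()
        for a, b in edges:
            if a == t and b not in seen:
                if b == tk:
                    return True
                seen.add(b)
                frontier.append(b)
    return False

# Server 2 from the example above: r3(b) w1(b) r2(c) w3(c).
s2 = [("T3", "r", "b"), ("T1", "w", "b"), ("T2", "r", "c"), ("T3", "w", "c")]
print(direct_conflicts(s2))                  # T3 before T1 on b, T2 before T3 on c
print(conflicts_reachable(s2, "T2", "T1"))   # True: indirect, via local T3
```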
Global serializability
• Global conflict serialization graph: let s be a global schedule for the local schedules s1, s2, . . . , sn, and let G(si) denote the conflict serialization graph of si, 1 ≤ i ≤ n, derived from direct and indirect conflicts. The global conflict serialization graph of s is defined as the union of all G(si), 1 ≤ i ≤ n, i.e. G(s) = G(s1) ∪ . . . ∪ G(sn)
• Global serializability theorem: let the local schedules s1, s2, . . . , sn be given, where each G(si), 1 ≤ i ≤ n, is acyclic, and let s be a global schedule for the si, 1 ≤ i ≤ n. The global schedule s is globally conflict serializable iff G(s) is acyclic
Global serializability – problems
• To ensure global serializability, the serialization order of the global transactions must be the same at all sites at which they execute
• The serialization orders of the local schedules must be validated by the HDDBS
  – These orders are neither reported by the EDSs,
  – nor can they be determined by controlling the submission of the global subtransactions or by observing their execution order
Example
• Globally non-serializable schedule:
  s1: w1(a) r2(a)
  s2: w2(c) r3(c) w3(b) r1(b)
  (T1 → T2 at site 1; T2 → T3 → T1 at site 2)
• Globally serializable schedule:
  s1: w1(a) r2(a)
  s2: w2(c) r1(b)
• Globally non-serializable schedule:
  s1: w1(a) r2(a)
  s2: w3(b) r1(b) w2(c) r3(c)
  (T1 → T2 at site 1; T2 → T3 → T1 at site 2)
Quasi serializability
• Rejects global serializability as the correctness criterion
• The basic idea: we assume that no value dependencies exist among EDSs, so indirect conflicts can be ignored
• In order to preserve global database consistency, only global transactions need to be executed in a serializable way, with proper consideration of the effects of local transactions
Quasi serializability
• Quasi-serial schedules: a set of local schedules {s1, . . . , sn} is quasi serial if each si is conflict serializable and there exists a total order '<' on the set T of global transactions such that Ti < Tj for Ti, Tj ∈ T, i ≠ j, implies that in each local schedule si, 1 ≤ i ≤ n, the Ti subtransaction occurs completely before the Tj subtransaction
• Quasi serializability: a set of local schedules {s1, . . . , sn} is quasi serializable if there exists a set {s1', . . . , sn'} of quasi serial local schedules such that si is conflict equivalent to si' for 1 ≤ i ≤ n
Example (1)
• Given a federation of two servers: D1 = {a, b}, D2 = {c, d, e}
• Given two global transactions T1 and T2 and two local transactions T3 and T4:
  T1 = w(a) r(d)   T2 = r(b) r(c) w(e)   T3 = r(a) w(b)   T4 = w(d) r(e)
• Assume the following local schedules:
  s1: w1(a) r3(a) w3(b) r2(b)
  s2: r2(c) w4(d) r1(d) w2(e) r4(e)
Example (2)
• The set {s1, s2} is quasi serializable, since it is conflict equivalent to the quasi serial set {s1, s2'}, where
  s2': w4(d) r1(d) r2(c) w2(e) r4(e)
• The global schedule s: w1(a) r3(a) r2(c) w4(d) r1(d) c1 w3(b) c3 r2(b) w2(e) c2 r4(e) c4 is quasi serializable; however, s is not globally serializable
• Since the quasi-serialization order is always compatible with the orderings of the subtransactions in the various local schedules, quasi serializability is relatively easy to achieve for a GTM
Achieving Global Serializability through Local Guarantees – Rigorousness
• The GTM assumes that local schedules are conflict serializable
• There are various scenarios for guaranteeing global serializability
• Rigorousness: local schedulers produce conflict-serializable rigorous schedules. A schedule s is rigorous if it satisfies the following condition: if oi(x) <s oj(x), i ≠ j, and oi, oj are in conflict, then ai <s oj(x) or ci <s oj(x)
• Schedules in RG avoid any type of rw, wr, or ww conflict between uncommitted transactions
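A small sketch of the rigorousness test above (not from the slides), assuming a schedule is given as a list of steps; a schedule fails as soon as an operation conflicts with an operation of a transaction that has not yet committed or aborted:

```python
def is_rigorous(schedule):
    """schedule: list of steps ("r"|"w", txn, item) or ("c"|"a", txn, None)."""
    finished = set()                       # committed or aborted so far
    touched = {}                           # item -> [(op, txn)] seen so far
    for op, txn, item in schedule:
        if op in ("c", "a"):
            finished.add(txn)
            continue
        for prev_op, prev_txn in touched.get(item, []):
            conflicting = prev_txn != txn and "w" in (prev_op, op)
            if conflicting and prev_txn not in finished:
                return False               # conflict with an unfinished txn
        touched.setdefault(item, []).append((op, txn))
    return True

# s1 from the example below: w1(a) c1 r3(a) r3(b) c3 w2(b) c2
s1 = [("w","T1","a"), ("c","T1",None), ("r","T3","a"), ("r","T3","b"),
      ("c","T3",None), ("w","T2","b"), ("c","T2",None)]
print(is_rigorous(s1))                              # True
print(is_rigorous([("w","T1","a"), ("r","T2","a")]))  # False: T1 not finished
```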
Achieving Global Serializability through Local Guarantees – Rigorousness
• Given a federation of two servers: D1 = {a, b}, D2 = {c, d}
• Given two global transactions T1 and T2 and two local transactions T3 and T4:
  T1 = w(a) w(d)   T2 = w(c) w(b)   T3 = r(a) r(b)   T4 = r(c) r(d)
• Assume the following local schedules:
  s1: w1(a) c1 r3(a) r3(b) c3 w2(b) c2
  s2: w2(c) c2 r4(c) r4(d) c4 w1(d) c1
• Both schedules are rigorous, but they yield different serialization orders (T1 < T3 < T2 at site 1, T2 < T4 < T1 at site 2)
Achieving Global Serializability through Local Guarantees – Rigorousness
• Commit-deferred transactions: a global transaction T is commit-deferred if its Commit operation is sent by the GTM to the local sites only after the local executions of all data operations of T have been acknowledged at all sites
• Theorem: if si ∈ RG for 1 ≤ i ≤ n, and all global transactions are commit-deferred, then s is globally serializable
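A minimal sketch of the commit-deferred rule, with a deliberately simplified, hypothetical site interface: the GTM withholds every Commit until all data operations of the transaction have been acknowledged at every site:

```python
class Site:
    def __init__(self, name):
        self.name = name
    def execute(self, txn, op):
        print(f"{self.name}: executed {op} of {txn}")
        return True                       # acknowledgement back to the GTM
    def commit(self, txn):
        print(f"{self.name}: commit {txn}")

def run_commit_deferred(txn, ops_per_site):
    """ops_per_site: {Site: [operations]}. The GTM sends Commit only after
    every data operation has been acknowledged at every site."""
    acks = [site.execute(txn, op)
            for site, ops in ops_per_site.items() for op in ops]
    if all(acks):                         # all acknowledged: now commit
        for site in ops_per_site:
            site.commit(txn)

s1, s2 = Site("EDS1"), Site("EDS2")
run_commit_deferred("T1", {s1: ["w(a)"], s2: ["w(d)"]})
```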
Possible solutions
• Bottom-up approach: observing the execution of global transactions at each EDS
  Idea: the execution order of global transactions is determined by their serialization orders at each EDS
  Problem: how to determine the serialization order of global transactions
• Top-down approach: controlling the submission and execution order of global transactions
  Idea: the GTM determines a global serialization order for global transactions before submitting them to the EDSs; it is the EDSs' responsibility to enforce this order at the local sites
  Problem: how the order is enforced at the local sites
Ticket-Based Method
• How can the GTM obtain information about the relative order of the subtransactions of global transactions at each EDS?
• How can the GTM guarantee that the subtransactions of each global transaction have the same relative order in all participating EDSs?
• Idea: force local direct conflicts between global transactions, or convert indirect conflicts (not observable by the GTM) into direct (observable) conflicts
Ticket-Based Method
• Ticket: a logical timestamp whose value is stored as a special data item in each EDS
• Each subtransaction is required to issue the Take_A_Ticket operation: r(ticket) w(ticket+1) (a critical section; see the sketch below)
• Only subtransactions of global transactions have to take tickets
• Theorem: if global transaction T1 takes its ticket before global transaction T2 at a server, then T1 will be serialized before T2 by that server
• In other words, the tickets obtained by subtransactions determine their relative serialization order
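A minimal sketch of the Take_A_Ticket operation, where a per-site mutex stands in for the EDS's own concurrency control around the critical section; names are illustrative:

```python
import threading

class TicketServer:
    def __init__(self):
        self.ticket = 0
        self._lock = threading.Lock()    # models the critical section

    def take_a_ticket(self):
        """r(ticket) w(ticket+1): returns this subtransaction's ticket."""
        with self._lock:
            value = self.ticket          # r(ticket)
            self.ticket = value + 1      # w(ticket + 1)
        return value

eds1 = TicketServer()
print(eds1.take_a_ticket())   # 0 -- first global subtransaction at this EDS
print(eds1.take_a_ticket())   # 1 -- serialized after the first one
```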
Example (1)
• Given a federation of two servers: D1 = {a}, D2 = {b, c}
• Given two global transactions T1 and T2 and a local transaction T3:
  T1 = r(a) w(b)   T2 = w(a) r(c)   T3 = r(b) w(c)
• Assume the following local schedules:
  s1: r1(a) c1 w2(a) c2   (T1 → T2)
  s2: r3(b) w1(b) c1 r2(c) c2 w3(c) c3   (T2 → T3 → T1)
• The schedule is not globally serializable: T2 → T3 → T1 at server 2 contradicts T1 → T2 at server 1
Example (2)
• Using tickets, the local schedules look as follows:
  s1: r1(I1) w1(I1+1) r1(a) c1 r2(I1) w2(I1+1) w2(a) c2
  s2: r3(b) r1(I2) w1(I2+1) w1(b) c1 r2(I2) w2(I2+1) r2(c) c2 w3(c) c3
• The indirect conflict between the global transactions in schedule s2 has been turned into an explicit one: T1 → T2 on the ticket I2, but T2 → T3 → T1 on the data items, so schedule s2 is not conflict serializable
Example (3)
• Consider another set of schedules:
  s1: r1(I1) w1(I1+1) r1(a) c1 r2(I1) w2(I1+1) w2(a) c2
  s2: r3(b) r2(I2) w2(I2+1) r1(I2) w1(I2+1) w1(b) c1 r2(c) c2 w3(c) c3
• Now both schedules are conflict serializable – the tickets obtained by the transactions determine their serialization orders
Optimistic ticket method
• Optimistic ticket method (OTM): the GTM must ensure that the subtransactions of a global transaction have the same relative serialization order in their corresponding EDSs
• Idea: allow the subtransactions to proceed, but commit them only if their ticket values have the same relative order in all participating EDSs
• Requirement: the EDSs must support a visible 'prepare_to_commit' state for all subtransactions
• A 'prepare_to_commit' state is visible if the application program can decide whether the transaction should commit or abort
Optimistic ticket method
• A global transaction T proceeds as follows:
  – The GTM sets a timeout for T
  – It submits all subtransactions of T to their corresponding EDSs
  – When they enter their 'prepare_to_commit' state, they wait for the GTM to validate T
  – Commit or abort is broadcast
• The GTM validates T using a ticket graph – the graph is tested for cycles involving T (see the sketch below)
• Problems with OTM:
  – Global aborts caused by ticket operations
  – The probability of global deadlocks increases
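A rough sketch of OTM validation under the assumption that the GTM records each global transaction's ticket per EDS: a transaction passes validation only if its ticket order relative to every already-validated transaction agrees across all EDSs they share (otherwise the ticket graph would contain a cycle):

```python
def validate(tickets, committed, t):
    """tickets: {txn: {eds: ticket_value}}. Commit t iff, for every already
    validated txn u sharing an EDS with t, the ticket order of t and u is
    the same at all shared EDSs."""
    for u in committed:
        shared = tickets[t].keys() & tickets[u].keys()
        orders = {tickets[t][e] < tickets[u][e] for e in shared}
        if len(orders) > 1:      # t before u at one EDS, after at another
            return False         # cycle in the ticket graph: abort t
    return True

# T1 took its ticket first at EDS1 but second at EDS2, and vice versa
# for T2, so the two orders disagree and T2 must be aborted.
tickets = {"T1": {"EDS1": 0, "EDS2": 1}, "T2": {"EDS1": 1, "EDS2": 0}}
print(validate(tickets, {"T1"}, "T2"))   # False: orders disagree, abort
```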
Cache Coherence and Concurrency Control for Data-Sharing Systems
Architectures for Parallel Distributed Database Systems
• Three main architectures:
  – Shared memory systems
  – Shared disk systems
  – Shared nothing systems
• Shared memory system: multiple CPUs are attached to an interconnection network and can access a common region of main memory
• Shared disk system: each CPU has a private memory and direct access to all disks through an interconnection network
• Shared nothing system: each CPU has local memory and disk space, but no two CPUs can access the same storage area; all communication between CPUs is through a network connection
Shared memory system
[Figure: several processors (P) connected through an interconnection network to a global shared memory and a shared set of disks (D)]
Shared disk system
[Figure: each processor (P) with its private memory (M), all connected through an interconnection network to the shared disks (D)]
Shared nothing system
[Figure: processors (P), each with private memory (M) and private disks (D), communicating only through the interconnection network]
Characteristics of the architectures
• Shared memory:
  – closer to a conventional machine; many commercial DBMSs have been ported to this platform
  – communication overhead is low
  – memory contention becomes a bottleneck as the number of CPUs increases
• Shared disk: similar characteristics
• Interference problem: as more CPUs are added, existing CPUs are slowed down because of the increased contention for memory access and network bandwidth
• A system with 1000 CPUs is only 4% as effective as a single-CPU system
Shared nothing
• It provides almost linear speed-up: the time taken for operations decreases in proportion to the increase in the number of CPUs and disks
• It provides almost linear scale-up: performance is sustained if the number of CPUs and disks is increased in proportion to the amount of data
• Powerful parallel database systems can be built by taking advantage of the rapidly improving performance of single-CPU systems
Shared nothing
[Figure: two plots – speed-up (transactions/second vs. number of CPUs) and scale-up with DB size (performance vs. number of CPUs and database size)]
Concurrency and cache coherency problem
• Data pages can be dynamically replicated in more than one server cache to exploit access locality
• Synchronization of reads and writes requires some form of distributed lock management; in addition, either the invalidation of stale copies of data items or the propagation of updated data items must be communicated among the servers
• Basic assumption for data-sharing systems: each individual transaction is executed solely on one server (i.e. a transaction does not migrate among servers during its execution)
Callback Locking
• We assume that both concurrency control and cache coherency control are page oriented
• Each server has a global lock manager and a local lock manager
• Data items are assigned to global lock managers in a static manner (e.g. via hashing), so each global lock manager is responsible for a fixed subset of the data items – we say that the global lock manager has the global lock authority for a data item
• The global lock manager knows at each point in time whether a given data item is locked or not
Callback Locking – concurrency control
• When a transaction requests a lock or wants to release a lock, it first addresses its local lock manager, which can then contact the global lock manager
• The simplest scheme forwards all lock and unlock requests to the global lock manager that has the global lock authority for the given data item
• If a local lock manager is authorized to manage read locks (or write locks) locally, it can save message exchanges with the global lock manager
Callback Locking – concurrency control
• A local read authority enables the local lock manager to grant local read locks for a data item
• A local write authority enables the local lock manager to grant local read and write locks for a data item
• A write authority has to be returned to the corresponding global lock manager if another server wants to access the data item
• A read authority can be held by several servers simultaneously and has to be returned to the corresponding global lock manager if another server wants to perform a write access on the data item
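A single-threaded sketch of the authority rules above, with hypothetical class and method names: several servers may hold the read authority at once, and a write request makes the global lock manager call back every other holder first:

```python
class GlobalLockManager:
    def __init__(self):
        self.read_holders = set()
        self.write_holder = None

    def request_read(self, server):
        if self.write_holder and self.write_holder != server:
            self._callback(self.write_holder)    # revoke write authority
        self.read_holders.add(server)

    def request_write(self, server):
        for other in list(self.read_holders - {server}):
            self._callback(other)                # revoke read authorities
        if self.write_holder and self.write_holder != server:
            self._callback(self.write_holder)
        self.write_holder = server
        self.read_holders = {server}

    def _callback(self, server):
        print(f"callback(x) -> {server}: authority revoked, copy invalidated")
        self.read_holders.discard(server)
        if self.write_holder == server:
            self.write_holder = None

glm = GlobalLockManager()           # Home(x)
glm.request_read("A")               # servers A and B read x concurrently
glm.request_read("B")
glm.request_write("C")              # callbacks to A and B before C writes
```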
Callback Locking – concurrency control
• The cache coherency protocol needs to ensure that:
  – multiple caches can hold up-to-date versions of a page simultaneously as long as the page is only read, and
  – once a page has been modified in one of the caches, this cache is the only one allowed to hold a copy of the page
• A callback message revokes the local lock authority
Callback Locking
[Figure: servers A and B each send Rlock(x) to Home(x) and receive the read-lock authority for x; server A executes r1(x) and server B executes r2(x); A then commits T1 and executes r3(x), c3 locally under its cached authority, while server C prepares w4(x)]
Callback Locking
[Figure, continued: server C sends Wlock(x) to Home(x); Home issues Callback(x) to servers A and B; A answers OK after c3 and B answers OK after c2; Home then grants the write-lock authority for x to server C, which executes w4(x)]