AEGEAN: REPLICATION BEYOND THE CLIENT-SERVER MODEL
Remzi Can Aksoy, Manos Kapritsos
SOSP 2019

Paper contributions
- Current consistency models for replicated data fail when transactions involve non-replicated third parties
  - Example: a credit card payment subsystem
- Server-shim, response-durability and spec-tame
  - Together provide a single correct machine abstraction
- Request pipelining
  - Achieves replica consistency without sequential execution

The problem
- Many replicated services interact with non-replicated services
  - Nested transactions
- [Figure: Client, Replicated service, Credit card svc (non-replicated)]
- Current protocols are both inefficient and incorrect

Why?
- Extant replication protocols
  - Use the State Machine Replication model
  - Provide clients with the illusion that they are interacting with a single correct machine (SCM)
  - Do not provide the same abstraction to other external entities!

Performance issues
- Extant replication protocols require requests to be executed in a well-defined order
  - Each request must finish executing before the next request can start
- Servers must remain idle while one of their requests is processed by another service
  - Limits throughput

Client-middle-backend model
[Figure: Client, Middle, Backend]

Middle service replicated with primary-backup
- What if the primary sends a nested payment request, then crashes?
  - Backend service will process the request
    - Not reflected in the state of the middle service
  - When the backup copy becomes the new primary
    - Will not know that the request was sent and processed
    - Will reissue it!
- Problem is that the primary performs an output commit without first ensuring that the request was propagated to the backup
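The failure scenario above can be made concrete with a small simulation. This is only an illustrative sketch, not code from the paper: `Backend`, `MiddleReplica`, `run` and the request id `"pay-42"` are hypothetical names, and the model covers only the duplicate-charge problem (not how the backup later obtains the backend's response).

```python
class Backend:
    """Unreplicated backend (e.g. a payment service) that records every charge."""

    def __init__(self):
        self.charges = []

    def charge(self, request_id):
        self.charges.append(request_id)


class MiddleReplica:
    """Hypothetical middle-service replica; sent_log holds nested requests it knows about."""

    def __init__(self):
        self.sent_log = set()


def run(propagate_before_send):
    """Return how many times the backend is charged when the primary crashes."""
    backend = Backend()
    primary, backup = MiddleReplica(), MiddleReplica()
    request_id = "pay-42"  # hypothetical nested request id

    if propagate_before_send:
        # Tell the backup about the nested request BEFORE the output commit.
        backup.sent_log.add(request_id)
    primary.sent_log.add(request_id)
    backend.charge(request_id)  # output commit: now visible outside the service

    # The primary crashes here; the backup takes over and replays the client request.
    if request_id not in backup.sent_log:
        backend.charge(request_id)  # backup unknowingly reissues the payment
    return len(backend.charges)


print(run(propagate_before_send=False))  # customer is charged twice
print(run(propagate_before_send=True))   # customer is charged once
```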

Middle service replicated with Paxos-like protocols
- Paxos, Raft and many others
- All replicas execute all requests
  - Backend may end up receiving and executing multiple copies of each nested request
    - Will be identical
    - Duplicate detection will work
    - Problem still remains

Middle service replicated with speculative execution
- Some replicated services execute requests before an agreement is reached
  - Faster
  - Roll back if agreement cannot be reached
- Works well as long as the client is not exposed to inconsistent states resulting from speculations that failed
  - What about backend services?

The issues
- Replicated requests should be treated as a single logical request
- Replicas cannot finalize execution of backend responses before obtaining a consensus from a quorum of their peers
- Nested requests should never be based on a speculative state

An alternative solution
- Make the middle service
  - Unreplicated
  - Stateless
- In case of crashes, can restart another instance of the server
- Typically combined with a fault-tolerant backend store

System model (I)
- Middle servers
  - Can be synchronous or asynchronous
  - Can be designed to tolerate all kinds of failures
    - Including Byzantine failures
  - We assume synchronous intervals during which messages sent by correct nodes are received and processed
    - Required for liveness

Refresher
- Non-Byzantine
  - Failed nodes stop communicating with other nodes
    - Omission failures
    - Fail-stop behavior
- Byzantine
  - Failed nodes will keep sending messages
    - Incorrect and potentially misleading
    - Failed node becomes a traitor

System model (II)
- Backend service
  - Can be replicated or unreplicated

System model (III)
- Failure model
  - System will remain live
    - Provides replies to client requests
    - In the presence of up to u simultaneous failures of any kind
  - System will remain safe (but not necessarily live)
    - Does not provide incorrect replies to its clients
    - In the presence of up to r simultaneous Byzantine failures and an arbitrary number of omission failures

System model (IV)
- Correctness
  - Does not want to use serial execution
    - Too limiting
  - Indistinguishability
    - A replicated service is correct if its outcomes are indistinguishable from those of an unreplicated service

Server shim (I)
- Interposed between middle service and backend service
- When the middle service issues a request
  - Server shim authenticates the request
  - Waits until it has received a quorum of matching copies of the request before forwarding it to the backend service
  - Discards request copies sent after that
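The quorum-then-forward behavior can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the class `ServerShim`, its `on_request` method, the `forward` callback and the quorum size of 2 are all assumptions made for the example, and authentication is left out entirely.

```python
from collections import defaultdict


class ServerShim:
    """Hypothetical shim that forwards a nested request once a quorum of replicas agree on it."""

    def __init__(self, quorum, forward):
        self.quorum = quorum            # assumed quorum size, chosen by the deployment
        self.forward = forward          # callback that delivers a request to the backend
        self.votes = defaultdict(set)   # (request_id, payload) -> ids of replicas that sent it
        self.forwarded = set()          # request ids already sent to the backend

    def on_request(self, replica_id, request_id, payload):
        """Handle one replica's copy; return True if this copy triggered forwarding."""
        if request_id in self.forwarded:
            return False  # late copy of an already forwarded request: discard
        voters = self.votes[(request_id, payload)]
        voters.add(replica_id)
        if len(voters) >= self.quorum:
            self.forwarded.add(request_id)
            self.forward(request_id, payload)
            return True
        return False


sent = []
shim = ServerShim(quorum=2, forward=lambda rid, payload: sent.append((rid, payload)))
shim.on_request("r1", 7, "charge $10")
shim.on_request("r2", 7, "charge $99")  # does not match r1's copy, so no quorum yet
shim.on_request("r3", 7, "charge $10")  # second matching copy: forwarded exactly once
shim.on_request("r2", 7, "charge $10")  # arrives after forwarding: discarded
print(sent)
```

Keying the vote table on both the request id and the payload means a faulty replica that sends a different payload cannot contribute to the quorum for the correct one.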

Server shim (II)
- When it gets a response from the backend service, the shim broadcasts it to all replicas
  - Must ensure that no message is lost in the network
    - Maintains a per-client-thread reply cache
    - Resends response if needed
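A per-client-thread reply cache might look like the sketch below. The names `ReplyCache`, `store` and `lookup` are hypothetical, and the sketch assumes each client thread has at most one outstanding nested request, so that remembering only the latest response per thread is enough; the slides do not state this.

```python
class ReplyCache:
    """Hypothetical cache holding the latest backend response for each client thread."""

    def __init__(self):
        self.latest = {}  # client_thread -> (sequence number, response)

    def store(self, client_thread, seq, response):
        # Assumption: one outstanding nested request per thread, so overwrite is safe.
        self.latest[client_thread] = (seq, response)

    def lookup(self, client_thread, seq):
        """Return the cached response for a retransmitted request, or None if unknown."""
        entry = self.latest.get(client_thread)
        if entry is not None and entry[0] == seq:
            return entry[1]
        return None


cache = ReplyCache()
cache.store("thread-1", 5, "payment ok")
# A replica that missed the broadcast asks again; the shim resends from the cache
# instead of re-executing the request at the backend.
print(cache.lookup("thread-1", 5))
print(cache.lookup("thread-1", 6))  # not answered yet, nothing to resend
```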

Durability of nested responses

Spec-tame
- Allows replicated services to use speculation while still providing the single correct machine abstraction to all backend services
- Key idea
  - Do not perform any output commit until speculation is resolved
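The key idea can be illustrated with a replica that buffers its nested requests while it runs speculatively. This is a simplified sketch of the principle only, not the paper's mechanism: `SpeculativeReplica`, `execute_speculatively`, `resolve` and the integer state are invented for the example.

```python
class SpeculativeReplica:
    """Hypothetical replica that withholds outputs until its speculation is resolved."""

    def __init__(self, send):
        self.send = send              # callback that emits a nested request (output commit)
        self.state = 0                # speculative state
        self.committed_state = 0      # last state confirmed by agreement
        self.pending_outputs = []     # nested requests produced while speculating

    def execute_speculatively(self, delta, nested_request=None):
        self.state += delta
        if nested_request is not None:
            self.pending_outputs.append(nested_request)  # buffered, not sent

    def resolve(self, agreed):
        """Called once agreement on the speculated requests succeeds or fails."""
        if agreed:
            self.committed_state = self.state
            for out in self.pending_outputs:
                self.send(out)        # safe: no longer based on speculative state
        else:
            self.state = self.committed_state  # roll back; buffered outputs never leave
        self.pending_outputs.clear()


outbox = []
replica = SpeculativeReplica(send=outbox.append)
replica.execute_speculatively(10, nested_request="charge $10")
print(outbox)            # nothing sent while speculation is unresolved
replica.resolve(agreed=True)
print(outbox)            # the nested request is released after agreement
```

Because outputs are held back, a failed speculation can be rolled back locally without the backend ever seeing a request derived from the discarded state.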

Case study: taming Eve’s speculation
- Not covered

Agree-execute architecture
- Used by most replication protocols
- Replicas first agree on the ordering of requests, then execute them sequentially in that order
- Forces the middle service to remain idle while its nested transactions are forwarded to and processed by the backend service
  - Single-threaded servers performing physical I/Os have the same issue
    - Main motivation for multithreaded servers

An example

Request pipelining (I)

Request pipelining (II)
- After the replies to the nested requests have arrived, the requests that were waiting are processed in the pipeline order
  - Not in the order in which the nested replies arrived
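The ordering rule can be sketched with a small function. It is a hypothetical illustration, not the paper's algorithm: `resume_order` and its arguments are invented names, and the sketch assumes a waiting request may resume as soon as its own reply and the replies of all earlier requests in the pipeline have arrived.

```python
def resume_order(pipeline, reply_arrivals):
    """Return the order in which suspended requests resume.

    pipeline:        request ids in the agreed pipeline order
    reply_arrivals:  request ids in the order their nested replies arrive
    """
    arrived = set()
    resumed = []
    next_idx = 0  # position of the next request allowed to resume
    for request_id in reply_arrivals:
        arrived.add(request_id)
        # Resume every request at the head of the pipeline whose reply is in.
        while next_idx < len(pipeline) and pipeline[next_idx] in arrived:
            resumed.append(pipeline[next_idx])
            next_idx += 1
    return resumed


# Replies arrive out of order, yet every replica resumes A, then B, then C.
print(resume_order(["A", "B", "C"], ["C", "A", "B"]))
```

Since the resume order depends only on the pipeline order, replicas that see the nested replies in different network orders still execute the waiting requests identically.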

Example

Implementation
- Not covered

Experimental results
- Multithreading works!

Conclusion
- From the paper
  - "In a world of large-scale systems and microservices, it becomes imperative to rethink our replication protocols to allow for interactions between multiple components."