CMPE 252A: Computer Networks
Chen Qian, Computer Engineering, UCSC Baskin Engineering
Lecture 15

Midterm Exam next week
• Mainly focuses on Lectures 2, 3, 4, 5, 7, 9.
• Also covers the very basic material of the other lectures.
• One single-sided sheet of paper (notes) allowed.

Forward Error Correction using Erasure Codes
• L. Rizzo, “Effective Erasure Codes for Reliable Computer Communication Protocols,” ACM SIGCOMM Computer Communication Review, April 1997

Erasure Codes
• Erasures are missing packets in a stream
  - Uncorrectable errors at the link layer
  - Losses at congested routers
• (n, k) code: k blocks of source data are encoded into n blocks of encoded data, such that the source data can be reconstructed from any subset of k encoded blocks
  - Each block is a data item which can be operated on with arithmetic operations
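The simplest concrete instance of an (n, k) erasure code is a (k+1, k) single-parity code: the extra block is the XOR of the k source blocks, so any one missing block can be rebuilt from the surviving k. Below is a minimal, illustrative Python sketch (not the codec from the lecture or the Rizzo paper); the block contents and function names are made up for the example.

# Minimal (k+1, k) single-parity erasure code, shown only as an illustration.
# Blocks are equal-length byte strings; the parity block is the XOR of the
# k source blocks, so any ONE missing block can be rebuilt from the rest.

def xor_blocks(blocks):
    # XOR a list of equal-length byte strings together.
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def encode(source_blocks):
    # Return the k source blocks plus one parity block (n = k + 1).
    return list(source_blocks) + [xor_blocks(source_blocks)]

def decode(received):
    # received: list of length n where erased blocks are None.
    # Repairs at most one erasure and returns the k source blocks.
    missing = [i for i, b in enumerate(received) if b is None]
    if len(missing) > 1:
        raise ValueError("a single-parity code can repair only one erasure")
    if missing:
        received[missing[0]] = xor_blocks([b for b in received if b is not None])
    return received[:-1]

# Lose any one of the n = 3 encoded blocks and still recover the source data.
encoded = encode([b"AAAA", b"BBBB"])
encoded[1] = None
assert decode(encoded) == [b"AAAA", b"BBBB"]

Real (n, k) codes used for FEC (e.g., Reed-Solomon, as discussed by Rizzo) tolerate up to n−k erasures rather than one, but the encode/reconstruct structure is the same.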

(Figure: k fixed-length packets)

Applications of Erasure Codes
• When you don’t know which packet will be lost
• Particularly good for large-scale multicast of long files or packet flows
  - Different packets are missing at different receivers; the same redundant packet(s) can be used by (almost) all receivers with missing packets

Erasure Codes
• Two data items, x and y, with the same length
• x⊕y (exclusive or) has the same length as x and y
• If you know any two of (x, y, x⊕y), you can recover both x and y
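A quick check of this recovery rule (the values of x and y are arbitrary examples):

# If any two of (x, y, x^y) survive, both x and y can be rebuilt with XOR.
x, y = 0b1010, 0b0110
p = x ^ y                  # the redundant item x⊕y

assert p ^ y == x          # x lost: recover it from (y, x⊕y)
assert p ^ x == y          # y lost: recover it from (x, x⊕y)
# x⊕y lost: x and y are both still present, nothing to recover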


An Ensemble of Replication and Erasure Codes for CFS
Ma, Nandagopal, Puttaswamy, Banerjee

Amazon S3

A single failure
• A DC is completely inaccessible to the outside world.
• In reality, data stored in a particular DC is internally replicated (on different networks in the same DC) to tolerate equipment failures.

k-resiliency
• Data is accessible even in the event of simultaneous failures of k different duplicates.

Cloud
• Amazon S3 and Google FS provide resiliency against two failures, using the AllReplica model.
  - A total of three copies of each data item are stored.

AllReplica
• Given any k, store k+1 full copies of each file in the system.
• One copy is stored in a primary DC and k other copies are stored in k backup DCs. The primary DC serves read and write requests directly.
• Upon write operations, the k backup DCs are updated on file close.

Cloud
• To reduce storage cost, cloud file systems (CFSs) are transitioning from replication to erasure codes.
• Using AllReplica, we need 6 data items for x and y to reach 2-resiliency.
• How about x, x, x⊕y, y, y?
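A small, illustrative check (not from the paper) that the five-item layout x, x, x⊕y, y, y indeed tolerates any two simultaneous losses:

# Verify 2-resiliency of the layout {x, x, x^y, y, y}: after losing ANY two
# of the five stored items, both x and y can still be obtained, either
# directly or by XORing the parity item with the other value.
from itertools import combinations

items = ["x", "x", "p", "y", "y"]            # "p" stands for x^y

def recoverable(surviving):
    have = set(surviving)
    x_ok = "x" in have or {"p", "y"} <= have  # x directly, or x = (x^y) ^ y
    y_ok = "y" in have or {"p", "x"} <= have  # y directly, or y = (x^y) ^ x
    return x_ok and y_ok

for lost in combinations(range(5), 2):       # every possible pair of failures
    surviving = [v for i, v in enumerate(items) if i not in lost]
    assert recoverable(surviving), lost
print("5 stored items are enough for 2-resiliency (vs. 6 with AllReplica)")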

Erasure code-Cloud

AllCode
• Divide an object into multiple data chunks (say m data chunks), and use encoding schemes to produce k redundant chunks.
• Both data chunks and redundant chunks are stored in different locations.
• Only m out of the m + k chunks are necessary to reconstruct the object.
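To make the storage saving concrete, a rough comparison of per-object storage overhead (the value m = 4 below is only an illustrative choice, not from the paper): AllReplica stores k+1 full copies, while AllCode stores m data chunks plus k redundant chunks, each 1/m of the object size.

# Per-object storage overhead, as a multiple of the object size (illustrative).
def allreplica_overhead(k):
    return k + 1                    # k + 1 full copies

def allcode_overhead(m, k):
    return (m + k) / m              # m + k chunks, each 1/m of the object

k = 2                               # tolerate two simultaneous failures
print(allreplica_overhead(k))       # 3.0x, the S3/GFS-style triple replication
print(allcode_overhead(4, k))       # 1.5x, assuming m = 4 data chunks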

AllCode
• Upon a read request:
  - If a full copy of the file has already been reconstructed after file-open, the primary DC serves the read directly.
  - Otherwise, the primary DC contacts other backup DCs to get the requested data and then serves the read.

AllCode
• For the first write operation after file-open, the primary DC contacts the other m−1 backup DCs to reconstruct the whole file while it performs the write on the file.
• The full copy is kept until file-close, and subsequent reads/writes can be served directly from this copy until the file closes.
• The primary DC updates the modified data chunks and all the redundant chunks on file-close.

AllCode
• Efficient because instead of storing whole copies, it stores chunks and thus reduces the overall cost of storage.
• Requires reconstructing the entire object and updating all the redundant chunks upon write requests.

AllCode
• When erasure codes are applied to CFSs, they lead to significantly higher access latencies and large bandwidth overhead between data centers.

Goal
• Provide an access latency close to that of the AllReplica scheme,
• and storage effectiveness close to that of the AllCode scheme.

DESIGN OVERVIEW
• Consistency
• Data accesses

Consistency
• Replication-on-close consistency model: as soon as a file is closed after a write, the next file-open request will see the updated version of the file.
  - Very hard in the cloud

Consistency
• In a cloud, this requires that the written parts of the file be updated before any other new file-open requests are processed.
• May cause large delays.

Assumption
• The authors assume that all requests are sent to a primary DC that is responsible for that file.
• All file operations are essentially serialized via the primary DC, thus assuring such consistency.

CAROM
• Cache A Replica On Modification uses a block-level cache at each data center. “Cache” here refers to local storage (a hard disk or even DRAM) at the local DC used to serve file accesses.
• Each DC acts as the primary DC for a set of files in the CFS. When files are being accessed, accesses are sent to this primary DC.
• Any block accessed for the first time is immediately cached.

Observation
• The same block is often accessed within a very short span of time.
• Caching reduces the access latency of successive operations on the same data block.

Read request
• For a read request, retrieve the requested data blocks from the data chunks and cache them in the primary DC.
• If a subsequent read request arrives for those cached blocks, use the cached data to serve this request, without retrieving from the data chunks.

Write
• When a write is performed, two operations happen in parallel:
  1) The written blocks are stored in the cache of the primary DC and the write is acknowledged to the user.
  2) The entire file is reconstructed from any m out of the m+k chunks and updated in the cache.

File-close
• The redundant chunks are recalculated using the erasure code, and the updated chunks are written across the other DCs.
• However, the entire file is still kept in cache even after the update completes, until the cache is full, because the same file may be accessed again.
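Putting the read, write, and file-close behavior described on the last few slides together, here is a compact, illustrative sketch of the logic at a CAROM primary DC. The class, method names, and the dict-based chunk store are invented for the example; the paper does not publish code, and real chunks would be erasure-coded across remote DCs, with eviction and error handling omitted here.

# Illustrative sketch of CAROM at a primary DC (hypothetical names/structure).
class CaromPrimaryDC:
    def __init__(self):
        self.chunk_store = {}   # file_id -> {block_id: data}; stands in for m+k chunks in remote DCs
        self.cache = {}         # (file_id, block_id) -> data; block-level cache (disk or DRAM)
        self.dirty = set()      # files written since file-open

    def read(self, file_id, block_id):
        key = (file_id, block_id)
        if key in self.cache:                       # cache hit: served locally
            return self.cache[key]
        data = self.chunk_store[file_id][block_id]  # miss: "decode" from remote chunks
        self.cache[key] = data                      # any block read once is cached
        return data

    def write(self, file_id, block_id, data):
        # 1) buffer the written block locally and acknowledge the user at once
        self.cache[(file_id, block_id)] = data
        self.dirty.add(file_id)
        # 2) in parallel (sequential in this sketch), reconstruct the whole file into cache
        for b, d in self.chunk_store.get(file_id, {}).items():
            self.cache.setdefault((file_id, b), d)
        return "ack"

    def close(self, file_id):
        if file_id in self.dirty:
            # on file-close: recompute redundant chunks and push updates to other DCs
            blocks = {b: d for (f, b), d in self.cache.items() if f == file_id}
            self.chunk_store[file_id] = blocks      # stands in for re-encoding + remote writes
            self.dirty.discard(file_id)
        # the cached full copy is kept (until eviction) in case the file is reused

# tiny usage example
dc = CaromPrimaryDC()
dc.chunk_store["f"] = {0: b"old0", 1: b"old1"}
dc.write("f", 0, b"new0")
dc.close("f")
assert dc.read("f", 0) == b"new0" and dc.read("f", 1) == b"old1"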

Cache size

Hybrid
• Store one complete primary copy of the file, and the backup copies are split using an (m + k, m) erasure code.
• The primary replica answers the read/write queries from users. The chunks are mainly used to provide resiliency.
• Upon write requests, all modified data chunks plus redundant chunks are updated on file-close, as in AllCode.

Summary
• Storage: saves cost by up to 60% over replication schemes.
• Bandwidth: saves cost by up to 43% over erasure-code-based schemes.

Trace details
• CIFS network traces collected from two large-scale, enterprise-class file servers deployed at NetApp.
• One file server is deployed in the corporate DC that hosts data used by over 1000 marketing, sales, and finance employees;
• the other is deployed in the engineering DC and is used by over 500 engineering employees.

Table: a total of 3.8 million unique files representing over 35 TB of data

Storage-time

Bandwidth cost

Monetary cost
• Cost:
  - (1) the cost of storing the data for a month, in terms of GBs;
  - (2) the cost of I/O operations, and the cost of bandwidth while transferring data in and out of the data centers.
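Roughly, monthly cost ≈ (GB stored × per-GB-month storage price) + (number of I/O operations × per-operation price) + (GB transferred in and out × per-GB transfer price), with the unit prices taken from the cloud provider's published pricing (e.g., Amazon S3).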

Cost

Cache size & Cost

Access latency
• Hit: data is available on the local disk in the primary DC.
• Miss: data has to be fetched from a backup DC.
• (A, B) denotes local disk access latency (A) and remote disk access latency (B), in units of milliseconds.
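For a given cache hit ratio p (an illustrative parameter, not from the slides), the expected per-access latency is roughly p·A + (1−p)·B; for example, with (A, B) = (10, 100) ms and p = 0.9, that is about 19 ms, close to the local-disk latency A.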

Access latency
• CAROM provides read access latencies that are close to those of AllReplica (local disk access latencies), rather than those of AllCode (remote disk access latencies).

End