BASIC Regenerating Codes for Distributed Storage Systems Kenneth Shum (Joint work with Minghua Chen, Hanxu Hou and Hui Li)
Windows Azure data centers Aug 2013 kshum 2
Inside a data center (image source: http://technoblimp.com)
Data distribution
• Encode and distribute a data file to n storage nodes.
• Data file: “INC”
Data collector
• A data collector can retrieve the whole file (“INC”) by downloading from any k storage nodes.
Three kinds of disk failures
• Transient error due to noise corruption
– repeat the disk access request
• Disk sector error
– partial failure
– detected and masked by the operating system
• Catastrophic error
– total failure, due to the disk controller for instance
– the whole disk is regarded as erased
Frequency of node failures
• Number of failed nodes over a single month in a 3000-node production cluster of Facebook.
• Figure from “XORing elephants: novel erasure codes for Big Data” by Sathiamoorthy et al.
Outline of this talk
• Repetition scheme
• Traditional erasure-correcting codes
– Reed-Solomon codes
• Network-coding-based scheme
– BASIC regenerating codes
Distributed storage system
• Encode a data file and distribute it to n disks.
• (n, k) recovery property
– The data file can be rebuilt from any k disks.
• Repair
– If a node fails, we regenerate a new node by connecting to any d surviving disks and downloading data from them.
– The aim is to minimize the repair bandwidth (Dimakis et al., 2007).
• A coding scheme with the above properties is called a regenerating code.
Repetition scheme
• GFS: replicate data 3 times
• Gmail: replicate data 21 times
2x repetition scheme
• Divide the data file into 2 parts, A and B (1 G each), and store each part twice.
• The data collector downloads one copy of A and one copy of B.
• Cannot tolerate double disk failures.
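The claim about double failures can be checked by brute force. This is a minimal sketch that assumes four nodes holding A, B, A, B in that order; the layout and the names `nodes` and `recoverable` are illustrative, not taken from the slides.

```python
from itertools import combinations

# Assumed layout: four nodes holding A, B, A, B (2x repetition).
nodes = ["A", "B", "A", "B"]
parts = {"A", "B"}

def recoverable(failed):
    """True if the surviving nodes still hold every part of the file."""
    surviving = {p for i, p in enumerate(nodes) if i not in failed}
    return surviving == parts

# Every single failure is survivable.
single_ok = all(recoverable({i}) for i in range(len(nodes)))

# Double failures that wipe out both copies of one part.
fatal_pairs = [pair for pair in combinations(range(len(nodes)), 2)
               if not recoverable(set(pair))]

print(single_ok)    # True
print(fatal_pairs)  # [(0, 2), (1, 3)]
```

Under this layout, two of the six possible double failures lose data, which is why the scheme only guarantees tolerance of a single failure.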
Repair is easy for a repetition-based system
• The new node downloads a copy of the lost part (A, 1 G) from a surviving node.
• Repair bandwidth = 1 G
Reed-Solomon code
• Divide the file into 2 parts, A and B.
• The nodes store A, B, A+B and A+2B.
• It can tolerate double disk failures.
Repair requires essentially decoding the whole file
• To rebuild A, the new node downloads 1 G from each of two surviving nodes (B and A+B in the figure).
• Repair bandwidth = 2 G
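The decoding and repair cost of the A, B, A+B, A+2B example can be sketched as follows. Arithmetic is done modulo the prime 257 purely for illustration; the field actually used in the talk is not stated, and the names `COEFFS`, `encode` and `decode` are hypothetical.

```python
from itertools import combinations

P = 257  # small prime field standing in for the code's finite field (assumption)

# Each node stores a*A + b*B (mod P); rows are the coefficients (a, b).
COEFFS = [(1, 0), (0, 1), (1, 1), (1, 2)]  # A, B, A+B, A+2B

def encode(A, B):
    """Return the contents of the four nodes."""
    return [[(a * x + b * y) % P for x, y in zip(A, B)] for a, b in COEFFS]

def decode(i, j, si, sj):
    """Recover (A, B) from the contents of any two distinct nodes i and j."""
    (a1, b1), (a2, b2) = COEFFS[i], COEFFS[j]
    det_inv = pow((a1 * b2 - a2 * b1) % P, -1, P)  # inverse of the 2x2 determinant
    A = [((b2 * u - b1 * v) * det_inv) % P for u, v in zip(si, sj)]
    B = [((a1 * v - a2 * u) * det_inv) % P for u, v in zip(si, sj)]
    return A, B

A, B = [10, 20, 30], [7, 8, 9]
stored = encode(A, B)

# Any two of the four nodes suffice to rebuild the file.
all_pairs_ok = all(decode(i, j, stored[i], stored[j]) == (A, B)
                   for i, j in combinations(range(4), 2))

# Repair node 0 by fully downloading nodes 2 and 3, decoding, and re-encoding.
helpers = (2, 3)
A2, B2 = decode(*helpers, stored[2], stored[3])
rebuilt = encode(A2, B2)[0]
downloaded = sum(len(stored[h]) for h in helpers)

print(all_pairs_ok, rebuilt == stored[0], downloaded / len(stored[0]))
```

The last ratio is 2: the new node downloads twice as much as it stores, matching the 2 G repair bandwidth for a 1 G node.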
BASIC regenerating code
• BASIC = Binary Addition Shift Implementable Convolutional
• Divide the data file into 4 parts of 0.5 G each.
• Utilization of bit-wise shift in storage was proposed by Piret and Krol (1983), and Qureshi, Foh and Cai (2012).
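The two operations in the acronym, binary addition (XOR) and bit-wise shift, can be illustrated on toy packets. This is only a sketch of the primitive: the shift amounts, packet length and the helper `shift_xor` are illustrative assumptions, not the construction from the paper.

```python
def shift_xor(packets, shifts):
    """XOR together the packets, each shifted left by its own number of bits."""
    out = 0
    for p, s in zip(packets, shifts):
        out ^= p << s
    return out

L = 8                                # packet length in bits (illustrative)
p1, p2 = 0b10110010, 0b01101101      # two toy packets stored as Python ints

# p1 + z*p2 when packets are viewed as polynomials over GF(2).
coded = shift_xor([p1, p2], [0, 1])

# A coded packet needs room for the largest shift on top of the L data bits.
stored_len = L + max([0, 1])

print(bin(coded), stored_len)
```

The extra `max(shifts)` bits per coded packet are the price of avoiding finite field multiplication.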
Download from any two nodes (one figure per pair: nodes 1 and 2, 1 and 3, 1 and 4, 2 and 3, 2 and 4, 3 and 4)
• In each case the data collector downloads 1 G from each of the two nodes; packets are 0.5 G each.
Zigzag decoding à la Gollakota and Katabi (2008)
• Want to solve for P1 and P2.
• Figure: two overlapping combinations of P1 and P2’.
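A minimal sketch of zigzag decoding for one specific case, assuming the two received combinations are c1 = P1 XOR P2 and c2 = P1 XOR (P2 shifted left by one bit). The shift pattern and the function name `zigzag_decode` are assumptions for illustration; the idea is that the one-bit offset exposes a bit of P1 free of interference, which then unlocks a bit of P2, and so on.

```python
def zigzag_decode(c1, c2, length):
    """Solve c1 = p1 ^ p2 and c2 = p1 ^ (p2 << 1) for length-bit p1 and p2."""
    p1 = p2 = 0
    prev = 0  # bit i-1 of p2; zero before the first bit
    for i in range(length):
        b1 = ((c2 >> i) & 1) ^ prev   # c2[i] = p1[i] ^ p2[i-1]
        b2 = ((c1 >> i) & 1) ^ b1     # c1[i] = p1[i] ^ p2[i]
        p1 |= b1 << i
        p2 |= b2 << i
        prev = b2
    return p1, p2

length = 8
p1, p2 = 0b10110010, 0b01101101
c1 = p1 ^ p2
c2 = p1 ^ (p2 << 1)
recovered = zigzag_decode(c1, c2, length)
print(recovered == (p1, p2))  # True
```

Each step uses one XOR on already-known bits, so decoding is linear in the packet length and needs no field inversion.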
Repair of BASIC regenerating code
• The new node is rebuilt using only bitwise shift and XOR.
• Repair bandwidth = 1.5 G
Repair of BASIC regenerating code
• Decode the blue and red packets by zigzag decoding.
• Interference alignment
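Comparison of the three examples
• Storage efficiency: repetition scheme 1/2; Reed-Solomon codes 1/2
• Reliability: the repetition scheme tolerates one disk failure; Reed-Solomon codes and BASIC regenerating codes tolerate two disk failures
• Repair bandwidth: repetition scheme 1 G; Reed-Solomon codes 2 G; BASIC regenerating codes 1.5 G
• Computational complexity: very small for the repetition scheme; finite field arithmetic for Reed-Solomon codes; binary addition and bit-wise shift for BASIC regenerating codes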
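The 2 G and 1.5 G repair bandwidths are consistent with the minimum-storage point of the trade-off in Dimakis et al., where repairing from d helpers costs M·d / (k·(d − k + 1)) for a file of size M. The sketch below assumes M = 2 G and k = 2 (from the two 1 G parts) and d = 3 helpers for the BASIC example; these parameter values are inferred from the example rather than stated in the slides.

```python
from fractions import Fraction

def msr_repair_bandwidth(file_size, k, d):
    """Repair bandwidth at the minimum-storage point: M*d / (k*(d-k+1))."""
    if d < k:
        raise ValueError("need at least k helper nodes")
    return Fraction(file_size) * d / (k * (d - k + 1))

M = 2  # file size in G (assumed: two parts of 1 G each)

two_helpers = msr_repair_bandwidth(M, k=2, d=2)    # decode-and-re-encode repair
three_helpers = msr_repair_bandwidth(M, k=2, d=3)  # d = 3 is an assumption

print(two_helpers, three_helpers)  # 2 3/2
```

With d = k = 2 the formula gives the full file size, which is the Reed-Solomon case; contacting one more helper lowers the total download to 1.5 G.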
Summary
• We can reduce repair bandwidth by network coding.
• BASIC regenerating codes
– A failed storage node can be repaired by simple bit-wise shift and XOR operations.
– Small storage overhead due to shifting.
References
• Piret and Krol, MDS convolutional codes, IEEE Trans. on Information Theory, 1983.
• Dimakis, Godfrey, Wainwright and Ramchandran, Network coding for distributed storage systems, INFOCOM, 2007.
• Gollakota and Katabi, Zigzag decoding: combating hidden terminals in wireless networks, Proc. ACM SIGCOMM, 2008.
• Qureshi, Foh and Cai, Optimal solution for the index coding problem using network coding over GF(2), Proc. IEEE Conf. on Sensor, Mesh and Ad Hoc Communications and Networks, 2012.
• Sung and Gong, A zigzag decodable code with MDS property for distributed storage systems, Proc. IEEE Symp. on Information Theory, 2013.
• Hou, Shum, Chen and Li, BASIC regenerating code: binary addition and shift for exact repair, Proc. IEEE Symp. on Information Theory, 2013.
Two modes of repair
• Exact repair
– The content of the new node is exactly the same as the content of the failed node.
• Functional repair
– Only requires that the (n, k) recovery property is preserved.