Analysis of and Optimization for Write-dominated Hybrid Storage Nodes in Cloud
Analysis of and Optimization for Write-dominated Hybrid Storage Nodes in Cloud. Liu, Shuyang; Wang, Shucheng; Cao, Qiang; Lu, Ziyi; Jiang, Hong; Yao, Jie; Dong, Yuanyuan; Yang, Puyuan. SoCC '19: Proceedings of the ACM Symposium on Cloud Computing, November 2019
Introduction (1/4) Hybrid storage nodes play a critical role in providing high performance and low cost for cloud providers. However, the behaviors of these nodes are not fully understood in real production clouds. In this paper, we analyzed real production traces from Alibaba Pangu and found that some hybrid storage nodes have write-dominated workload behaviors. [Figure: a storage node within Pangu cloud storage]
Introduction (2/4) Pangu is a hyper-scale, distributed cloud storage system for Alibaba. [Figure: a file is divided into chunks, and a mapping table records the chunk IDs]
Introduction (3/4) • To strike a better trade-off between performance and cost for different applications, Pangu generally deploys Chunk Servers with different types and numbers of SSDs and HDDs. • SSD as write buffer (SSD Write Back, SWB mode): (1) incoming data is first written to the SSD; (2) it is then dumped to the HDD in the background.
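The two SWB steps above can be pictured with a toy model. This is only an illustrative sketch, not Pangu's implementation: the class `SWBNode`, its method names, and the in-memory "devices" are all hypothetical.

```python
from collections import deque


class SWBNode:
    """Toy model of SSD Write Back: the SSD absorbs writes, the HDD gets them later.

    All names here are hypothetical; real devices are replaced by in-memory containers.
    """

    def __init__(self):
        self.ssd_buffer = deque()  # data written to the SSD but not yet dumped
        self.hdd = []              # data that has reached the HDD

    def write(self, chunk):
        # (1) Foreground path: the write is absorbed by the SSD.
        self.ssd_buffer.append(chunk)

    def dump(self, max_chunks=None):
        # (2) Background path: move buffered data from SSD to HDD in arrival order.
        moved = 0
        while self.ssd_buffer and (max_chunks is None or moved < max_chunks):
            self.hdd.append(self.ssd_buffer.popleft())
            moved += 1
        return moved
```

In this model every byte passes through the SSD before reaching the HDD, which is why a write-dominated workload puts the wear on the SSD.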
Introduction (4/4) ● WSNs: Chunk Servers in Pangu that experience write-dominated workload behavior. ● Features: (1) 77%-99% of requests are writes. (2) The amount of data written is much larger than the amount of data read. ● Reason: Frontend applications with their own cache layers need to rapidly flush all writes into Pangu and reserve their local storage for hot data.
Trace Analysis (1/8) Trace Analysis Summary ● Problems of WSNs found by trace analysis on Pangu production traces: (1) SSD overuse; (2) long-tail write latency; (3) low utilization of HDDs.
Trace Analysis (2/8) Workload Traces ● Three business zones: A (Cloud Computing), B (Cloud Storage), C (Structured Storage). ● Nodes: A1, A2, B, C1, C2 ● Time duration: 0.5-22 hours ● SSD ratio: 1 Low (<10%), 2 Mid (10%-33%), 2 High (>33%) ● Write request ratio: 77.2%-99.3%
Trace Analysis (3/8) Load Intensity across Chunk Servers ● Load intensity varying over time: the request intensity is high.
Trace Analysis (4/8) Load Behaviors across Disks within Chunk Servers ● Load balancing across internal disks: the results show that both the request intensity and the written-data intensity are roughly equal across the SSDs, and likewise across the HDDs, during this period.
Trace Analysis (5/8) Operation type and Proportion
Trace Analysis (6/8) Problem 1: SSD overuse ● The amount of data written to/read from SSD/HDD in 24 hours. ● Calculating an SSD's lifespan in the B node ○ 500 GB capacity, 300 TBW (terabytes written) endurance, 3 TB written per day (DWPD) ○ Lifespan = 300 TB / (3 TB/day) = 100 days ≈ 3.3 months ● SSDs wear out quickly under write-dominated behavior ● Limit DWPD but increase the number of SSDs
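The lifespan arithmetic on this slide can be checked in a few lines. The function name is hypothetical, and treating a month as 30 days is an assumption made here to match the slide's rounding.

```python
def ssd_lifespan_days(endurance_tbw, written_tb_per_day):
    """Days until the rated endurance (TBW) is used up at a constant daily write volume."""
    return endurance_tbw / written_tb_per_day


# Figures from the slide: 300 TBW endurance, about 3 TB written per day on the B node.
days = ssd_lifespan_days(300, 3)
months = days / 30  # assumption: a 30-day month
print(f"{days:.0f} days, about {months:.1f} months")
```

Halving the daily write volume per SSD doubles the estimate, which is the reasoning behind spreading writes over more SSDs.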
Trace Analysis (7/8) Problem 2: Long-Tail Latency ● External SSD-write: peak latency is 100-300x larger than average latency. ● Internal SSD-write: peak latency is 90-2000x larger than average latency. [Figure: average vs. maximum latency]
Trace Analysis (8/8) Problem 3: Low Utilization of HDD ● In A1, the amount of data written by SSD-write is 1380x larger than that written by HDD-write. ● The HDD utilization in A1 is far less than 0.1% on average, while the maximum is 14.3%.
Design of SWR (1/3) Architecture of SWR ● SSD Write Redirect (SWR): a runtime IO scheduling mechanism for WSNs. ● Relieves SSD write pressure by leveraging HDDs while ensuring QoS. ● When the scheduler decides "Yes" but all HDDs have waiting requests, redirection is paused and the request is served in SWB mode.
Design of SWR (2/3) Key Parameters Idea: redirect large SSD-writes to an idle HDD. (1) S: the size threshold; a request whose size exceeds S is redirected. (2) Smax: the initial value of S. (3) L: the SSD queue-length threshold; when the SSD queue length exceeds L, S is decreased. (4) p: the fixed step value by which SWR gradually decreases the size threshold S.
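Design of SWR (3/3) Redirecting Strategy Example: set L = 10, p = 0.1, S = Smax = 10. Step 1. Is the request's size > S? (e.g. 8 < 10: No; 15 > 10: Yes.) Step 2. Yes -> put the request in the HDD queue; No -> go to Step 3. Step 3. Put the request in the SSD queue; when the SSD queue length exceeds L, S is decreased (here to S = 9).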
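The redirecting strategy can be sketched as a small scheduler. This is a minimal illustration under stated assumptions, not the authors' implementation: the class and method names are hypothetical, the step is assumed to be additive (S decreases by p × Smax, which gives 10 -> 9 for p = 0.1), a request falls back to the SSD queue when no HDD is idle, and no rule for restoring S to Smax is modeled because the slides do not give one.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SWRScheduler:
    """Hypothetical sketch of the SWR size-threshold rule."""

    s_max: float = 10.0  # Smax: initial size threshold
    l: int = 10          # L: SSD queue-length threshold
    p: float = 0.1       # p: step for lowering S (assumed to be a fraction of Smax)
    s: float = field(init=False)
    ssd_queue: deque = field(default_factory=deque)
    hdd_queue: deque = field(default_factory=deque)

    def __post_init__(self):
        self.s = self.s_max  # S starts at Smax

    def submit(self, size, hdd_idle=True):
        # Steps 1-2: a request larger than S goes to the HDD queue (if an HDD is idle).
        if size > self.s and hdd_idle:
            self.hdd_queue.append(size)
            return "HDD"
        # Step 3: otherwise the request goes to the SSD queue.
        self.ssd_queue.append(size)
        # A long SSD queue lowers S so that more requests qualify for redirection.
        if len(self.ssd_queue) > self.l:
            self.s = max(0.0, self.s - self.p * self.s_max)
        return "SSD"
```

With the slide's settings, a request of size 8 stays on the SSD and one of size 15 is redirected; once the SSD queue grows past 10 entries, S drops to 9, so a request of size 9.5 would then be redirected as well.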
Evaluation (1/4) Experiment Setup ● Two types of SSDs: • A1, A2: a 256 GB Intel 600p SATA with 0.6 GB/s peak write • B, C1, C2: a 256 GB Samsung 960 EVO NVMe-SSD with 1.1 GB/s peak write ● HDD: 4 TB Seagate ST4000DM005 HDD with 180 MB/s peak write
Evaluation (2/4) SSD-write Reduction
Evaluation (3/4) Parameter Selection
Evaluation (4/4) Average Write Latency
Conclusion ● Some hybrid storage nodes in Pangu have write-dominated workload behaviors. ● The current request-serving mode in such nodes leads to SSD overuse, long-tail latency, and low HDD utilization. ● SWR redirects large SSD write requests to HDDs and dynamically optimizes for small, intensive burst requests.