- Slides: 17
Janus: Optimal Flash Provisioning for Cloud Storage Workloads
C. Albrecht, A. Merchant, M. Stokely, M. Waliji, F. Labelle, N. Coehlo, X. Shi, and C. E. Schrock. In Proceedings of the 2013 USENIX Annual Technical Conference, ATC '13, Berkeley, CA, USA, 2013. USENIX Association.
Presenter: Jung Young Jin, Embedded System Lab.
Jung Young Jin, Embedded System Lab.
Contents
- Introduction
- System description
- Workload characterization
- Economics and provisioning
- Optimizing the flash allocation for workloads
- Optimization with bounded write rates
- Evaluation
- Conclusion
Introduction
- HDD & SSD
  - Disks are slow, even as their capacities grow
  - We can compensate for this by adding flash storage
- Large cloud environment
  - Many users
  - Many workloads
  - Distributing the available flash capacity uniformly among the workloads is not ideal
Introduction
- Janus?
  - Provides flash storage allocation recommendations for workloads in a distributed file system
  - Used in distributed file systems: GFS, Colossus, …
- Google workload
  - Analyzed workload characteristics
    - Most I/O accesses -> recently created files
    - 28% of read operations -> 1% of the data
  - Files are placed in flash upon creation
System description
- Recommendation
  - Runs periodically to adjust the allocation
  - Workloads with many read operations -> flash storage
  - Key input
    - Age of data
    - Read rate of the data by age
- [Figure: Colossus or GFS, hybrid storage]
- Janus work steps
  - Collect the age of data and a characterization of how cacheable each workload is
  - Allocate flash among the workloads
  - Coordinate with the distributed file system
Workload characterization
- Workload
  - A large application has many jobs
  - Need to define a metric that lets us compare how many read operations would be served from flash
Workload characterization
- Cacheability functions
  - FIFO eviction instance
    - How much data there is of a given age
    - How many reads there are to files of a given age
  - LRU eviction instance
    - Amount of data with a given temporal locality
    - Rate of reads to files with that temporal locality (time gap)
- Sigelman, B. H., et al. Dapper, a large-scale distributed systems tracing infrastructure. Tech. rep., Google, Inc., 2010.
Workload characterization
- Obtaining instance
  - From file system metadata
  - From trace samples
- Function input/output
  - Input: size of data
  - Output: the number of read operations
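The FIFO cacheability function described on these slides can be sketched in a few lines. This is a minimal illustration, not the Janus implementation: it assumes the age histogram is given as a list of buckets, youngest first, each with an amount of data and a read rate, and that reads are spread evenly within a bucket. The name `fifo_cacheability` and the bucket format are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def fifo_cacheability(age_buckets):
    """Build c(x): read rate served from flash when the youngest x bytes are on flash.

    age_buckets: list of (data_bytes, reads_per_sec) tuples, youngest bucket first.
    This input format is an assumption made for the sketch.
    """
    # Cumulative data size and cumulative read rate at each bucket boundary.
    sizes = [0.0] + list(accumulate(b[0] for b in age_buckets))
    reads = [0.0] + list(accumulate(b[1] for b in age_buckets))

    def c(flash_bytes):
        if flash_bytes <= 0:
            return 0.0
        if flash_bytes >= sizes[-1]:
            return reads[-1]  # all data fits on flash
        # Find the bucket that contains the cut-off point and interpolate linearly.
        i = bisect_right(sizes, flash_bytes) - 1
        frac = (flash_bytes - sizes[i]) / (sizes[i + 1] - sizes[i])
        return reads[i] + frac * (reads[i + 1] - reads[i])

    return c


# Hypothetical workload: young data is small but read heavily.
c = fifo_cacheability([(100, 50), (100, 30), (800, 20)])
print(c(150))
```

With FIFO eviction and files placed on flash at creation, a flash allocation of x bytes holds the youngest x bytes, which is why cumulative sums over the age histogram are enough here.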
Economics and provisioning
- Peak IOPS and capacity requirements
- [Figure regions: cost effective to put workloads entirely in flash; cost effective to put hot portions of the data on flash]
Optimizing the flash allocation for workloads
- Determine the best flash allocation for each workload
- Primary goal
  - Maximize the aggregate rate of read operations served from flash
  - Instance
    - Workloads with cacheability functions
    - Total flash capacity
  - Task
    - Allocate flash to workloads to maximize the weighted flash read rate
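One way to picture the allocation task is a greedy pass over the segments of each cacheability function. The sketch below assumes every cacheability function is piecewise linear and concave (each later segment yields no more reads per byte than the one before), in which case filling flash in order of weighted reads per byte maximizes the weighted flash read rate. The function name `allocate_flash` and the input layout are hypothetical, and this is not claimed to be the solver Janus uses.

```python
def allocate_flash(workloads, total_flash):
    """Greedy flash allocation under an assumed concave, piecewise-linear model.

    workloads: {name: (weight, [(segment_bytes, segment_reads_per_sec), ...])}
               segments are listed youngest first.
    Returns {name: allocated_bytes}.
    """
    # Flatten all segments and score each by weighted reads per byte.
    segments = []
    for name, (weight, segs) in workloads.items():
        for size, reads in segs:
            if size > 0:
                segments.append((weight * reads / size, name, size))

    # Highest density first; Python's sort is stable, so equal-density
    # segments of one workload keep their original order.
    segments.sort(key=lambda s: s[0], reverse=True)

    allocation = {name: 0.0 for name in workloads}
    remaining = total_flash
    for _density, name, size in segments:
        if remaining <= 0:
            break
        take = min(size, remaining)  # the last segment may be taken partially
        allocation[name] += take
        remaining -= take
    return allocation


# Hypothetical instance with two workloads and 250 bytes of flash.
example = {
    "a": (1.0, [(100, 50), (100, 30)]),
    "b": (1.0, [(100, 40), (100, 10)]),
}
print(allocate_flash(example, 250))
```

Concavity matters: without it, a later high-density segment could be selected before the earlier segments that must be cached first, and the greedy result would no longer correspond to a valid FIFO allocation.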
Optimization with bounded write rates
- Secondary goal
  - Bound the flash write rate to reduce flash wear
  - Instance
    - Workloads with cacheability function and write rate
    - Bound on the flash write rate
    - Total flash capacity
  - Task
    - Allocate flash to workloads and determine the write probability for each workload to maximize the flash read rate
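To make the roles of the flash allocation and the write probability concrete, the sketch below only evaluates a candidate plan against the two constraints; it does not solve the optimization. It rests on a modeling assumption that is not stated on the slide: a workload that sends a fraction p of its new files to flash behaves like a scaled-down copy of itself, so with x bytes of flash it serves p * c(x / p) reads per second and writes p * w to flash. All names here (`evaluate_plan`, the dictionary layouts) are hypothetical.

```python
def evaluate_plan(workloads, plan, total_flash, write_bound):
    """Score a candidate plan under the assumed scaled-workload model.

    workloads: {name: (cacheability_fn, write_rate)}
    plan:      {name: (flash_bytes, write_probability)}
    Returns (flash_read_rate, flash_write_rate, feasible).
    """
    used = sum(x for x, _p in plan.values())
    writes = sum(p * workloads[name][1] for name, (_x, p) in plan.items())

    reads = 0.0
    for name, (x, p) in plan.items():
        cacheability, _write_rate = workloads[name]
        if p > 0:
            # Assumption: a fraction p of the workload with x bytes of flash
            # looks like the full workload with x / p bytes, scaled by p.
            reads += p * cacheability(x / p)

    feasible = used <= total_flash and writes <= write_bound
    return reads, writes, feasible


# Hypothetical workload: reads grow linearly up to 100 bytes of young data.
def c(x):
    return 0.5 * min(x, 100)


workloads = {"a": (c, 10.0)}
print(evaluate_plan(workloads, {"a": (50, 1.0)}, total_flash=50, write_bound=5.0))
print(evaluate_plan(workloads, {"a": (50, 0.5)}, total_flash=50, write_bound=5.0))
```

In this toy instance, writing every new file to flash exceeds the write bound, while a write probability of 0.5 meets the bound with the same flash read rate, which illustrates why the write probability is a useful second decision variable.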
Evaluation
- Flash hit rate during training
Evaluation
- Flash usage and flash read rate for one workload over time
Evaluation
- Comparison of flash hit rates for alternative allocation methods

| Allocation method | Cell A (low workload variance) | Cell B (high workload variance) |
|---|---|---|
| Optimized | 28% | 74% |
| Proportional to read rate | 26% | 64% |
| Single FIFO | 19% | 42% |
| Proportional to data size | 14% | 15% |
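The table is easier to read as relative gains. The short sketch below computes how much the optimized allocation improves the flash hit rate over each alternative, using only the percentages from the table; the helper name `relative_improvement` is made up for this example.

```python
# Flash hit rates (percent) from the table: (Cell A, Cell B).
hit_rates = {
    "Optimized": (28, 74),
    "Proportional to read rate": (26, 64),
    "Single FIFO": (19, 42),
    "Proportional to data size": (14, 15),
}


def relative_improvement(new, base):
    """Relative gain of `new` over `base`, in percent."""
    return (new - base) / base * 100


opt_a, opt_b = hit_rates["Optimized"]
for method, (cell_a, cell_b) in hit_rates.items():
    if method == "Optimized":
        continue
    gain_a = relative_improvement(opt_a, cell_a)
    gain_b = relative_improvement(opt_b, cell_b)
    print(f"{method}: Cell A +{gain_a:.0f}%, Cell B +{gain_b:.0f}%")
```

Against a single FIFO shared by all workloads, the optimized allocation comes out roughly 47% better in Cell A and 76% better in Cell B, and the gap is larger in the cell with high workload variance.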
Reference
- 2013 USENIX Annual Technical Conference, presentation video
Q&A