Middleware Enabled Data Sharing on Cloud Storage Services
Jianzong Wang (Rice University, USA), Peter Varman (Rice University, USA), Changsheng Xie (HUST, China). Presentation by Qi Huang (HUST, Cornell)
Cloud Storage Overview • Storage service hosted in the cloud and reached over the Internet • Large deployment -> global accessibility • Cost efficient • Featured functions (protection/backup)
Wish List for a Storage System • Easy access • Low cost • Backup protection More… • Provides large capacity • Sustains long-term usage • Achieves high throughput • Provides sharing capability
Amazon: Existing Solutions • Simple Storage Service (S3): large capacity, high availability, sharable, but performance varies • Elastic Block Store (EBS): moderate capacity, high performance with small variation, but not sharable
Solution • A middleware layer – Provides a scalable storage service to clients – Allows storage volumes to be shared among multiple clients running simultaneously on multiple VMs – Provides high performance with less performance variation
Basic Ideas • mCloud – Combines S3, EBS and ELD to serve storage – Enables sharing of multiple EBS volumes among multiple EC2 VMs – Supports fast and transparent data migration between S3 and EBS – Incorporates other strategies to improve performance • Layered cache • Data chunking • Fair IO scheduling
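The layered cache listed above is not detailed on the slide. A minimal sketch of one way such a cache could behave is shown here, assuming a read-through policy with LRU eviction: faster tiers (labelled ELD and EBS) are checked first, and a plain dict stands in for S3. The names `Tier` and `LayeredStore`, the capacities, and the eviction policy are all illustrative assumptions, not mCloud's actual design.

```python
from collections import OrderedDict


class Tier:
    """One cache layer with a fixed capacity and LRU eviction (assumed policy)."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        # Returns None on a miss; stored values are assumed to be bytes.
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used


class LayeredStore:
    """Read-through lookup: fastest tier first, backing store last."""

    def __init__(self, tiers, backing):
        self.tiers = tiers      # ordered fastest to slowest, e.g. ELD then EBS
        self.backing = backing  # dict standing in for S3 (hypothetical)

    def read(self, key):
        for i, tier in enumerate(self.tiers):
            value = tier.get(key)
            if value is not None:
                # Promote the object into every faster tier above the hit.
                for upper in self.tiers[:i]:
                    upper.put(key, value)
                return value, tier.name
        value = self.backing[key]
        for tier in self.tiers:
            tier.put(key, value)
        return value, "S3"


store = LayeredStore(
    [Tier("ELD", capacity=1), Tier("EBS", capacity=2)],
    {"a": b"1", "b": b"2", "c": b"3"},
)
first = store.read("a")   # served by the backing store, then cached
second = store.read("a")  # served by the fastest tier
```

The second read of the same key is answered by the fastest tier, which is the effect a layered cache is meant to achieve; real tiers would differ in latency and cost rather than only in capacity.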
Contributions • We give a fresh way to identify and address performance problems in cloud storage. • We investigate various topological structures for data sharing on clouds in mCloud, using data-intensive applications and benchmarks. • We show potential schemes, for instance data placement, data chunking, and IO scheduling strategies, that can be integrated into mCloud to provide performance SLAs for cloud storage services.
Talk Overview • System Architecture • Data Sharing Approaches • Evaluations • Conclusion and Future Work
System Architecture
A Simple Sharing Method. Limitations: 1. Data transfer to and from S3 is slow. 2. EBS consumption grows even further when multiple volumes are shared.
Data Sharing Approach. Improvements: 1. Sharing happens at the ELD and EBS levels, which performs better. 2. EBS consumption grows only with the amount of data stored.
Evaluations: Basic Performance. Testbed Configuration. Takeaways: 1. Performance is stable up to the EBS level (within EC2). 2. Outside EC2, throughput becomes unstable and poor.
Evaluations: Scaling the number of EBS/EC2 for a single file size. Total throughput scales with the number of EBS/EC2.
Evaluations: Scaling the number of EBS/EC2 for application file settings. Writes scale better than reads.
Evaluations: Chunking Performance. Average throughput increases with the number of chunks.
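The slide does not say how chunks are formed or transferred. The sketch below assumes the simplest reading: an object is split into roughly equal pieces that are written and read concurrently, so several transfers can be in flight at once. The helper names (`split_into_chunks`, `put_chunked`, `get_chunked`), the `name/index` key layout, and the dict standing in for the storage backend are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, num_chunks):
    """Split bytes into at most num_chunks pieces of near-equal size."""
    if num_chunks < 1:
        raise ValueError("num_chunks must be at least 1")
    size = -(-len(data) // num_chunks) or 1  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def put_chunked(store, name, data, num_chunks):
    """Write each chunk under its own key, one worker per chunk."""
    chunks = split_into_chunks(data, num_chunks)

    def put_one(indexed):
        index, chunk = indexed
        store[f"{name}/{index}"] = chunk  # each worker writes a distinct key

    with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
        list(pool.map(put_one, enumerate(chunks)))
    return len(chunks)


def get_chunked(store, name, count):
    """Fetch all chunks concurrently and reassemble them in index order."""
    with ThreadPoolExecutor(max_workers=max(1, count)) as pool:
        parts = pool.map(lambda i: store[f"{name}/{i}"], range(count))
        return b"".join(parts)


backend = {}  # stands in for an S3 bucket or EBS volume
payload = bytes(range(10))
chunk_count = put_chunked(backend, "file", payload, num_chunks=4)
restored = get_chunked(backend, "file", chunk_count)
```

With an in-memory dict there is no speedup to observe; the point is only the structure. Against a real remote store, any benefit would depend on per-request latency and on the service's concurrency limits.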
Conclusions and Discussion • Hybrid Cloud Storage Architecture – How to group the optimization architecture to provide better storage services, and introduce a DHT (Distributed Hash Table) to maintain the metadata. • IO Scheduling – The switcher can control IO to balance the system load and avoid performance bursts. • Optimizing the Cloud Storage Medium – A key-value design may not be the best one; new designs are possible.
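The DHT for metadata is proposed as future work and is not specified further. As one possible illustration, the sketch below uses consistent hashing, a common building block of DHTs, to decide which node holds the metadata for a given object key. The `HashRing` class, the node names, and the number of virtual points per node are assumptions for illustration only.

```python
import bisect
import hashlib


def _hash(text):
    """Map a string to a 64-bit integer position on the ring."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent-hash ring mapping metadata keys to nodes."""

    def __init__(self, nodes, points_per_node=64):
        self.points_per_node = points_per_node
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Several virtual points per node spread the load more evenly.
        for i in range(self.points_per_node):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def lookup(self, key):
        """Return the node owning the first ring point at or after the key."""
        if not self._ring:
            raise LookupError("ring has no nodes")
        index = bisect.bisect_left(self._ring, (_hash(key),))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["vm-a", "vm-b", "vm-c"])  # hypothetical node names
keys = [f"volume-1/object-{i}" for i in range(200)]
before = {key: ring.lookup(key) for key in keys}
ring.remove("vm-b")
after = {key: ring.lookup(key) for key in keys}
```

Comparing `before` and `after` shows the property that makes this structure attractive for metadata: only keys that were owned by the removed node change owner, so membership changes move a limited share of the metadata.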
Thank You. Please feel free to email: jw19@rice.edu