Unistore: A Unified Storage Architecture for Cloud Computing

Project Members: Wei Xie, Dr. Jiang Zhou, Dr. Yong Chen
Presented by Wei Xie
Data-Intensive Scalable Computing Laboratory (DISCL)
Computer Science Department, Texas Tech University
CAC@TTU Semiannual Meeting, April 9-10, 2015, Texas Tech University

Unistore: review of project plan

• Based on the Sheepdog distributed store for virtual machines
• Optimization for heterogeneous storage (SSDs and HDDs)
• Optimization for heterogeneous workloads

Characterization Component
• Workloads: access patterns
• Devices: bandwidth, throughput, block erasure, concurrency, wear-leveling

Data Placement Component
• I/O pattern: random/sequential, read/write, hot/cold
• API: Write_to_SSD, Read_from_SSD, Write_to_HDD
• Placement algorithm: modified consistent hash

Planned Schedule

• 2015, Q1: investigation and survey about Unistore
• 2015, Q2: characterization component development of Unistore
• 2015, Q3: metadata management of Unistore
• 2015, Q4: data distribution management of Unistore
• 2016, Q1: VM image store and loading
• 2016, Q2: advanced functions of Unistore
• 2016, Q3: performance optimization of Unistore
• 2016, Q4: module integration and system benchmarking

Status legend: finished, on-going, to-be-done

Sheepdog: components

• Cluster manager
• QEMU block driver
• Object storage
• Gateway
• Object manager

Sheepdog: object storage

Object type        What it contains
data object        Actual VDI data
vdi object         Metadata of the image (name, size, data object IDs, etc.)
vmstate object     State info (used for snapshots)
vdi attr object    Extended attributes

    struct sd_inode {
        char name[SD_MAX_VDI_LEN];            /* the name of this VDI */
        uint64_t ctime;                       /* creation time of this VDI */
        uint64_t vdi_size;                    /* the size of this VDI */
        uint64_t vm_state_size;               /* the size of the VM state (used for live snapshots) */
        uint8_t nr_copies;                    /* the number of object replicas */
        uint32_t data_vdi_id[MAX_DATA_OBJS];  /* the data object IDs this VDI contains */
        /* …… */
    };

Sheepdog: gateway

• Responsible for deciding where to store objects (data placement)
• Consistent hashing
  • Adding or removing a node does not significantly change the mapping
  • I/O load balance
• How can consistent hashing support heterogeneous devices?
  • Two hash rings, one for HDDs and one for SSDs
  • A write is routed to Write_to_HDD or Write_to_SSD
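
The two-ring idea can be sketched roughly as follows. This is a minimal illustration, not Sheepdog's or Unistore's actual code: the struct and function names, the FNV-1a hash, the fixed ring capacity, and the virtual-node count are all assumptions made for the example. Each device class gets its own consistent-hash ring, and a lookup first picks the ring and then walks clockwise to the first point at or after the object's hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_POINTS 256            /* illustrative capacity of one ring */

enum device_class { DEV_HDD, DEV_SSD };

struct ring_point {               /* one (virtual) node position on a ring */
    uint64_t hash;
    int node_id;
};

struct hash_ring {                /* points are kept sorted by hash */
    struct ring_point points[MAX_POINTS];
    size_t nr_points;
};

/* 64-bit FNV-1a, used here only as a stand-in hash function. */
uint64_t fnv1a64(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint64_t h = 14695981039346656037ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

static int cmp_point(const void *a, const void *b)
{
    const struct ring_point *x = a, *y = b;
    return (x->hash > y->hash) - (x->hash < y->hash);
}

/* Add a node as `vnodes` virtual points, then re-sort the ring. */
void ring_add_node(struct hash_ring *r, int node_id, int vnodes)
{
    for (int v = 0; v < vnodes; v++) {
        assert(r->nr_points < MAX_POINTS);
        int key[2] = { node_id, v };
        r->points[r->nr_points].hash = fnv1a64(key, sizeof(key));
        r->points[r->nr_points].node_id = node_id;
        r->nr_points++;
    }
    qsort(r->points, r->nr_points, sizeof(r->points[0]), cmp_point);
}

/* Return the node owning the first point at or after the object's hash,
 * wrapping around to the first point of the ring if necessary. */
int ring_lookup(const struct hash_ring *r, uint64_t oid)
{
    assert(r->nr_points > 0);
    uint64_t h = fnv1a64(&oid, sizeof(oid));
    size_t lo = 0, hi = r->nr_points;
    while (lo < hi) {             /* binary search for first hash >= h */
        size_t mid = lo + (hi - lo) / 2;
        if (r->points[mid].hash < h)
            lo = mid + 1;
        else
            hi = mid;
    }
    return r->points[lo == r->nr_points ? 0 : lo].node_id;
}

/* Choose the ring by device class, then place the object on that ring. */
int place_object(const struct hash_ring *hdd_ring,
                 const struct hash_ring *ssd_ring,
                 uint64_t oid, enum device_class cls)
{
    return ring_lookup(cls == DEV_SSD ? ssd_ring : hdd_ring, oid);
}
```

Because the rings are independent, adding an SSD node only remaps objects on the SSD ring; HDD placements are unaffected.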

Deployment

• 3 CentOS 6.5 virtual machines on an iMac workstation
• Sheepdog is built on the 3 virtual machines, which form a cluster
• corosync manages the cluster (can switch to ZooKeeper if necessary)
• Will migrate to a real Linux cluster later (for testing)

Benchmarking

• Planned testing tools:
  • fio, dd for generating synthetic workloads
  • Real workload benchmark
  • iostat
• Comparison with other products
  • GlusterFS
  • Ceph

Characterization Component

• Workload characterization
  • Hot/cold data detection and separation
    • Multiple bloom filter [1]
    • Temporal locality [2]
  • Spatial locality analysis (is it more random or sequential?)
  • I/O size, write/read ratio, inter-arrival time, queue depth, latency, and IOPS
• Online I/O trace collection for off-line analysis
• Online analysis [3]

[1] Park, Dongchul, and David H.C. Du. "Hot data identification for flash-based storage systems using multiple bloom filters." Mass Storage Systems and Technologies (MSST), 2011 IEEE 27th Symposium on. IEEE, 2011.
[2] Wei Xie, Yong Chen, and Philip Roth. "A Low-cost Adaptive Data Separation Method for the Flash Translation Layer of Solid State Drives." To be published.
[3] "Easy and Efficient Disk I/O Workload Characterization in VMware ESX Server."
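
As a rough illustration of the spatial-locality and read/write metrics listed on this slide, the sketch below accumulates a few per-request counters. It is not the project's code: the struct, the function names, and the definition of "sequential" (a request that starts exactly where the previous one ended) are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct io_stats {
    uint64_t nr_requests;    /* total requests seen */
    uint64_t nr_sequential;  /* requests starting where the previous one ended */
    uint64_t nr_writes;      /* write requests */
    uint64_t total_bytes;    /* sum of request sizes */
    uint64_t next_offset;    /* end offset of the previous request */
    bool has_prev;           /* false until the first request is recorded */
};

/* Record one request. The struct must be zero-initialized before first use. */
void io_stats_record(struct io_stats *s, uint64_t offset,
                     uint32_t length, bool is_write)
{
    assert(s != NULL);
    if (s->has_prev && offset == s->next_offset)
        s->nr_sequential++;
    if (is_write)
        s->nr_writes++;
    s->nr_requests++;
    s->total_bytes += length;
    s->next_offset = offset + length;
    s->has_prev = true;
}

/* Fraction of requests classified as sequential (0.0 if no requests). */
double io_stats_sequential_ratio(const struct io_stats *s)
{
    return s->nr_requests ? (double)s->nr_sequential / (double)s->nr_requests : 0.0;
}

/* Fraction of requests that are writes (0.0 if no requests). */
double io_stats_write_ratio(const struct io_stats *s)
{
    return s->nr_requests ? (double)s->nr_writes / (double)s->nr_requests : 0.0;
}

/* Mean request size in bytes (0.0 if no requests). */
double io_stats_mean_size(const struct io_stats *s)
{
    return s->nr_requests ? (double)s->total_bytes / (double)s->nr_requests : 0.0;
}
```

The same counters work for either mode mentioned above: fed from a collected trace for off-line analysis, or updated on the I/O path for online analysis.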

Initial Characterization Result

• Plan to implement online hot/cold data detection
• Hot data is stored on SSD, cold data on HDD
• Initial results were collected from an OLTP I/O trace
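
One way online hot/cold detection could feed the SSD/HDD decision is sketched below. It is a deliberately simplified variant of the multiple-bloom-filter idea, not the exact algorithm of the cited paper and not Unistore's implementation: the filter count, filter size, decay period, hot threshold, hash mixer, and all names are assumptions. Each write records the block in the first filter that does not yet contain it, so the number of filters containing a block approximates its recent write count; periodically clearing one filter ages out old information.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* All sizes and thresholds are illustrative assumptions. */
#define NR_FILTERS    4      /* number of bloom filters */
#define FILTER_BITS   4096   /* bits per filter */
#define NR_HASHES     2      /* hash functions per filter */
#define DECAY_PERIOD  1024   /* writes between clearing one filter */
#define HOT_THRESHOLD 2      /* hot if present in at least this many filters */

enum device { DEVICE_HDD, DEVICE_SSD };

struct hot_detector {
    uint8_t bits[NR_FILTERS][FILTER_BITS / 8];
    unsigned next_decay;     /* filter to clear at the next decay step */
    unsigned writes;         /* writes seen since the last decay step */
};

/* Seeded 64-bit mixer mapping a block address to a bit index. */
static uint32_t bit_index(uint64_t lba, uint32_t seed)
{
    uint64_t x = lba + 0x9e3779b97f4a7c15ULL * (seed + 1);
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return (uint32_t)(x % FILTER_BITS);
}

static bool filter_contains(const struct hot_detector *d, unsigned f, uint64_t lba)
{
    for (uint32_t k = 0; k < NR_HASHES; k++) {
        uint32_t b = bit_index(lba, k);
        if (!(d->bits[f][b / 8] & (1u << (b % 8))))
            return false;
    }
    return true;
}

static void filter_insert(struct hot_detector *d, unsigned f, uint64_t lba)
{
    for (uint32_t k = 0; k < NR_HASHES; k++) {
        uint32_t b = bit_index(lba, k);
        d->bits[f][b / 8] |= (uint8_t)(1u << (b % 8));
    }
}

void hot_detector_init(struct hot_detector *d)
{
    memset(d, 0, sizeof(*d));
}

/* Record one write, then clear one filter every DECAY_PERIOD writes. */
void hot_detector_record_write(struct hot_detector *d, uint64_t lba)
{
    assert(d != NULL);
    for (unsigned f = 0; f < NR_FILTERS; f++) {
        if (!filter_contains(d, f, lba)) {
            filter_insert(d, f, lba);
            break;
        }
    }
    if (++d->writes == DECAY_PERIOD) {
        memset(d->bits[d->next_decay], 0, sizeof(d->bits[0]));
        d->next_decay = (d->next_decay + 1) % NR_FILTERS;
        d->writes = 0;
    }
}

/* A block is hot if enough filters (probably) contain it. */
bool hot_detector_is_hot(const struct hot_detector *d, uint64_t lba)
{
    unsigned hits = 0;
    for (unsigned f = 0; f < NR_FILTERS; f++)
        if (filter_contains(d, f, lba))
            hits++;
    return hits >= HOT_THRESHOLD;
}

/* Placement policy from the slide: hot data on SSD, cold data on HDD. */
enum device choose_device(const struct hot_detector *d, uint64_t lba)
{
    return hot_detector_is_hot(d, lba) ? DEVICE_SSD : DEVICE_HDD;
}
```

Bloom filters can report false positives, so a cold block may occasionally be classified as hot; the filter size and threshold trade memory against that error rate.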

Backup slides

Sheepdog: strong consistency with epoch

• To keep replicated data consistent, an epoch is used
• The epoch log keeps the history of node membership
• The epoch is checked before every read/write

Epoch table (partially lost in extraction): epochs 1, 2, 3; node memberships A, B, C and A
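
The epoch mechanism can be pictured with the following sketch. It is a minimal model under stated assumptions, not Sheepdog's data structures: the log layout, the capacity limits, the single-character node names, and every identifier are hypothetical. Each membership change appends an entry with the next epoch number, and a request's epoch is compared with the current one before it is served.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_EPOCHS 16   /* illustrative log capacity */
#define MAX_NODES  8    /* illustrative cluster size limit */

struct epoch_entry {
    uint32_t epoch;          /* epoch number, starting at 1 */
    int nr_nodes;            /* number of members in this epoch */
    char nodes[MAX_NODES];   /* member names, one character each */
};

struct epoch_log {
    struct epoch_entry entries[MAX_EPOCHS];
    int nr_entries;
};

enum epoch_status {
    EPOCH_MATCH,         /* request epoch equals the current epoch */
    EPOCH_REQUEST_OLD,   /* sender has an outdated view of the cluster */
    EPOCH_REQUEST_NEW    /* receiver has not caught up yet */
};

/* Current epoch, or 0 if no membership has been recorded. */
uint32_t epoch_current(const struct epoch_log *log)
{
    return log->nr_entries ? log->entries[log->nr_entries - 1].epoch : 0;
}

/* Record a membership change and return the new epoch number. */
uint32_t epoch_advance(struct epoch_log *log, const char *nodes, int nr_nodes)
{
    assert(log->nr_entries < MAX_EPOCHS);
    assert(nr_nodes >= 0 && nr_nodes <= MAX_NODES);
    struct epoch_entry *e = &log->entries[log->nr_entries];
    e->epoch = epoch_current(log) + 1;
    e->nr_nodes = nr_nodes;
    memcpy(e->nodes, nodes, (size_t)nr_nodes);
    log->nr_entries++;
    return e->epoch;
}

/* Compare a request's epoch with the current one before read/write. */
enum epoch_status epoch_check(const struct epoch_log *log, uint32_t req_epoch)
{
    uint32_t cur = epoch_current(log);
    if (req_epoch < cur)
        return EPOCH_REQUEST_OLD;
    if (req_epoch > cur)
        return EPOCH_REQUEST_NEW;
    return EPOCH_MATCH;
}
```

How a node reacts to a mismatch (retrying, refreshing membership, or reading a stale object) is a policy decision left out of this sketch.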

Sheepdog: a read/write request

    default_write(oid, iocb)
    {
        /* … */

        /* check whether the epoch in iocb matches the system epoch */
        if (iocb->epoch < sys_epoch()) {
            /* debug msg */
        }

        /* write journal */

        /* write data to the object file */
        size = xpwrite(fd, iocb->buf, length, offset);
    }

    default_read(oid, iocb)
    {
        /* … */

        /* get the path of the object file to read */
        get_store_path(oid, iocb->ec_index, path);

        /* read from the object file into the iocb buffer */
        default_read_from_path(oid, path, iocb);

        /* if the object cannot be read and the request epoch is older
         * than the system epoch, read from the stale object instead */
        if (no object found && iocb->epoch > 0 && iocb->epoch < sys_epoch())
            read_from_stale_path(oid, path, iocb);
    }

Please take a moment to fill out your L.I.F.E. forms.

http://iucrclife.chass.ncsu.edu/lifeforms/

• What do you like about this project?
• What would you change? (Please include all relevant feedback.)