HyLog: A High Performance Approach to Managing Disk Layout
Wenguang Wang, Yanping Zhao, Rick Bunt
Department of Computer Science, University of Saskatchewan, Saskatoon, Canada
April 1, 2004 USENIX FAST 2004 Wenguang Wang

Background
• The write performance of a storage system is impacted by
  – the disk characteristics
    • Disk positioning time
    • Transfer bandwidth
  – the strategy for writing
    • Overwrite
    • LFS (Log-structured File System)

Overwrite
• Idea: new data are overwritten on top of old data
• Problem: lots of time is lost in disk arm positioning in workloads with small writes scattered over the disk

LFS
• Idea: new data are accumulated and written to new disk locations in large sequential transfers
• Assumption about the disk characteristics: large sequential transfers are more efficient than small block transfers

LFS (cont.)
• Advantages
  – good write performance
  – no small write penalty on RAID-5
  – fast recovery
  – easy to support snapshot and versioning (WAFL)
• Problems: segment cleaning is expensive
  – For a year 1991 disk, TPC-B workload, and 50% disk space utilization, cleaning overhead reduces overall system throughput by 33% (Seltzer et al., USENIX '95)

Motivation
• Observation: disk sequential transfer bandwidth has improved 10x more than positioning time

                      DEC RZ26       Cheetah X15 36LP    Diff.
                      (year 1991)    (year 2003)
  Positioning time    15 ms          5.6 ms              2.7x
  Transfer B/w        2.3 MB/s       61 MB/s             27x

• Question: how are Overwrite and LFS affected by this trend?
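The ratios in the Motivation table can be rechecked with a few lines of Python. This is only a sanity check on the figures quoted on the slide; the variable names are illustrative and not from the talk.

```python
# Figures copied from the Motivation slide's table.
positioning_ms = {"1991": 15.0, "2003": 5.6}   # DEC RZ26 vs. Cheetah X15 36LP
bandwidth_mb_s = {"1991": 2.3, "2003": 61.0}

# Improvement factors between the 1991 and 2003 disks.
positioning_gain = positioning_ms["1991"] / positioning_ms["2003"]
bandwidth_gain = bandwidth_mb_s["2003"] / bandwidth_mb_s["1991"]

# How much more bandwidth improved than positioning time.
relative_gain = bandwidth_gain / positioning_gain

print(f"positioning time improved {positioning_gain:.1f}x")
print(f"transfer bandwidth improved {bandwidth_gain:.1f}x")
print(f"bandwidth improved {relative_gain:.1f}x more than positioning time")
```

The last ratio comes out close to 10, which is the "10x more" quoted in the observation.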

Objective
• Revisit the performance of LFS under modern and future disks
• Evaluate the performance of LFS under disk arrays and concurrent users
• Attempt to perform better than LFS and Overwrite

Outline
• Background, Motivation, and Objective
• The analysis of LFS and Overwrite
• The design of HyLog
• Experimental methodology and results
• Conclusions and future work

Experimental Parameters
• Three SCSI disks
  – DEC RZ26 (year 1991)
  – Quantum Atlas 10K (year 1999)
  – Cheetah X15 36LP (year 2003)
• Page size: 8 KB
• Workload: uniformly distributed random update (TPC-B)

Modeling Write Performance
• In Overwrite: time to write N pages is T1
• In LFS: time to write a segment containing N pages is T2
• T1 > T2
• Segment I/O Efficiency = T1 / T2
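The slide defines T1 and T2 but does not say how they are computed. The sketch below assumes the simplest possible disk model: every request pays one positioning delay plus its transfer time. The function names, the 64-page segment, and the use of the two disks from the Motivation table are assumptions made for illustration only.

```python
PAGE_KB = 8  # page size from the Experimental Parameters slide


def overwrite_time_ms(n_pages, positioning_ms, bandwidth_mb_s):
    """T1 (assumed model): N scattered page writes, each with its own positioning delay."""
    transfer_ms = PAGE_KB / bandwidth_mb_s  # KB / (MB/s) = ms, taking 1 MB = 1000 KB
    return n_pages * (positioning_ms + transfer_ms)


def segment_time_ms(n_pages, positioning_ms, bandwidth_mb_s):
    """T2 (assumed model): one positioning delay, then N pages in one sequential transfer."""
    transfer_ms = PAGE_KB / bandwidth_mb_s
    return positioning_ms + n_pages * transfer_ms


def segment_io_efficiency(n_pages, positioning_ms, bandwidth_mb_s):
    """Segment I/O Efficiency = T1 / T2, as defined on the slide."""
    t1 = overwrite_time_ms(n_pages, positioning_ms, bandwidth_mb_s)
    t2 = segment_time_ms(n_pages, positioning_ms, bandwidth_mb_s)
    return t1 / t2


# Hypothetical 64-page segments on the two disks from the Motivation table.
for label, pos, bw in [("DEC RZ26 (1991)", 15.0, 2.3),
                       ("Cheetah X15 36LP (2003)", 5.6, 61.0)]:
    print(label, round(segment_io_efficiency(64, pos, bw), 1))
```

Under this model the efficiency grows with segment size and is much higher on the newer disk, because transfer bandwidth improved far more than positioning time.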

A Simple Scenario
• Assume the segments to be cleaned are always 80% utilized (cleaning space utilization = 80%)
• LFS requires 5 seg. reads and 4 seg. writes to reclaim a free segment
• LFS requires 10 seg. I/Os (9 seg. for cleaning, 1 seg. for new data) to write a segment
• If Segment I/O Efficiency > 10, LFS is still faster than Overwrite!
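The arithmetic in this scenario generalizes to any cleaning space utilization u, under the same assumption that every cleaned segment is equally utilized: the cleaner reads 1/(1-u) segments and writes back u/(1-u) segments of live data to free one segment, and the new data costs one more segment write. The following is a minimal sketch of that calculation; the function names are illustrative.

```python
def segment_ios_per_new_segment(cleaning_utilization):
    """Segment I/Os LFS spends to write one segment of new data.

    Assumes every cleaned segment has the same utilization u, as in the slide.
    """
    u = cleaning_utilization
    if not 0 <= u < 1:
        raise ValueError("cleaning utilization must be in [0, 1)")
    reads = 1 / (1 - u)    # segments read to collect one segment of free space
    writes = u / (1 - u)   # segments of live data written back
    return reads + writes + 1  # plus one segment write for the new data


def lfs_beats_overwrite(segment_io_efficiency, cleaning_utilization):
    """LFS wins when one segment I/O replaces more page writes than cleaning costs."""
    return segment_io_efficiency > segment_ios_per_new_segment(cleaning_utilization)


print(round(segment_ios_per_new_segment(0.8), 6))  # the slide's 80% case
print(lfs_beats_overwrite(33, 0.8))
print(lfs_beats_overwrite(5, 0.8))
```

At 80% utilization the function reproduces the slide's 5 reads, 4 writes, and 10 segment I/Os in total.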

Segment I/O Efficiency
[Figure: segment I/O efficiency; marked value: 33]

Overwrite vs. LFS
[Figure: 1999 disk. Curve labels: cleaning, Overwrite, LFS, hole-plugging. Marked value: 0.88]

Overwrite vs. LFS
• The crossing point where LFS has the same performance as Overwrite

  Year of Disk    Cleaning Space Utilization    Disk Space Utilization
  1991            0.52                          0.74
  1999            0.88                          0.94
  2003            0.94                          0.97
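The slides do not state how the crossing points were computed. If one assumes the uniform-utilization model of the simple scenario, LFS needs 2/(1-u) segment I/Os per segment of new data, so LFS and Overwrite break even when the segment I/O efficiency E equals 2/(1-u), that is, at u = 1 - 2/E. The sketch below inverts the cleaning space utilization column under that assumption; the function names are illustrative.

```python
def break_even_cleaning_utilization(efficiency):
    """Cleaning space utilization at which LFS and Overwrite perform the same (assumed model)."""
    return 1 - 2 / efficiency


def implied_efficiency(cleaning_utilization):
    """Segment I/O efficiency that would put the crossing point at this utilization."""
    return 2 / (1 - cleaning_utilization)


# Cleaning space utilization at the crossing point, copied from the table.
crossing = {1991: 0.52, 1999: 0.88, 2003: 0.94}
for year, u in crossing.items():
    print(year, round(implied_efficiency(u), 1))
```

For the 2003 disk the implied efficiency is about 33, in line with the value 33 that appears on the Segment I/O Efficiency slide. That agreement suggests, but does not prove, that the table follows this model.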

Disk Access Characteristics
• In most workloads, most writes are to a small number of pages (hot pages)
• Impact of skewness on LFS performance
  – Most of the cleaning cost comes from cold pages
  – Most of the good write performance comes from accumulating the writes to hot pages

HyLog (Hybrid Log-structured Approach)
• Separates the disk into two partitions: hot partition and cold partition
• Uses log-structured approach to manage the hot partition
• Uses overwrite to manage the cold partition

Performance Potential of HyLog
[Figure: write cost of LFS, Overwrite, and HyLog]
Disk: year 2003, workload: 80% of references are to 20% of pages

Design of HyLog
• Key design issue: page separating algorithm
  – Collects page write frequencies
  – Finds the hot page proportion to minimize expected write cost
  – Determines the threshold of write frequency from the desired hot page proportion
  – Uses the threshold to distinguish hot pages from cold pages
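The four steps of the page separating algorithm can be sketched as follows. This is a minimal illustration, not the implementation from the talk: the slides do not give the expected write cost formula, so `toy_cost` is a made-up placeholder, and the function names, candidate proportions, and write counts are all hypothetical.

```python
def hot_write_share(ranked_counts, k):
    """Fraction of all writes that go to the k most frequently written pages."""
    return sum(ranked_counts[:k]) / sum(ranked_counts)


def separate_pages(write_counts, expected_cost, candidates):
    """Return (hot_proportion, threshold, hot_pages) for the cheapest candidate."""
    ranked = sorted(write_counts.values(), reverse=True)

    def pages_for(proportion):
        return max(1, round(proportion * len(ranked)))

    # Step 2: pick the hot page proportion with the lowest expected write cost.
    best = min(
        candidates,
        key=lambda p: expected_cost(p, hot_write_share(ranked, pages_for(p))),
    )
    # Step 3: the threshold is the write count of the coldest page still classed as hot.
    threshold = ranked[pages_for(best) - 1]
    # Step 4: pages written at least `threshold` times are hot (ties may add pages).
    hot_pages = {page for page, n in write_counts.items() if n >= threshold}
    return best, threshold, hot_pages


def toy_cost(hot_proportion, hot_share):
    """Made-up cost model, NOT the one from the talk.

    Writes to the log-managed hot partition get dearer as that partition grows;
    overwrites to the cold partition have a flat cost.
    """
    return hot_share * (1 + 4 * hot_proportion) + (1 - hot_share) * 5.0


# Step 1: per-page write frequencies (hypothetical numbers).
write_counts = {"a": 50, "b": 30, "c": 10, "d": 5, "e": 3, "f": 2}
proportion, threshold, hot_pages = separate_pages(
    write_counts, toy_cost, [0.2, 0.4, 0.6, 0.8]
)
print(proportion, threshold, sorted(hot_pages))
```

Passing the cost model in as a function keeps the sketch independent of whichever write cost estimate is actually used.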

Evaluation Methodology
• Trace-driven simulation
  – Overwrite, LFS, WOLF, and HyLog are implemented
  – TPC-C, Email, and OLTP traces
  – Year 1999, 2003, and 2008 disk models
  – No think time between requests
• Metrics
  – Throughput: number of I/O requests finished per second

Results – HyLog Page Separating Algorithm
• HyLog adjusts the hot page proportion between 35% and 45%
• TPC-C trace with 20 users and 4 disks, 98% disk space utilization

Results – Disk Space Utilization
[Figure: 1999 disk]
TPC-C trace with 20 users and 4 disks

Results – Disk Type
[Figure: 1999 disk and 2003 disk. Curve labels: LFS/HyLog (year '03 disk), Overwrite (year '03 disk)]
TPC-C trace with 20 users and 4 disks

Results – Disk Type (cont.)
[Figure: 1999, 2003, and 2008 disks. Curve labels: LFS/HyLog (year '08 disk), LFS/HyLog (year '03 disk), Overwrite (year '08 disk), Overwrite (year '03 disk)]
TPC-C trace with 20 users and 4 disks

Results – Number of Users
TPC-C trace, disk space utilization 98%, year 1999 disk

Results – Number of Disks
TPC-C trace, disk space utilization 98%, year 1999 disk

Results – RAID-5
TPC-C trace, year 1999 disk, 8-disk RAID-0, 9-disk RAID-5

Results – Other Traces
Year 1999 disk

Conclusions
• On modern and future disks, LFS significantly outperforms Overwrite unless the disk space utilization is very high
• HyLog performs comparably to the best of Overwrite, LFS, and WOLF

Future Work
• Add fast recovery support in HyLog
  – All meta-data are treated as hot pages
• Stabilize the NetBSD LFS implementation and measure its performance
• Implement and evaluate HyLog in NetBSD

Results – # Users and Disks
[Figure: HyLog and LFS. Axes: Number of Disks, Number of Users]
TPC-C trace, disk space utilization 98%, year 1999 disk