The 5 Minute Rule Jim Gray Microsoft Research

Jim Gray, Microsoft Research
Gray@Microsoft.com
http://www.Research.Microsoft.com/~Gray/talks

Kilo 10^3 • Mega 10^6 • Giga 10^9 • Tera 10^12 • Peta 10^15 • Exa 10^18
today, we are here (marker on the scale above)

Storage Hierarchy (9 levels)
• Cache 1, 2
• Main (1, 2, 3 if NUMA)
• Disk (1 (cached), 2)
• Tape (1 (mounted), 2)

Meta-Message: Technology Ratios Are Important
• If everything gets faster & cheaper at the same rate THEN nothing really changes.
• Things getting MUCH BETTER:
  – communication speed & cost 1,000x
  – processor speed & cost 100x
  – storage size & cost 100x
• Things staying about the same
  – speed of light (more or less constant)
  – people (10x more expensive)
  – storage speed (only 10x better)

Today’s Storage Hierarchy: Speed & Capacity vs Cost Tradeoffs
[Figure: two log-scale plots against Access Time (seconds, 10^-9 to 10^3). "Size vs Speed" plots Typical System size (bytes); "Price vs Speed" plots $/MB. Both place Cache, Main, Secondary (Disc), Online Tape, Nearline Tape, and Offline Tape.]

Storage Ratios Changed
• 10x better access time
• 10x more bandwidth
• 4,000x lower media price
• DRAM/DISK 100:1 to 10:10 to 50:1

Thesis: Performance = Storage Accesses, not Instructions Executed
• In the “old days” we counted instructions and IO’s
• Now we count memory references
• Processors wait most of the time

The Pico Processor
• 1 M SPECmarks
• 10^6 clocks/fault to bulk RAM
• Event-horizon on chip
• VM reincarnated
• Multi-program cache
• Terror Bytes!

Storage Latency: How Far Away is the Data?
• Registers: 1 (My Head, 1 min)
• On Chip Cache: 2 (This Room)
• On Board Cache: 10 (This Campus, 10 min)
• Memory: 100 (Sacramento, 1.5 hr)
• Disk: 10^6 (Pluto, 2 Years)
• Tape / Optical Robot: 10^9 (Andromeda, 2,000 Years)
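The travel times on this slide follow from stretching the fastest access to one minute. A minimal sketch of that conversion, assuming the slide's numbers are relative access times (the slide states no unit); the names `RELATIVE_LATENCY` and `human_scale` are illustrative, not from the talk:

```python
# Relative access times per storage level, as listed on the slide.
# Treating them as multiples of a register access is an assumption.
RELATIVE_LATENCY = {
    "Registers": 1,
    "On Chip Cache": 2,
    "On Board Cache": 10,
    "Memory": 100,
    "Disk": 10**6,
    "Tape / Optical Robot": 10**9,
}

MIN_PER_HOUR = 60
MIN_PER_YEAR = 60 * 24 * 365


def human_scale(units):
    """Express a relative latency as a duration, pretending one unit is one minute."""
    minutes = units
    if minutes >= MIN_PER_YEAR:
        return f"{minutes / MIN_PER_YEAR:,.1f} years"
    if minutes >= MIN_PER_HOUR:
        return f"{minutes / MIN_PER_HOUR:.1f} hr"
    return f"{minutes} min"


for level, units in RELATIVE_LATENCY.items():
    print(f"{level:22s} {units:>13,}  ~ {human_scale(units)}")
```

The computed durations land close to the slide's rounded figures (about 2 years for disk, about 2,000 years for the tape robot).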

The Five Minute Rule
• Trade DRAM for Disk Accesses
• Cost of an access (DriveCost / Access_per_second)
• Cost of a DRAM page ($/MB / pages_per_MB)
• Break even has two terms:
  – Technology term and an Economic term
• Grew page size to compensate for changing ratios.
• Still at 5 minutes for random, 1 minute sequential
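Setting the cost of re-reading a page every T seconds equal to the cost of keeping it in DRAM gives a break-even interval that factors into the two terms named above. A minimal sketch of that arithmetic; the function name and the input values are illustrative assumptions, not figures from these slides:

```python
def break_even_seconds(pages_per_mb, accesses_per_second, drive_cost, dram_cost_per_mb):
    """Reference interval (seconds) at which holding a page in DRAM costs the same
    as paying for a disk access each time the page is referenced."""
    # Technology term: how many DRAM pages fit in a MB vs. how many accesses a drive delivers.
    technology_term = pages_per_mb / accesses_per_second
    # Economic term: price of a drive relative to the price of a MB of DRAM.
    economic_term = drive_cost / dram_cost_per_mb
    return technology_term * economic_term


# Hypothetical inputs for illustration only:
# 8 KB pages (128 per MB), 64 random accesses/s per drive, a $2,000 drive, $15/MB DRAM.
interval = break_even_seconds(
    pages_per_mb=128,
    accesses_per_second=64,
    drive_cost=2000.0,
    dram_cost_per_mb=15.0,
)
print(f"break-even interval: {interval:.0f} s (~{interval / 60:.1f} min)")
```

Pages referenced more often than the break-even interval are cheaper to keep in DRAM; colder pages are cheaper to leave on disk. Larger pages reduce `pages_per_mb`, which is one way to read the slide's remark about growing page size as the ratios change.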

Shows Best Page Index Page Size ~16 KB

Standard Storage Metrics
• Capacity:
  – RAM: MB and $/MB: today at 10 MB & 100 $/MB
  – Disk: GB and $/GB: today at 10 GB and 200 $/GB
  – Tape: TB and $/TB: today at 0.1 TB and 25 k$/TB (nearline)
• Access time (latency)
  – RAM: 100 ns
  – Disk: 10 ms
  – Tape: 30 second pick, 30 second position
• Transfer rate
  – RAM: 1 GB/s
  – Disk: 5 MB/s (arrays can go to 1 GB/s)
  – Tape: 5 MB/s (striping is problematic)

New Storage Metrics: Kaps, Maps, SCAN?
• Kaps: How many kilobyte objects served per second
  – The file server, transaction processing metric
  – This is the OLD metric.
• Maps: How many megabyte objects served per second
  – The Multi-Media metric
• SCAN: How long to scan all the data
  – The data mining and utility metric
• And
  – Kaps/$, Maps/$, TBscan/$
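These three metrics can be estimated from a device's access time, transfer rate, and capacity. A rough sketch, assuming each request costs one access plus one sequential transfer (a simplification not stated on the slide); the function names are illustrative, and the disk figures are the ones from the Standard Storage Metrics slide:

```python
KB = 1_000
MB = 1_000_000
GB = 1_000_000_000


def kaps(access_time_s, bandwidth_bytes_per_s):
    """Kilobyte objects served per second by a single device."""
    return 1.0 / (access_time_s + KB / bandwidth_bytes_per_s)


def maps(access_time_s, bandwidth_bytes_per_s):
    """Megabyte objects served per second by a single device."""
    return 1.0 / (access_time_s + MB / bandwidth_bytes_per_s)


def scan_seconds(capacity_bytes, bandwidth_bytes_per_s):
    """Time to read the whole device sequentially."""
    return capacity_bytes / bandwidth_bytes_per_s


# Disk from the Standard Storage Metrics slide: 10 ms access, 5 MB/s, 10 GB.
disk_access_s, disk_bw, disk_capacity = 0.010, 5 * MB, 10 * GB
print(f"Kaps: {kaps(disk_access_s, disk_bw):.0f}")
print(f"Maps: {maps(disk_access_s, disk_bw):.1f}")
print(f"SCAN: {scan_seconds(disk_capacity, disk_bw):.0f} s")
```

Under this model Kaps is governed almost entirely by access time and Maps by transfer rate, which is why the two metrics favor different devices.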

For the Record (good 1998 devices packaged in system)
http://www.tpc.org/results/individual_results/Dell/dell.6100.9801.es.pdf

How To Get Lots of Maps, SCANs
• Parallelism: use many little devices in parallel
  – At 10 MB/s: 1.2 days to scan (these figures imply about 1 TB of data)
  – 1,000x parallel: 100 seconds SCAN
  – Parallelism: divide a big problem into many smaller ones to be solved in parallel.
• Beware of the media myth
• Beware of the access time myth
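The scan times on this slide are straightforward division. A small sketch, assuming 1 TB of data (inferred from 10 MB/s and 1.2 days, not stated in the text) and perfectly even partitioning across devices; `scan_time_seconds` is an illustrative name:

```python
ONE_TB = 10**12          # assumed data size, inferred from the slide's figures
DEVICE_BW = 10 * 10**6   # 10 MB/s per device, as on the slide


def scan_time_seconds(data_bytes, device_bandwidth, devices=1):
    """Time to scan data spread evenly over `devices` devices read in parallel."""
    return data_bytes / (device_bandwidth * devices)


serial = scan_time_seconds(ONE_TB, DEVICE_BW)
parallel = scan_time_seconds(ONE_TB, DEVICE_BW, devices=1000)
print(f"1 device:      {serial / 86_400:.1f} days")
print(f"1,000 devices: {parallel:.0f} seconds")
```

Real scans fall short of this ideal when data is skewed across devices or a shared bus or controller saturates first.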

The Disk Farm On a Card
• The 100 GB disc card: an array of discs (14" card)
• Can be used as
  – 100 discs
  – 1 striped disc
  – 10 Fault Tolerant discs
  – ... etc
• LOTS of accesses/second and bandwidth
• Life is cheap, it’s the accessories that cost ya.
• Processors are cheap, it’s the peripherals that cost ya (a 10 k$ disc card).

Tape Farms for Tertiary Storage, Not Mainframe Silos
• One robot: 10 K$, 14 tapes, 500 GB, 5 MB/s, 20 $/GB, 30 Maps; scan in 27 hours
• 100 robots: 1 M$, 50 TB, 50 $/GB, 3 K Maps, 27 hr Scan
• Many independent tape robots (like a disc farm)

The Metrics: Disk and Tape Farms Win
Data Motel: data checks in, but it never checks out
[Figure: log-scale chart comparing GB/K$, Kaps, Maps, and SCANS/Day for a 1000x Disc Farm, an STC Tape Robot (6,000 tapes, 8 readers), and a 100x DLT Tape Farm.]

Tape & Optical: Beware of the Media Myth
• Optical is cheap: 200 $/platter, 2 GB/platter => 100 $/GB (2x cheaper than disc)
• Tape is cheap: 30 $/tape, 20 GB/tape => 1.5 $/GB (100x cheaper than disc)

Tape & Optical Reality: Media is 10% of System Cost
• Tape needs a robot (10 k$ ... 3 M$) holding 10 ... 1000 tapes (at 20 GB each) => 20 $/GB ... 200 $/GB (1x ... 10x cheaper than disc)
• Optical needs a robot (100 k$); 100 platters = 200 GB (TODAY) => 400 $/GB (more expensive than mag disc)
• Robots have poor access times
• Not good for Library of Congress (25 TB)
• Data motel: data checks in but it never checks out!

The Access Time Myth
• The Myth: seek or pick time dominates
• The reality:
  (1) Queuing dominates
  (2) Transfer dominates BLOBs
  (3) Disk seeks often short
• Implication: many cheap servers better than one fast expensive server
  – shorter queues
  – parallel transfer
  – lower cost/access and cost/byte
• This is now obvious for disk arrays
• This will be obvious for tape arrays
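Points (1) and (2) can be illustrated with two back-of-the-envelope calculations. The sketch below uses an M/M/1 queue for the queuing effect, which is a modeling assumption of this example and not something the slides specify; the 10 ms and 5 MB/s disk figures come from the Standard Storage Metrics slide, and the function names are illustrative:

```python
def mm1_response_time(service_time_s, utilization):
    """Mean response time (waiting + service) of an M/M/1 queue."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_s / (1.0 - utilization)


def transfer_fraction(object_bytes, access_time_s, bandwidth_bytes_per_s):
    """Share of a request's device time spent transferring data rather than positioning."""
    transfer_s = object_bytes / bandwidth_bytes_per_s
    return transfer_s / (access_time_s + transfer_s)


# (1) Queuing: a 10 ms access on a device that is 90% busy.
busy = mm1_response_time(0.010, 0.90)
print(f"response time at 90% utilization: {busy * 1000:.0f} ms")

# (2) Transfer: a 1 MB object on a 10 ms, 5 MB/s disk.
share = transfer_fraction(1_000_000, 0.010, 5_000_000)
print(f"transfer share for a 1 MB object: {share:.0%}")
```

In both cases the raw seek or pick time is a small part of what the requester waits for. Spreading load over more devices lowers per-device utilization and lets transfers proceed in parallel, which is the slide's argument for many cheap servers.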