High Performance Computing Course Notes 2007-2008
High Performance Storage

Storage devices
- Primary storage:
  - Register (1 CPU cycle, a few ns)
  - Cache (10-200 cycles, 0.02-0.5 µs)
  - Main memory
    - Local main memory (0.2-4 µs)
    - NUMA (2-10x local memory latency)
- Secondary storage:
  - Magnetic disk (2-20 ms)
  - Solid state disk (0.05-0.5 ms)
  - Cache in storage controller (0.05-0.5 ms)
- Tertiary storage:
  - Removable media: tapes, floppies, CDs (ms to minutes)
  - Tape library (a few seconds to a few minutes)
Computer Science, University of Warwick

Hard disk vs. solid state drive
[Figure: a) 2.5-inch hard disk; b) solid state drive]

[Figure: Tape library]

[Figure: Disks]

Disk failure and metrics
- Mean time between failures (MTBF): the average time a disk operates between failures
  $\mathrm{MTBF} = \sum (\text{start of downtime} - \text{start of uptime}) / \text{number of failures}$
- Annual failure rate (AFR): the expected number of failures per year
  $\mathrm{AFR} = \text{running hours per year} / \mathrm{MTBF}$
  $\mathrm{AFR}_{\mathrm{disks}} = N_{\mathrm{disks}} \times \mathrm{AFR}_{\mathrm{disk}}$
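
The formulas above can be tried out with a few lines of Python. This is a minimal sketch, not part of the original notes: the function names and the 1,000,000-hour MTBF are hypothetical, and the disk is assumed to run 24 hours a day.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 running hours, assuming the disk is always on


def mtbf_hours(total_operating_hours: float, failures: int) -> float:
    """Mean time between failures: observed operating time per failure."""
    if failures <= 0:
        raise ValueError("need at least one observed failure")
    return total_operating_hours / failures


def afr(mtbf: float, running_hours_per_year: float = HOURS_PER_YEAR) -> float:
    """Annual failure rate: expected failures per disk per year."""
    return running_hours_per_year / mtbf


def afr_array(n_disks: int, afr_disk: float) -> float:
    """Expected disk failures per year across n_disks disks."""
    return n_disks * afr_disk


# Hypothetical example: a quoted MTBF of 1,000,000 hours.
single = afr(1_000_000)
array = afr_array(100, single)
print(f"AFR per disk: {single:.4%}; expected failures per year in 100 disks: {array:.3f}")
```

With these assumed numbers a single disk has an AFR below 1%, yet an array of 100 such disks should expect close to one disk failure every year.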

Solutions for disk failures
- Redundancy
  - Replication (mirroring)
- Partial redundancy
  - Parity information

RAID
- RAID: Redundant Arrays of Inexpensive Disks
  - Goals: increased data reliability and increased I/O performance
- Main concepts in RAID
  - Mirroring
  - Striping
  - Parity
- Advantages:
  - High capacity
  - High performance: data is striped across disks
  - Graceful degradation: when one disk fails, only that disk needs to be replaced

RAID
- Disadvantage: more disks mean more failures
  $\mathrm{AFR}_{\mathrm{disks}} = N_{\mathrm{disks}} \times \mathrm{AFR}_{\mathrm{disk}}$
- Solution: redundancy
  1) Replication/mirroring: needs more space
  2) Parity: recovers from a single disk failure, but needs extra operations to maintain the parity information and to recover data
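
To see why redundancy matters as the array grows, the sketch below compares the expected failure count with the probability that at least one disk fails in a year. It makes assumptions the notes do not state: disks fail independently, the per-disk AFR is treated as a yearly failure probability, and repair is ignored for the mirrored pair. The 3% figure and the function names are hypothetical.

```python
def p_any_failure(n_disks: int, afr_disk: float) -> float:
    """Probability that at least one of n independent disks fails in a year."""
    return 1.0 - (1.0 - afr_disk) ** n_disks


def p_mirror_loss(afr_disk: float) -> float:
    """Probability that both disks of a mirrored pair fail in the same year,
    ignoring repair (a pessimistic simplification)."""
    return afr_disk ** 2


afr_disk = 0.03  # hypothetical 3% annual failure rate per disk
for n in (1, 8, 64):
    expected = n * afr_disk                 # expected failures per year
    at_least_one = p_any_failure(n, afr_disk)
    print(f"{n:3d} disks: expected failures {expected:.2f}, "
          f"P(at least one failure) {at_least_one:.3f}")
print(f"mirrored pair, P(both fail): {p_mirror_loss(afr_disk):.4f}")
```

Without redundancy, any of these failures destroys data, which is why the probability for a large array approaches 1 while a mirrored pair stays far below the single-disk rate.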

Parity
- Parity is calculated using XOR.
  - XOR of two operands is true if and only if exactly one of them is true
  - Property of XOR: if $D_p = D_1 \oplus \cdots \oplus D_k \oplus \cdots \oplus D_n$, then $D_k = D_p \oplus D_1 \oplus \cdots \oplus D_{k-1} \oplus D_{k+1} \oplus \cdots \oplus D_n$ (where $\oplus$ denotes XOR)
- Therefore, if any one data block is lost, it can be recovered from the parity and the remaining data
- Advantage: only one of the N+1 drives holds redundancy information
- Disadvantage: the parity has to be recomputed every time the data is updated
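
The recovery property can be checked directly. The following is a small illustrative sketch; the block contents and the helper name `xor_blocks` are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"disk", b"fail", b"safe"]   # D1..D3, hypothetical 4-byte blocks
parity = xor_blocks(data)            # Dp = D1 XOR D2 XOR D3

# Simulate losing D2, then rebuild it from the parity and the surviving blocks.
survivors = [data[0], data[2]]
rebuilt = xor_blocks([parity] + survivors)
print(rebuilt == data[1])
```

The same function serves both to compute the parity and to rebuild a lost block, because XOR is its own inverse.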

Disk arrays taxonomy
- RAID levels
  - 0: striping without redundancy
  - 1: full-copy mirroring
  - 2: Hamming code
  - 3: byte-level striping with a separate parity disk
  - 4: block-level striping with a separate parity disk (a small file can sit on a single disk)
  - 5: rotated, distributed parity
  - 6: double parity
- The levels are classifications rather than an ordered list

RAID levels
- RAID 0
  - Striped without redundancy
  - Data can be read off in parallel
  - Any disk failure destroys the entire array
- RAID 1
  - Mirrored
  - The array continues to operate as long as at least one drive is functioning
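
As a rough illustration of RAID 0 striping, the sketch below maps a logical block number to a disk and an offset on that disk. It assumes a stripe unit of one block placed round-robin across the disks; the function name is hypothetical and real controllers use configurable stripe sizes.

```python
def raid0_locate(logical_block: int, n_disks: int) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk)."""
    return logical_block % n_disks, logical_block // n_disks


for lb in range(8):
    disk, offset = raid0_locate(lb, 4)
    print(f"logical block {lb} -> disk {disk}, offset {offset}")
```

Consecutive logical blocks land on different disks, which is what lets a large sequential read proceed on all disks in parallel.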

- RAID 3
  - Striped set with dedicated parity
  - The single parity disk is a bottleneck for writes
  - Byte-level striping (stripe unit typically under 1 KB)
- RAID 4
  - Identical to RAID 3 except that it uses block-level instead of byte-level striping
  - The block can be of any size
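
The write bottleneck follows from how parity is kept up to date: every small write must also update the parity block, and with a dedicated parity disk all of those updates go to the same drive. A minimal sketch of the usual read-modify-write update is given below; the helper names and byte values are hypothetical.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity_after_small_write(old_data: bytes, new_data: bytes,
                             old_parity: bytes) -> bytes:
    """New parity after overwriting one data block: P' = P XOR D_old XOR D_new."""
    return xor(xor(old_parity, old_data), new_data)


stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]      # three data blocks
parity = xor(xor(stripe[0], stripe[1]), stripe[2])    # held on the parity disk

new_block = b"\xaa\x55"
new_parity = parity_after_small_write(stripe[1], new_block, parity)
stripe[1] = new_block

# The incremental update matches a full recomputation over the stripe.
print(new_parity == xor(xor(stripe[0], stripe[1]), stripe[2]))
```

Only the changed data block and the parity block need to be read and rewritten, but the parity disk takes part in every such write.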

- RAID 5
  - Striped set with distributed parity
  - The array is not destroyed by a single drive failure
  - After a drive failure, reads of the lost data are reconstructed from the surviving data and the distributed parity
  - A second drive failure causes data loss
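
One way to picture distributed parity is to print which disk holds the parity block in each stripe. The rotation used below is only one possible scheme, chosen for illustration; actual RAID 5 implementations offer several layouts, and the function names are hypothetical.

```python
def raid5_parity_disk(stripe: int, n_disks: int) -> int:
    """Disk holding the parity block of a stripe (rotates from last disk to first)."""
    return (n_disks - 1 - stripe) % n_disks


def raid5_layout(n_stripes: int, n_disks: int) -> list[list[str]]:
    """Return one row per stripe, marking data blocks D0, D1, ... and parity P."""
    rows = []
    for stripe in range(n_stripes):
        parity_disk = raid5_parity_disk(stripe, n_disks)
        row, placed = [], 0
        for disk in range(n_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{stripe * (n_disks - 1) + placed}")
                placed += 1
        rows.append(row)
    return rows


for row in raid5_layout(4, 4):
    print("  ".join(f"{cell:>3}" for cell in row))
```

Because the parity block moves to a different disk in each stripe, parity updates are spread over all drives instead of queuing on one dedicated parity disk.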

- RAID 6
  - Striped set with dual parity
  - Tolerates up to two drive failures
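
The trade-off between capacity and fault tolerance across the levels described above can be summarised in a short function. This is a sketch under simplifying assumptions: all disks have the same size, RAID 1 mirrors one disk onto all the others, and the minimum disk counts are the commonly quoted ones. The function name and the example sizes are hypothetical.

```python
MIN_DISKS = {0: 2, 1: 2, 3: 3, 4: 3, 5: 3, 6: 4}


def raid_summary(level: int, n_disks: int, disk_tb: float) -> tuple[float, int]:
    """Return (usable capacity in TB, number of disk failures always survived)."""
    if level not in MIN_DISKS:
        raise ValueError(f"unsupported RAID level: {level}")
    if n_disks < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == 0:
        return n_disks * disk_tb, 0          # no redundancy
    if level == 1:
        return disk_tb, n_disks - 1          # every disk is a full copy
    if level in (3, 4, 5):
        return (n_disks - 1) * disk_tb, 1    # one disk's worth of parity
    return (n_disks - 2) * disk_tb, 2        # RAID 6: two disks' worth of parity


for level in (0, 1, 5, 6):
    capacity, tolerated = raid_summary(level, 4, 2.0)
    print(f"RAID {level}: {capacity:.1f} TB usable, survives {tolerated} failure(s)")
```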

Network Attached Storage (NAS)
- Follows a client/server design
- A NAS head acts as the interface between the NAS and network clients
- The NAS appears on the network as a single "node", identified by the IP address of the head device
- Clients access a NAS over an Ethernet connection
- NAS devices require no monitor, keyboard or mouse and run an embedded OS
- NAS uses file-based application protocols such as NFS (Network File System) and CIFS (Common Internet File System)

Storage Area Networks (SANs)
- An architecture for attaching remote storage devices to servers so that the devices appear to the OS as locally attached
- Data is accessed in blocks
- Typically uses the Fibre Channel protocol to access data
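
The difference between file-based access (NAS) and block-based access (SAN) can be loosely illustrated as follows. This is only an analogy: an ordinary temporary file stands in for both the NAS share and the SAN volume, no NFS, CIFS or Fibre Channel traffic is involved, and the block size and function names are hypothetical.

```python
import os
import tempfile

BLOCK_SIZE = 512  # hypothetical block size in bytes


def read_file(path: str) -> bytes:
    """File-level access (NAS style): the client names a file."""
    with open(path, "rb") as f:
        return f.read()


def read_block(device_path: str, block_number: int) -> bytes:
    """Block-level access (SAN style): the client names a block number."""
    with open(device_path, "rb") as dev:
        dev.seek(block_number * BLOCK_SIZE)
        return dev.read(BLOCK_SIZE)


# Create a two-block stand-in "volume": block 0 is all A, block 1 is all B.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE)
    path = tmp.name

whole = read_file(path)
second = read_block(path, 1)
os.remove(path)

print(len(whole), second[:4])
```

With a NAS the file system lives on the storage side and the client asks for files by name; with a SAN the server sees raw blocks and runs its own file system on top of them.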

[Figure: NAS vs. SAN]