MODULE 3 DATA PROTECTION RAID EMC Proven Professional
- Slides: 45
Module 3: Data Protection – RAID
EMC Proven Professional. Copyright © 2012 EMC Corporation. All Rights Reserved.
Module 3: Data Protection – RAID
Upon completion of this module, you should be able to:
• Describe RAID implementation methods
• Describe three RAID techniques
• Describe commonly used RAID levels
• Describe the impact of RAID on performance
• Compare RAID levels based on their cost, performance, and protection
Lesson 1: RAID Overview
During this lesson the following topics are covered:
• RAID implementation methods
• RAID array components
• RAID techniques
What is RAID?
• RAID (redundant array of independent disks) is a technique that combines multiple disk drives into a logical unit (RAID set) and provides protection, performance, or both.
• RAID storage uses multiple disks to provide fault tolerance, to improve overall performance, and to increase storage capacity in a system (older storage devices used only a single disk drive to store data).
• RAID allows you to store the same data redundantly (in multiple places) in a balanced way to improve overall performance. If one disk fails, the data is preserved.
• RAID disk drives are used frequently on servers but aren't generally necessary for personal computers.
Why RAID?
• Because a disk drive contains mechanical components, it offers limited performance and is subject to wear and tear and other environmental factors, which could result in data loss.
• An individual drive has a certain life expectancy, measured in MTBF (mean time between failures).
  - The more HDDs in a storage array, the higher the probability of a disk failure.
  - For example: if the MTBF of a drive is 750,000 hours and there are 100 drives in the array, then the MTBF of the array becomes 750,000 / 100, or 7,500 hours.
• RAID was introduced to mitigate these problems.
• RAID provides increased capacity, higher availability, and increased performance.
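The MTBF arithmetic on this slide can be checked with a short Python sketch. It assumes, as the slide's example does, that all drives are identical and fail independently; the function name is illustrative only.

```python
def array_mtbf_hours(drive_mtbf_hours: float, drive_count: int) -> float:
    """Approximate MTBF of an array of identical, independent drives."""
    if drive_count < 1:
        raise ValueError("drive_count must be at least 1")
    # With N drives, a failure somewhere in the array is expected N times as often.
    return drive_mtbf_hours / drive_count


print(array_mtbf_hours(750_000, 100))  # 7500.0
```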
RAID Implementation Methods
Software RAID implementation
• Uses host-based software to provide RAID functionality
• Runs as part of the OS
• Limitations:
  - Uses host CPU cycles to perform RAID calculations, so performance depends on CPU workload and overall system performance is affected
  - Supports only a limited set of RAID levels
  - RAID software and OS can be upgraded only if they are compatible
Hardware RAID implementation
• Uses a specialized hardware controller installed either on a host or on an array
• Controller card RAID
  - Installed in the host, with the disk drives connected to it
  - Not efficient in a data center environment with a large number of hosts
• External RAID controller
  - Acts as an interface between the host and the disks
  - Presents storage volumes to the host; the host manages these volumes as physical drives
RAID Controller
[Figure: a host connected to a RAID array that contains the RAID controller, hard disks, the physical array, and logical arrays (RAID sets).]
1. Management and control of disk aggregations
2. Translation of I/O requests between logical disks and physical disks
3. Data regeneration in the event of disk failures
RAID Array Components
[Figure: a host connected to a RAID array that contains the RAID controller, hard disks, the physical array, and logical arrays (RAID sets).]
• A RAID array is an enclosure that contains a number of disk drives and supporting hardware to implement RAID.
• A subset of disks within a RAID array can be grouped to form logical associations called logical arrays, also known as a RAID set or a RAID group.
RAID Techniques
• Three key techniques used for RAID are:
  - Striping
  - Mirroring
  - Parity
• These techniques determine the data availability and performance characteristics of a RAID set.
RAID Technique – Striping
[Figure: the host writes through the RAID controller to disks A1, A2, and A3; a strip lies on one disk and a stripe spans all disks.]
• A technique of spreading data across multiple drives (more than one) in order to use the drives in parallel
• Files are split into small pieces and distributed to multiple hard disks
• All the read-write heads work simultaneously
  - Allows more data to be processed in a shorter time
  - Increases performance
• Does not provide data protection unless parity or mirroring is used
Striping
[Figure: Strip 1, Strip 2, and Strip 3 on individual disks; sets of aligned strips across the disks form Stripe 1 and Stripe 2.]
Strip Size, Stripe Width & Stripe Size
Strip size
• Number of blocks in a strip; this is the maximum amount of data that can be written to or read from a single disk in the set.
• A smaller strip size means data is broken into smaller pieces when spread across the disks.
• E.g.: 128 KB
Stripe width
• Number of data disks in the RAID set.
• E.g.: 4-disk stripe
Stripe size
• Strip size multiplied by the number of data disks in the RAID set.
• E.g.: for a 4-disk stripe RAID set, 4 x 128 KB = 512 KB
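The strip and stripe terms above can be tied together with a short Python sketch. It assumes a simple round-robin layout in which consecutive strips go to consecutive data disks; real controllers may lay data out differently, and the function names are illustrative only.

```python
def stripe_size_kb(strip_size_kb: int, stripe_width: int) -> int:
    """Stripe size = strip size x number of data disks (stripe width)."""
    return strip_size_kb * stripe_width


def locate_block(logical_block: int, stripe_width: int, blocks_per_strip: int):
    """Map a logical block to (disk, stripe, offset) under the assumed
    round-robin striping layout."""
    strip_index = logical_block // blocks_per_strip  # which strip overall
    disk = strip_index % stripe_width                # strips rotate across the disks
    stripe = strip_index // stripe_width             # row of strips across all disks
    offset = logical_block % blocks_per_strip        # position inside the strip
    return disk, stripe, offset


print(stripe_size_kb(128, 4))  # 512
print(locate_block(13, 3, 4))  # (0, 1, 1)
```

Consecutive strips land on different disks, which is what lets the drives work in parallel on a large request.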
RAID Technique – Mirroring
[Figure: the host writes Block 0 through the RAID controller, which stores Block 0 on both disks.]
• A technique whereby the same data is stored on two different disk drives, yielding two copies of the data.
• If one disk drive fails, the data is intact (unharmed) on the surviving disk drive, and the controller continues to service the host's data requests from the surviving disk of the mirrored pair.
Mirroring
• Involves duplication of data: the amount of storage capacity needed is twice the amount of data being stored.
• Advantages:
  - Improves read performance, because read requests can be serviced by both disks
  - Preferred for mission-critical applications that cannot afford the risk of any data loss
• Disadvantages:
  - Decreases write performance, because each write manifests as two writes on the disk drives
  - Expensive
RAID Technique – Parity
• Parity is used in RAID drive arrays for fault tolerance, for example by calculating a value from the data on two drives and storing the result on a third.
  - An additional disk drive is added to hold parity, a mathematical construct that allows re-creation of the missing data.
• A method to protect striped data from disk drive failure without the cost of mirroring.
• Parity is a redundancy technique that ensures protection of data without maintaining a full set of duplicate data.
• Calculation of parity is a function of the RAID controller.
RAID Technique – Parity
[Figure: the host writes through the RAID controller to four data disks holding (3, 1, 2), (1, 1, 3), (2, 2, 1), and (3, 1, 2); the parity disk stores the parity information (9, 5, 8).]
• In this example, the parity information is the sum of the elements in each row (for instance, 3 + 1 + 2 + 3 = 9).
• The actual parity calculation is a bitwise XOR operation.
Data Recovery in Parity Technique
[Figure: the host and RAID controller with data disks D1 = 4, D2 = 6, D3 = ? (failed), D4 = 7, and parity disk P = 18.]
Regeneration of data when drive D3 fails:
4 + 6 + ? + 7 = 18
? = 18 – 4 – 6 – 7
? = 1
• If one of the data disks fails, the missing value can be calculated by subtracting the sum of the rest of the elements from the parity value.
• Here, for simplicity, the computation of parity is represented as an arithmetic sum of the data.
• However, the parity calculation is actually a bitwise XOR operation.
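Since the slide notes that real parity is a bitwise XOR rather than an arithmetic sum, the same recovery can be sketched in Python with XOR. This is a minimal illustration that reuses the slide's values (4, 6, 1, 7) as one-byte strips; the function names are hypothetical and no particular controller's implementation is implied.

```python
from functools import reduce
from operator import xor


def parity(strips):
    """XOR equal-length byte strips together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*strips))


def rebuild(surviving_strips, parity_strip):
    """Regenerate the one missing strip: XOR of the survivors and the parity."""
    return parity([*surviving_strips, parity_strip])


d1, d2, d3, d4 = bytes([4]), bytes([6]), bytes([1]), bytes([7])
p = parity([d1, d2, d3, d4])
print(list(p))                          # [4]
print(list(rebuild([d1, d2, d4], p)))   # [1], the lost D3 value
```

Because XOR is its own inverse, the same operation both computes the parity and regenerates a missing strip; no subtraction step is needed.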
Data Recovery in Parity Technique
• Compared to mirroring, parity implementation considerably reduces the cost associated with data protection.
• Consider an example of a parity RAID configuration with five disks where four disks hold data, and the fifth holds the parity information.
  - In this example, parity requires only 25 percent extra disk space compared to mirroring, which requires 100 percent extra disk space.
• Disadvantages of using parity:
  - Parity information is generated from data on the data disks.
  - Parity is recalculated every time there is a change in data.
  - This recalculation is time-consuming and affects the performance of the RAID array.
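The 25 percent and 100 percent figures follow from dividing the redundancy disks by the data disks. A small Python sketch of that arithmetic, with an illustrative function name:

```python
def extra_space_percent(data_disks: int, redundancy_disks: int) -> float:
    """Extra disk space needed for protection, relative to the data disks."""
    return 100 * redundancy_disks / data_disks


print(extra_space_percent(4, 1))  # 25.0  (four data disks, one parity disk)
print(extra_space_percent(4, 4))  # 100.0 (four data disks, four mirror disks)
```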
Lesson 2: RAID Levels
During this lesson the following topics are covered:
• Commonly used RAID levels
• RAID impacts on performance
• RAID comparison
• Hot spare
RAID Levels
• RAID levels are defined on the basis of striping, mirroring, and parity techniques.
• Some RAID levels use a single technique, whereas others use a combination of techniques.
• Commonly used RAID levels are:
  - RAID 0 – Striped set with no fault tolerance
  - RAID 1 – Disk mirroring
  - RAID 1+0 – Nested RAID
  - RAID 3 – Striped set with parallel access and dedicated parity disk
  - RAID 5 – Striped set with independent disk access and distributed parity
  - RAID 6 – Striped set with independent disk access and dual distributed parity
RAID 0 – Striped Array With No Fault Tolerance
• Uses data striping techniques
• Data is distributed across the HDDs in the RAID set.
• Allows multiple pieces of data to be read or written simultaneously, and therefore improves performance.
• Does not provide data protection and availability
[Figure: data from the host (A, B, C) passes through the RAID controller and is striped across five data disks as A1 to A5, B1 to B5, and C1 to C5.]
RAID 1 - Disk mirroring
• Based on the mirroring technique
• Data is mirrored to provide fault tolerance
• Every write is written to both disks
• Suitable for applications that require high availability and where cost is no constraint
• Mirroring is NOT the same as doing backup!
[Figure: data from the host (A to F) passes through the RAID controller and is written to mirror sets, each of which keeps two identical copies of its data.]
Nested RAID
• Combines the performance benefits of RAID 0 with the redundancy benefits of RAID 1
• Requires an even number of disks, the minimum being four
• RAID 1+0 – Striped Mirror – RAID 10 (Ten) or RAID 1/0
  - Data is first mirrored, and then both copies are striped across multiple HDDs.
  - When a drive fails, data is still accessible from its mirror.
  - A rebuild operation only requires data to be copied from the surviving disk onto the replacement disk.
• RAID 0+1 – Mirrored Stripe – RAID 01 or RAID 0/1
  - Data is striped across HDDs, then the entire stripe is mirrored.
  - If one drive fails, the entire stripe is faulted.
  - A rebuild operation requires data to be copied from each disk in the healthy stripe, causing increased load on the surviving disks.
Nested RAID – 1+0 (Striped Mirror)
Data is first mirrored, and then both copies are striped across multiple HDDs.
[Figure: data from the host (A, B, C) passes through the RAID controller and is striped across Mirror Set A (A1, B1, C1), Mirror Set B (A2, B2, C2), and Mirror Set C (A3, B3, C3); each mirror set holds two identical copies.]
Nested RAID – 1+0 (Striped Mirror)
• 6 disks forming a RAID 1+0 (RAID 1 first and then RAID 0) set.
  - 3 sets of two disks
  - Each set acts as a RAID 1 (mirrored pair of disks)
• Data is striped across all 3 mirrored sets to form RAID 0.
  - Drives 1 + 2 = RAID 1 (Mirror Set A)
  - Drives 3 + 4 = RAID 1 (Mirror Set B)
  - Drives 5 + 6 = RAID 1 (Mirror Set C)
• If drive 5 fails, then mirror set C alone is affected. Drive 6 continues to function, and the entire RAID 1+0 array also keeps functioning.
• In this configuration, up to 3 drives can fail without affecting the array, provided that each failed drive belongs to a different mirror set.
Nested RAID – 0+1 (Mirrored Stripe)
Data is striped across HDDs, then the entire stripe is mirrored.
[Figure: data from the host (A, B, C) passes through the RAID controller and is striped as (A1, B1, C1), (A2, B2, C2), and (A3, B3, C3) across the disks of a stripe set; the whole stripe set is then mirrored. The figure labels the sets Stripe Set A, Stripe Set B, and Stripe Set C.]
Nested RAID – 0+1 (Mirrored Stripe)
• 6 disks forming a RAID 0+1 (RAID 0 first and then RAID 1) set.
  - 2 sets of three disks
  - Each set acts as a RAID 0
• Each set contains 3 disks, and the two sets are mirrored to form RAID 1.
  - Drives 1 + 2 + 3 = RAID 0 (Stripe Set A)
  - Drives 4 + 5 + 6 = RAID 0 (Stripe Set B)
• If one of the drives, say drive 3, fails, the entire stripe set A fails.
• A rebuild operation copies the entire stripe, copying the data from each disk in the healthy stripe.
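The difference in fault tolerance between the two six-drive layouts can be explored with a small Python sketch. It uses the drive numbering from the slides and models only which combinations of failed drives leave the data accessible; the function names are illustrative, and rebuild behavior is not modeled.

```python
from itertools import combinations

MIRROR_SETS = [(1, 2), (3, 4), (5, 6)]   # RAID 1+0: three mirrored pairs
STRIPE_SETS = [(1, 2, 3), (4, 5, 6)]     # RAID 0+1: two three-disk stripes


def raid10_survives(failed: set) -> bool:
    """RAID 1+0 survives while every mirror set keeps at least one drive."""
    return all(not set(pair) <= failed for pair in MIRROR_SETS)


def raid01_survives(failed: set) -> bool:
    """RAID 0+1 survives while at least one stripe set has no failed drive."""
    return any(not (set(stripe) & failed) for stripe in STRIPE_SETS)


def survivable(check, failures: int) -> int:
    """Count the failure combinations of a given size that the array survives."""
    return sum(check(set(combo)) for combo in combinations(range(1, 7), failures))


print(survivable(raid10_survives, 2))  # 12 of the 15 two-drive combinations
print(survivable(raid01_survives, 2))  # 6 of the 15 two-drive combinations
```

Under this model, RAID 1+0 tolerates more multi-drive failure combinations than RAID 0+1 built from the same six drives.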
RAID 3 - Striped set with parallel access and dedicated parity disk
• RAID 3 stripes data for performance and uses parity for fault tolerance.
• Parity information is stored on a dedicated drive so that the data can be reconstructed if a drive fails in a RAID set.
[Figure: data from the host (A, B, C) passes through the RAID controller and is striped across four data disks as A1 to A4, B1 to B4, and C1 to C4; a dedicated parity disk holds AP, BP, and CP.]
RAID 5 - Striped set with independent disk access and a distributed parity
• Parity is distributed across all disks to overcome the write bottleneck of a dedicated parity disk.
[Figure: data from the host (A, B, C) passes through the RAID controller to five disks holding (AP, B1, C1), (A1, BP, C2), (A2, B2, CP), (A3, B3, C3), and (A4, B4, C4), so the parity strips are distributed across the disks.]
RAID 6 - Striped set with independent disk access and dual distributed parity
• RAID 6 works the same way as RAID 5, except that RAID 6 includes a second parity element to enable survival if two disk failures occur in a RAID set.
[Figure: data from the host (A, B, C) passes through the RAID controller to five disks holding (A1, BP, C1), (AQ, B1, CP), (A2, B2, CQ), (A3, BQ, C2), and (AP, B3, C3), so both parity elements (P and Q) are distributed across the disks.]
RAID Impacts on Performance
• In both mirrored and parity RAID configurations, every write operation translates into more I/O overhead for the disks, which is referred to as a write penalty.
• E.g.: In a RAID 5 implementation, a write operation may manifest as four I/O operations.
  - When performing I/Os to a disk configured with RAID 5, the controller has to read, recalculate, and write a parity segment for every data write operation.
  - The parity (P) at the controller is calculated as follows:
    Ep = E1 + E2 + E3 + E4 (XOR operations)
RAID Impacts on Performance
• Whenever the controller performs a write I/O, parity must be computed by reading the old parity (Ep old) and the old data (E4 old) from the disk, which means two read I/Os.
• Then, the new parity (Ep new) is computed as follows:
    Ep new = Ep old – E4 old + E4 new (XOR operations)
RAID Impacts on Performance
• After computing the new parity, the controller completes the write I/O by writing the new data and the new parity onto the disks, amounting to two write I/Os.
• Therefore, the controller performs two disk reads and two disk writes for every write operation, and the write penalty is 4.
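The read-modify-write sequence described on these slides can be sketched in Python as follows. This is a toy model of a single stripe that only counts disk I/Os; the class and attribute names are hypothetical, and each element is represented as a small integer rather than a real disk block. Because the operations are XOR, both the "–" and the "+" in the slide's formula become `^`.

```python
class Raid5Stripe:
    """Toy model of one RAID 5 stripe that counts disk reads and writes."""

    def __init__(self, elements):
        self.elements = list(elements)
        self.parity = 0
        for value in self.elements:
            self.parity ^= value          # Ep = E1 xor E2 xor ... xor En
        self.reads = 0
        self.writes = 0

    def write(self, index, new_value):
        old_data = self.elements[index]   # read old data
        self.reads += 1
        old_parity = self.parity          # read old parity
        self.reads += 1
        # Ep new = Ep old xor E old xor E new
        new_parity = old_parity ^ old_data ^ new_value
        self.elements[index] = new_value  # write new data
        self.writes += 1
        self.parity = new_parity          # write new parity
        self.writes += 1


stripe = Raid5Stripe([4, 6, 1, 7])
stripe.write(3, 9)
print(stripe.reads + stripe.writes)  # 4 disk I/Os for one host write
```

The updated parity matches what a full recomputation over the new data would give, without reading the other data elements.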
RAID Penalty
• A read operation does not incur any penalty.
  - Any read requires only a single operation (i.e., a penalty of 1).
• For a write operation, the penalty depends on the RAID configuration.

RAID Level     Write Penalty
0              1
1              2
0+1 / 1+0      2
3              3
5              4
6              6
RAID Penalty Calculation Example 1
• Consider an application that generates 1200 IOPS at peak workload, with a read/write ratio of 2:1. Calculate the disk load at peak activity for RAID 1/0 and RAID 5 configurations.
SUMMARY:
• Total IOPS at peak workload: 1200
• Read/write ratio: 2:1
• Calculate the disk load at peak activity for RAID 1/0 and RAID 5
Solution: Penalty Calculation Example 1
• For RAID 1/0, the disk load (read + write)
  = (1200 x 2/3) + (1200 x (1/3) x 2) [because the write penalty for RAID 1/0 is 2]
  = 800 + 800
  = 1600 IOPS
• For RAID 5, the disk load (read + write)
  = (1200 x 2/3) + (1200 x (1/3) x 4) [because the write penalty for RAID 5 is 4]
  = 800 + 1600
  = 2400 IOPS
Solution: No. of Disks for RAID Configuration Example 1
• The computed disk load determines the number of disks required for the application.
• If, in this example, a disk drive rated at a maximum of 140 IOPS is used, the number of disks required to meet the workload for each RAID configuration is:
  - RAID 1/0: 1600/140 = 11.4, so 12 disks (rounded up to the next even number)
  - RAID 5: 2400/140 = 17.1, so 18 disks
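The two steps of this calculation (disk load, then disk count) can be expressed as a short Python sketch. The function names are illustrative; the write penalties are the ones the slides give for RAID 1/0 and RAID 5, and the option to round up to an even count reflects the slide's treatment of mirrored configurations.

```python
import math


def disk_load_iops(total_iops, read_parts, write_parts, write_penalty):
    """Disk load = read IOPS + (write IOPS x write penalty)."""
    parts = read_parts + write_parts
    read_iops = total_iops * read_parts / parts
    write_iops = total_iops * write_parts / parts
    return read_iops + write_iops * write_penalty


def disks_required(load_iops, iops_per_disk, even=False):
    """Round up to whole disks; mirrored layouts need an even count."""
    count = math.ceil(load_iops / iops_per_disk)
    if even and count % 2:
        count += 1
    return count


raid10_load = disk_load_iops(1200, 2, 1, write_penalty=2)
raid5_load = disk_load_iops(1200, 2, 1, write_penalty=4)
print(raid10_load, disks_required(raid10_load, 140, even=True))  # 1600.0 12
print(raid5_load, disks_required(raid5_load, 140))               # 2400.0 18
```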
RAID Penalty Calculation Example 2
• Consider an application that generates 5200 IOPS at peak workload, with 60% of them being reads. Calculate the disk load at peak activity for RAID 1 and RAID 5 configurations.
SUMMARY:
• Total IOPS at peak workload: 5200
• Read/write ratio: 60%:40%
• Calculate the disk load at peak activity for RAID 1 and RAID 5
Solution: Penalty Calculation Example 2
• For RAID 1, the disk load (read + write)
  = (5200 x 0.6) + (5200 x 0.4 x 2) [because the write penalty for RAID 1 is 2]
  = 3120 + 4160
  = 7280 IOPS
• For RAID 5, the disk load (read + write)
  = (5200 x 0.6) + (5200 x 0.4 x 4) [because the write penalty for RAID 5 is 4]
  = 3120 + 8320
  = 11440 IOPS
Solution: No. of Disks for RAID Configuration Example 2
• The computed disk load determines the number of disks required for the application.
• If, in this example, a disk drive rated at a maximum of 180 IOPS is used, the number of disks required to meet the workload for each RAID configuration is:
  - RAID 1: 7,280/180 = 40.44, so 42 disks (rounded up to the next even number)
  - RAID 5: 11,440/180 = 63.56, so 64 disks
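RAID Comparison
(n = number of disks)
RAID 0
• Min disks: 2
• Available storage capacity (%): 100
• Read performance: Very good for both random and sequential reads
• Write performance: Very good
• Cost: Low
• Protection: Null
RAID 1
• Min disks: 2
• Available storage capacity (%): 50
• Read performance: Better than single disk
• Write performance: Slower than single disk, because every write must be committed to all disks
• Write penalty: Moderate
• Cost: High
• Protection: Mirror
RAID 1+0
• Min disks: 4
• Available storage capacity (%): 50
• Read performance: Good
• Write penalty: Moderate
• Cost: High
• Protection: Mirror
RAID 3
• Min disks: 3
• Available storage capacity (%): [(n-1)/n]*100
• Read performance: Fair for random reads and good for sequential reads
• Write performance: Poor to fair for small random writes; fair for large, sequential writes
• Write penalty: High
• Cost: Moderate
• Protection: Parity (supports single disk failure)
RAID 5
• Min disks: 3
• Available storage capacity (%): [(n-1)/n]*100
• Read performance: Good for random and sequential reads
• Write performance: Fair for random and sequential writes
• Write penalty: High
• Cost: Moderate
• Protection: Parity (supports single disk failure)
RAID 6
• Min disks: 4
• Available storage capacity (%): [(n-2)/n]*100
• Read performance: Good for random and sequential reads
• Write performance: Poor to fair for random and sequential writes
• Write penalty: Very high
• Cost: Moderate, but more than RAID 5
• Protection: Parity (supports two disk failures)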
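The capacity column of this comparison can be evaluated for a concrete number of disks with a short Python sketch. The minimum disk counts and formulas are taken from the comparison above; the function name and the string keys used for the RAID levels are illustrative.

```python
MIN_DISKS = {"0": 2, "1": 2, "1+0": 4, "3": 3, "5": 3, "6": 4}


def usable_capacity_percent(level: str, n: int) -> float:
    """Available storage capacity (%) for a RAID level built from n disks."""
    if level not in MIN_DISKS:
        raise ValueError(f"unknown RAID level: {level}")
    if n < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == "0":
        return 100.0
    if level in ("1", "1+0"):
        return 50.0
    if level in ("3", "5"):
        return 100 * (n - 1) / n   # one disk's worth of parity
    return 100 * (n - 2) / n       # RAID 6: two disks' worth of parity


print(usable_capacity_percent("5", 5))  # 80.0
print(usable_capacity_percent("6", 8))  # 75.0
```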
Suitable RAID Levels for Different Applications
• RAID 1+0
  - Suitable for applications with a small, random, and write-intensive (writes typically greater than 30%) I/O profile
  - Example: online transaction processing (OLTP), RDBMS – temp space
• RAID 3
  - Large, sequential reads and writes
  - Example: data backup and multimedia streaming
• RAID 5 and 6
  - Small, random workload (writes typically less than 30%)
  - Example: email, RDBMS – data entry
Hot Spare
• Refers to a spare drive in a RAID array that temporarily replaces a failed disk drive by taking the identity of the failed disk drive
• When a new disk drive is added to the system, data from the hot spare is copied to it.
• The hot spare then returns to its idle state, ready to replace the next failed drive.
[Figure: a RAID controller with a failed disk and a hot spare; the failed disk is replaced.]
Module 3: Summary
Key points covered in this module:
• RAID implementation methods and techniques
• Common RAID levels
• RAID write penalty
• Comparison of RAID levels based on their cost and performance