Understanding Reduced-Voltage Operation in Modern DRAM Devices: Experimental Characterization, Analysis, and Mechanisms

Kevin Chang†, A. Giray Yaglikci†, Saugata Ghose†, Aditya Agrawal*, Niladrish Chatterjee*, Abhijith Kashyap†, Donghyuk Lee*, Mike O’Connor*, Hasan Hassan‡, Onur Mutlu†‡

Executive Summary
• DRAM (memory) power is significant in today’s systems
  – Existing low-voltage DRAM reduces voltage conservatively
• Goal: Understand and exploit the reliability and latency behavior of real DRAM chips under aggressive reduced-voltage operation
• Key experimental observations:
  – Errors occur and increase with lower voltage
  – Errors exhibit spatial locality
  – Higher operation latency mitigates voltage-induced errors
• Voltron: A new DRAM energy reduction mechanism
  – Reduce DRAM voltage without introducing errors
  – Use a regression model to select voltage that does not degrade performance beyond a chosen target
  – 7.3% system energy reduction

Outline
• Executive Summary
• Motivation
• DRAM Background
• Characterization of DRAM
• Voltron: DRAM Energy Reduction Mechanism
• Conclusion

High DRAM Power Consumption
• Problem: High DRAM (memory) power in today’s systems
  – >40% in POWER7 (Ware+, HPCA’10)
  – >40% in GPU (Paul+, ISCA’15)

Low-Voltage Memory
• Can we reduce DRAM power and energy by further reducing supply voltage?

Goals
1. Understand and characterize the various characteristics of DRAM under reduced voltage
2. Develop a mechanism that reduces DRAM energy by lowering voltage while keeping performance loss within a target

Key Questions
• How does reducing voltage affect reliability (errors)?
• How does reducing voltage affect DRAM latency?
• How do we design a new DRAM energy reduction mechanism?

High-Level DRAM Organization
[Figure labels: DRAM chips, DRAM module, DRAM channel]

DRAM Chip Internals
[Figure labels: DRAM array (bank, DRAM cell, wordline, bitline, sense amplifiers (row buffer)); peripheral circuitry (control logic, I/O); off-chip channel]

DRAM Operations
1. ACTIVATE: Store the row into the row buffer
2. READ: Select the target cache line and drive to CPU
3. PRECHARGE: Prepare the array for a new ACTIVATE
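The three commands above can be illustrated with a toy model of a single bank. This is a minimal sketch for intuition only, not how any real chip or memory controller is implemented; the class name `Bank`, its method names, and the row contents are assumptions made for illustration.

```python
class Bank:
    """Toy model of one DRAM bank with a row buffer (illustrative only)."""

    CACHE_LINE_BYTES = 64  # cache line size used in this sketch

    def __init__(self, rows):
        self.rows = rows        # list of bytes objects, one per DRAM row
        self.row_buffer = None  # contents of the currently open row
        self.open_row = None    # index of the open row, or None if precharged

    def activate(self, row):
        # ACTIVATE: copy the whole row into the row buffer.
        if self.open_row is not None:
            raise RuntimeError("bank must be precharged before a new ACTIVATE")
        self.row_buffer = self.rows[row]
        self.open_row = row

    def read(self, column):
        # READ: select one cache line from the open row.
        if self.open_row is None:
            raise RuntimeError("no open row; issue ACTIVATE first")
        start = column * self.CACHE_LINE_BYTES
        return self.row_buffer[start:start + self.CACHE_LINE_BYTES]

    def precharge(self):
        # PRECHARGE: close the row so the array is ready for a new ACTIVATE.
        self.row_buffer = None
        self.open_row = None


# Four hypothetical rows of 128 bytes each; row r is filled with the byte value r.
bank = Bank(rows=[bytes([r]) * 128 for r in range(4)])
bank.activate(2)
line = bank.read(1)
bank.precharge()
```

The model enforces the ordering the slide describes: a READ needs an open row, and a second ACTIVATE needs a PRECHARGE first.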

DRAM Access Latency
[Figure: command and data timeline showing ACTIVATE, READ returning a cache line (64 B), PRECHARGE, and the next ACT]
1. Activation latency
2. Precharge latency (13 ns / 50 cycles)

Outline
• Executive Summary
• Motivation
• DRAM Background
• Characterization of DRAM
  – Experimental methodology
  – Impact of voltage on reliability and latency
• Voltron: DRAM Energy Reduction Mechanism
• Conclusion

Supply Voltage Control on DRAM Module
• Adjust the supply voltage to every chip on the same module

Custom Testing Platform
SoftMC [Hassan+, HPCA’17]: FPGA testing platform to
1) Adjust supply voltage to DRAM modules
2) Schedule DRAM commands to DRAM modules
Existing systems: DRAM commands not exposed to users
[Figure labels: DRAM module, voltage controller, FPGA]
https://github.com/CMU-SAFARI/DRAM-Voltage-Study

Tested DRAM Modules
• 124 DDR3L (low-voltage) DRAM chips
  – 31 SO-DIMMs
  – 1.35 V (DDR3 uses 1.5 V)
  – Density: 4 Gb per chip
  – Three major vendors/manufacturers
  – Manufacturing dates: 2014-2016
• Iteratively read every bit in each 4 Gb chip under a wide range of supply voltage levels: 1.35 V to 1.0 V (-26%)

Reliability Worsens with Lower Voltage
[Figure annotations: nominal voltage; min. voltage (Vmin) without errors; errors induced by reduced-voltage operation]
• Reducing voltage below Vmin causes an increasing number of errors
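One way to read the definition of Vmin is as a scan over per-voltage error counts, starting at the nominal voltage and moving downward. The sketch below assumes that reading; the function name `find_vmin` and the measurement values are hypothetical and are not data from the study.

```python
def find_vmin(error_counts):
    """Return the lowest tested voltage that is error-free along with all higher ones.

    error_counts maps a supply voltage (V) to the number of bit errors observed.
    Returns None if even the highest tested voltage shows errors.
    """
    vmin = None
    for voltage in sorted(error_counts, reverse=True):  # walk from nominal downward
        if error_counts[voltage] > 0:
            break  # first voltage with errors: stop lowering
        vmin = voltage
    return vmin


# Hypothetical measurements for one module (illustrative values only).
measured = {1.35: 0, 1.30: 0, 1.25: 0, 1.20: 3, 1.15: 250, 1.10: 90_000}
vmin = find_vmin(measured)
```

With these made-up counts the scan stops at 1.20 V, so Vmin is 1.25 V, and the counts below it grow as voltage drops, which is the trend the slide describes.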

Source of Errors
• Detailed circuit simulations (SPICE) of a DRAM cell array to model the behavior of DRAM operations
  https://github.com/CMU-SAFARI/DRAM-Voltage-Study
[Figure: circuit model results. Activate and precharge latency (ns) vs. supply voltage (V) from 0.9 V to 1.3 V, with the nominal voltage marked]
• Reliable low-voltage operation requires higher latency

DIMMs Operating at Higher Latency
• Measured minimum latency that does not cause errors in DRAM modules
[Figure: measured minimum activate latency (ns); distribution of latency in the total population, with annotations "40% of modules" and "100% of modules"]
• DRAM requires longer latency to access data without errors at lower voltage
• Lower bound of latency, as our latency adjustment granularity is 2.5 ns
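The effect of a fixed adjustment granularity can be sketched as a stepwise search for the smallest error-free latency. Only the 2.5 ns step comes from the slide; the function name, the 10 ns starting point, the 20 ns ceiling, and the example module are assumptions for illustration.

```python
def min_reliable_latency(passes_at, start_ns=10.0, step_ns=2.5, max_ns=20.0):
    """Return the smallest tested latency (ns) for which passes_at(latency) is True.

    Latency is raised in fixed steps, so the result is only resolved to step_ns.
    Returns None if no tested latency is error-free.
    """
    latency = start_ns
    while latency <= max_ns:
        if passes_at(latency):
            return latency
        latency += step_ns
    return None


# Hypothetical module that needs at least 11.2 ns at some reduced voltage.
result = min_reliable_latency(lambda ns: ns >= 11.2)
```

Here the search reports 12.5 ns even though the hypothetical module would work at 11.2 ns, because 10 ns fails and the next testable value is one full step higher.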

Spatial Locality of Errors
• A module under 1.175 V (12% voltage reduction)
• Errors concentrate in certain regions

Other Results in the Paper
• Error-Correcting Codes (ECC)
  – ECC (SECDED) is not sufficient to mitigate the errors
• Effect of temperature
  – Higher temperature requires higher latency under some voltage levels
• Data retention time
  – Lower voltage does not require more frequent refreshes
• Effect of stored data pattern on error rate
  – Difference is not statistically significant to draw a conclusion

Summary of Key Experimental Observations
• Voltage-induced errors increase as voltage reduces further below Vmin
• Errors exhibit spatial locality
• Increasing the latency of DRAM operations mitigates voltage-induced errors

DRAM Voltage Adjustment to Reduce Energy
• Goal: Exploit the trade-off between voltage and latency to reduce energy consumption
• Approach: Reduce DRAM voltage reliably
  – Performance loss due to increased latency at lower voltage
[Figure: improvement over nominal voltage (%) for performance and DRAM power savings vs. supply voltage (V) from 0.9 V to 1.3 V. Low voltage: high power savings, bad performance. High voltage: low power savings, good performance]

Voltron Overview
• User specifies the performance loss target
• Select the minimum DRAM voltage without violating the target
• How do we predict performance loss due to increased latency under low DRAM voltage?

Linear Model to Predict Performance
• User specifies the performance loss target
• Select the minimum DRAM voltage without violating the target
[Figure: a linear regression model takes the application’s characteristics and the DRAM voltage [1.3 V, 1.25 V, …] and outputs the predicted performance loss [-1%, -3%, …]; the minimum voltage that meets the target becomes the final voltage]

Linear Model to Predict Performance
• Application’s characteristics for the model:
  – Memory intensity: Frequency of last-level cache misses
  – Memory stall time: Amount of time memory requests stall commit inside CPU
• Handling multiple applications:
  – Predict a performance loss for each application
  – Select the minimum voltage that satisfies the performance target for all applications
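The selection step for multiple applications can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the candidate voltages, the latency increase attached to each, the coefficient values, and all names (`App`, `predicted_loss`, `select_voltage`) are hypothetical.

```python
from dataclasses import dataclass

NOMINAL_VOLTAGE = 1.35  # DDR3L nominal supply voltage (V)

# Hypothetical (voltage in V, DRAM latency increase in %) pairs, lowest voltage first.
CANDIDATES = [(1.10, 13.0), (1.15, 9.0), (1.20, 5.0), (1.25, 3.5), (1.30, 2.0)]

# Hypothetical coefficients; a real model would obtain them by regression on training samples.
INTERCEPT, W_LATENCY, W_MPKI, W_STALL = -0.5, 0.2, 0.05, 2.0


@dataclass
class App:
    mpki: float            # memory intensity: last-level cache misses per thousand instructions
    stall_fraction: float  # fraction of time memory requests stall commit


def predicted_loss(app, latency_increase):
    """Predicted performance loss (%) as a linear function of the inputs."""
    return (INTERCEPT + W_LATENCY * latency_increase
            + W_MPKI * app.mpki + W_STALL * app.stall_fraction)


def select_voltage(apps, target_loss):
    """Return the lowest candidate voltage whose predicted loss meets the target for every app."""
    for voltage, latency_increase in CANDIDATES:
        if all(predicted_loss(a, latency_increase) <= target_loss for a in apps):
            return voltage
    return NOMINAL_VOLTAGE  # no reduced voltage satisfies the target


light = App(mpki=2.0, stall_fraction=0.1)
heavy = App(mpki=30.0, stall_fraction=0.6)
chosen = select_voltage([light, heavy], target_loss=3.5)
```

With these made-up numbers the memory-intensive application limits the choice: the light application alone would tolerate 1.10 V, but running both together selects 1.20 V.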

Comparison to Prior Work
• Prior work: Dynamically scale frequency and voltage of the entire DRAM based on bandwidth demand [David+, ICAC’11]
  – Problem: Lowering voltage on the peripheral circuitry decreases channel frequency (memory data throughput)
• Voltron: Reduce voltage to only the DRAM array without changing the voltage to the peripheral circuitry
[Figure: prior work applies low voltage to both the DRAM array and the peripheral circuitry, giving a low-frequency off-chip channel; Voltron applies low voltage to the DRAM array only, keeping a high-frequency off-chip channel]

Exploiting Spatial Locality of Errors
• Key idea: Increase the latency only for DRAM banks that observe errors under low voltage
  – Benefit: Higher performance
[Figure: DRAM array with banks 0 to 2, each marked as high latency or low latency]
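The key idea can be sketched as a per-bank latency table in which only the banks that showed errors get the longer latency. The function name, the bank count, the latency values, and the choice of which bank has errors are all hypothetical.

```python
def per_bank_latency(banks_with_errors, num_banks, base_ns, raised_ns):
    """Map each bank index to the latency (ns) it should use at the reduced voltage."""
    return {
        bank: (raised_ns if bank in banks_with_errors else base_ns)
        for bank in range(num_banks)
    }


# Hypothetical case: three banks, and only bank 1 showed errors at the reduced voltage.
latencies = per_bank_latency({1}, num_banks=3, base_ns=10.0, raised_ns=12.5)
mean_latency = sum(latencies.values()) / len(latencies)
```

In this made-up case the mean latency stays below the 12.5 ns that a uniform increase across all banks would impose, which is where the performance benefit comes from.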

Outline
• Executive Summary
• Motivation
• DRAM Background
• Characterization of DRAM
  – Experimental methodology
  – Impact of voltage on reliability and latency
• Voltron: DRAM Energy Reduction Mechanism
  – Evaluation
• Conclusion

Voltron Evaluation Methodology
• Cycle-level simulator: Ramulator [CAL’15]
  https://github.com/CMU-SAFARI/ramulator
  – McPAT and DRAMPower for energy measurement
• 4-core system with DDR3L memory
• Benchmarks: SPEC2006, YCSB
• Comparison to prior work: MemDVFS [David+, ICAC’11]
  – Dynamic DRAM frequency and voltage scaling
  – Scaling based on the memory bandwidth consumption

Energy Savings with Bounded Performance
[Figure: CPU+DRAM energy savings (%) and performance loss (%) for MemDVFS [David+, ICAC’11] and Voltron at low and high memory intensity, with the performance target marked. Data labels: 7.3%, 1.6%, 1.8%, 3.2%. Annotations: "More savings for high bandwidth applications", "Meets performance target"]
1. Voltron improves energy for both low and high intensity workloads
2. Voltron satisfies the performance loss target via a regression model

Conclusion
• DRAM (memory) power is significant in today’s systems
  – Existing low-voltage DRAM reduces voltage conservatively
• Goal: Understand and exploit the reliability and latency behavior of real DRAM chips under aggressive reduced-voltage operation
• Key experimental observations:
  – Errors occur and increase with lower voltage
  – Errors exhibit spatial locality
  – Higher operation latency mitigates voltage-induced errors
• Voltron: A new DRAM energy reduction mechanism
  – Reduce DRAM voltage without introducing errors
  – Use a regression model to select voltage that does not degrade performance beyond a chosen target
  – 7.3% system energy reduction

BACKUP

Error Rates Across Modules

Error Density

Temperature Impact

Impact on Retention Time

Derivation of More Precise Latency
[Figure: circuit-level SPICE simulation, with the potential latency range marked]
• DRAM circuit model validates our experimental results and provides more precise latency

Performance Loss Correlation
• Observation: Application’s performance loss due to higher latency has a strong linear relationship with its memory intensity
[Figure: performance degradation (%) vs. memory intensity (MPKI), two panels. Panel 1: voltage = 1.2 V (-13%), DRAM latency +5%. Panel 2: voltage = 1.1 V (-23%), DRAM latency +13%]
MPKI = last-level cache misses per thousand instructions

Performance-Aware Voltage Adjustment
• Build a performance (linear-regression) model to predict performance loss based on the selected voltage
  – Model inputs: latency due to voltage adjustment; the running application’s characteristics
  – Model coefficients are trained through 151 application samples
• Use the model to select a minimum voltage that satisfies a performance loss target specified by the user

Linear Model Accuracy
• R² = 0.75 / 0.9 for low and high intensity workloads
• RMSE = 2.8 / 2.5 for low and high intensity workloads
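For readers unfamiliar with the two accuracy metrics, the sketch below shows how R² and RMSE are computed for a least-squares line. It uses a single input variable for brevity, whereas the model in the slides uses several inputs; the function names and the sample points are hypothetical and are not the study's data.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_and_rmse(xs, ys, slope, intercept):
    """Return (R squared, root-mean-square error) of the line on the given points."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / len(ys))


# Hypothetical samples: memory intensity (MPKI) vs. performance loss (%).
mpki = [1.0, 5.0, 10.0, 20.0, 30.0]
loss = [0.5, 1.2, 2.1, 4.3, 5.8]
slope, intercept = fit_line(mpki, loss)
r2, rmse = r2_and_rmse(mpki, loss, slope, intercept)
```

R² near 1 means the line explains most of the variation in the measured loss, while RMSE is in the same unit as the prediction, here percentage points of performance loss.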

Dynamic Voltron

Effect of Exploiting Error Locality

Energy Breakdown

Heterogeneous Workloads

Performance Target Sweep

Sensitivity to Profile Interval Length