P&S SoftMC
Understanding and Improving Modern DRAM Performance, Reliability, and Security with Hands-On Experiments
Hasan Hassan, Prof. Onur Mutlu
ETH Zürich, Fall 2020
7 October 2020

P&S SoftMC: Content
  • We will learn in detail how modern DDR4 DRAM operates
  • You will learn how to characterize DRAM using an FPGA-based DRAM characterization infrastructure (SoftMC)
  • You will use SoftMC to develop your own DRAM experiments and gain hands-on experience in studying DRAM characteristics

P&S SoftMC: Key Takeaways
  • This P&S is aimed at improving your
      – Knowledge in Computer Architecture and Memory Systems
      – Technical skills in running DRAM experiments using real devices
      – Critical thinking and analysis
      – Interaction with a nice group of researchers
      – Familiarity with key research directions
      – Technical presentation of your project

P&S SoftMC: Key Goal
(Learn how to) study real memory devices using an FPGA-based DRAM infrastructure to gain new insights on DRAM behavior

Prerequisites of the Course
  • Digital Design and Computer Architecture (or equivalent course)
  • Familiarity with FPGA programming
  • Interest in low-level hacking and memory
  • Interest in discovering why things do or do not work and solving problems

Course Info: Who Are We? (I)
  • Onur Mutlu
      – Full Professor @ ETH Zurich ITET (INFK), since September 2015
      – Strecker Professor @ Carnegie Mellon University ECE/CS, 2009-2016, 2016-…
      – Ph.D. from UT-Austin, worked at Google, VMware, Microsoft Research, Intel, AMD
      – https://people.inf.ethz.ch/omutlu/
      – omutlu@gmail.com (Best way to reach me)
      – https://people.inf.ethz.ch/omutlu/projects.htm
  • Research and Teaching in:
      – Computer architecture, computer systems, hardware security, bioinformatics
      – Memory and storage systems
      – Hardware security, safety, predictability
      – Fault tolerance
      – Hardware/software cooperation
      – Architectures for bioinformatics, health, medicine
      – …

Course Info: Who Are We? (II)
  • Lead Supervisor:
      – Hasan Hassan
  • Supervisors:
      – Jeremie Kim
      – Lois Orosa
      – Minesh Patel
      – Giray Yaglikci
  • Get to know us and our research
      – https://safari.ethz.ch/safari-group/

Onur Mutlu’s SAFARI Research Group
Computer architecture, HW/SW, systems, bioinformatics, security, memory
https://safari.ethz.ch/safari-newsletter-april-2020/
38+ Researchers
Think BIG, Aim HIGH!
https://safari.ethz.ch

Current Research Focus Areas
Research Focus: Computer architecture, HW/SW, bioinformatics
  • Memory and storage (DRAM, flash, emerging), interconnects
  • Heterogeneous & parallel systems, GPUs, systems for data analytics
  • System/architecture interaction, new execution models, new interfaces
  • Energy efficiency, fault tolerance, hardware security, performance
  • Genome sequence analysis & assembly algorithms and architectures
  • Biologically inspired systems & system design for bio/medicine
Broad research spanning apps, systems, logic with architecture at the center
[Figure labels: Hybrid Main Memory; Persistent Memory/Storage; Heterogeneous Processors and Accelerators; Graphics and Vision Processing]

Course Info: How About You?
  • Let us know your background, interests
  • Why did you join this P&S?
  • Please submit HW0

Course Requirements and Expectations
  • Attendance required for all meetings
  • Study the learning materials
  • Each student will carry out a hands-on project
      – Build, implement, code, and design with close engagement from the supervisors
  • Participation
      – Ask questions, contribute thoughts/ideas
      – Read relevant papers
We will help in all projects!
If your work is really good, you may get it published!

Course Website
  • https://safari.ethz.ch/projects_and_seminars/doku.php?id=softmc
  • Useful information about the course
  • Check your email frequently for announcements

Meeting 1
  • Required materials:
      – SoftMC Tutorial Video: https://youtu.be/909uTQu0lbA
      – SoftMC lecture: https://www.youtube.com/watch?v=tnSPEP3t-Ys
      – Paper describing SoftMC: https://people.inf.ethz.ch/omutlu/pub/softMC_hpca17.pdf
      – Example RowHammer study using SoftMC: https://people.inf.ethz.ch/omutlu/pub/Revisiting-RowHammer_isca20.pdf
  • Recommended materials:
      – Example security attack study using SoftMC: https://people.inf.ethz.ch/omutlu/pub/rowhammer-TRRespass_ieee_security_privacy20.pdf
      – Example neural network acceleration study using SoftMC: https://people.inf.ethz.ch/omutlu/pub/EDEN-efficient-DNN-inference-with-approximate-memory_micro19.pdf
      – Example random number generation study using SoftMC: https://people.inf.ethz.ch/omutlu/pub/drange-dram-latency-based-true-random-number-generator_hpca19.pdf
      – Example physical unclonable function study using SoftMC: https://people.inf.ethz.ch/omutlu/pub/dram-latency-puf_hpca18.pdf
      – The original RowHammer study using SoftMC: https://people.inf.ethz.ch/omutlu/pub/dram-row-hammer_isca14.pdf

Meeting 2 (October 15th)
  • We will announce the projects and briefly describe them
  • We will give you a chance to select a project
  • Then, we will have 1-1 meetings to match your interests, skills, and background with a suitable project
  • It is important that you study the learning materials before our next meeting!

Next Meetings
  • Individual meetings with your mentor/s
  • Tutorials and short talks
      – DRAM Characterization and SoftMC
      – Recent research works
  • Presentation of your work

An Introduction to DRAM and SoftMC

SoftMC: A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies
Hasan Hassan, Nandita Vijaykumar, Samira Khan, Saugata Ghose, Kevin Chang, Gennady Pekhimenko, Donghyuk Lee, Oguz Ergin, Onur Mutlu
HPCA 2017

Executive Summary
  • Two critical problems of DRAM: Reliability and Performance
      – Recently-discovered bug: RowHammer
  • Characterize, analyze, and understand DRAM cell behavior
  • We design and implement SoftMC, an FPGA-based DRAM testing infrastructure
      – Flexible and Easy to Use (C++ API)
      – Open-source (github.com/CMU-SAFARI/SoftMC)
  • We implement two use cases
      – A retention time distribution test
      – An experiment to validate two latency reduction mechanisms
  • SoftMC enables a wide range of studies

Outline
  1. DRAM Basics & Motivation
  2. SoftMC
  3. Use Cases
      – Retention Time Distribution Study
      – Evaluating Recently-Proposed Ideas
  4. Future Research Directions
  5. Conclusion

DRAM Operations
[Figure labels: CPU; Memory Controller; Memory Bus; DRAM Row; DRAM Cell; Sense Amplifier; commands: Activate, Read, Precharge]

DRAM Latency
[Figure: timeline from 0 (refresh) to 64 ms with commands Activate, Read, Precharge, Activate; labels: DRAM Cell, Sense Amplifier, Ready-to-access Latency, Activation Latency, Precharge Latency]
Retention Time: The interval during which the data is retained correctly in the DRAM cell without accessing it

Latency vs. Reliability
[Figure: timeline with commands Activate, Read, Precharge, Activate; labels: DRAM Cell, Sense Amplifier, Ready-to-access Latency, Activation Latency, Precharge Latency]
Violating latencies negatively affects DRAM reliability

Other Factors Affecting Reliability and Latency
  • Temperature
  • Voltage
  • Inter-cell Interference
  • Manufacturing Process
  • Retention Time
  • …
To develop new mechanisms improving reliability and latency, we need to better understand the effects of these factors

Characterizing DRAM
Many of the factors affecting DRAM reliability and latency cannot be properly modeled
We need to perform experimental studies of real DRAM chips

Goals of a DRAM Testing Infrastructure
  • Flexibility
      – Ability to test any DRAM operation
      – Ability to test any combination of DRAM operations and custom timing parameters
  • Ease of use
      – Simple programming interface (C++)
      – Minimal programming effort and time
      – Accessible to a wide range of users who may lack experience in hardware design

SoftMC: High-level View
  • FPGA-based memory characterization infrastructure
  • Prototype using Xilinx ML605
  • Easily programmable using the C++ API

SoftMC: Key Components
  1. SoftMC API
  2. PCIe Driver
  3. SoftMC Hardware

SoftMC API (Old)
Writing data to DRAM:

    InstructionSequence iseq;
    iseq.insert(genACT(bank, row));
    iseq.insert(genWAIT(tRCD));
    iseq.insert(genWR(bank, col, data));
    iseq.insert(genWAIT(tCL + tBL + tWR));
    iseq.insert(genPRE(bank));
    iseq.insert(genWAIT(tRP));
    iseq.insert(genEND());
    iseq.execute(fpga);

genACT, genWAIT, genWR, genPRE, and genEND are instruction generator functions.
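The write sequence above pairs every DRAM command with the wait it requires. The sketch below models that structure on the host only, so it can be compiled without an FPGA. The `Instr` struct, the `Seq` class, and the timing values (in cycles) are assumptions made for illustration; they are not the SoftMC API or real DDR timings.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a SoftMC instruction; not the real encoding.
struct Instr {
    std::string op;  // "ACT", "WR", "PRE", "WAIT", or "END"
    uint32_t arg;    // wait cycles for WAIT, unused otherwise in this model
};

// Hypothetical stand-in for InstructionSequence: it only records instructions.
class Seq {
public:
    void insert(const Instr& i) { instrs_.push_back(i); }

    // Sum of the cycles of all WAIT instructions in the sequence.
    uint32_t totalWaitCycles() const {
        uint32_t total = 0;
        for (const auto& i : instrs_)
            if (i.op == "WAIT") total += i.arg;
        return total;
    }

    std::size_t size() const { return instrs_.size(); }

private:
    std::vector<Instr> instrs_;
};

// Illustrative timing values in cycles (assumed, not from a datasheet).
constexpr uint32_t tRCD = 5, tCL = 5, tBL = 4, tWR = 6, tRP = 5;

// Build the ACT -> WR -> PRE write sequence shown on the slide.
Seq buildWriteSequence() {
    Seq s;
    s.insert({"ACT", 0});
    s.insert({"WAIT", tRCD});
    s.insert({"WR", 0});
    s.insert({"WAIT", tCL + tBL + tWR});
    s.insert({"PRE", 0});
    s.insert({"WAIT", tRP});
    s.insert({"END", 0});
    return s;
}
```

Changing one of the assumed timing constants and rebuilding the sequence is the host-side analogue of testing a custom timing parameter.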

SoftMC API (New)
  • SoftMCPlatform:
      – execute(): starts execution of a SoftMC program
      – receiveData(): gets data from PCIe
  • Program:
      – add_inst(): adds an instruction
      – add_branch(): adds a branch instruction
      – add_label(): adds a branch target
  • SoftMC Instructions:
      – Arithmetic & Logic: AND, OR, XOR, ADD, SUB, LI, MV, …
      – Scratchpad Memory: LD, ST
      – DRAM Commands: ACT, PRE, READ, WRITE, REF, …
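Labels and branches are what let a SoftMC program loop on the FPGA instead of being unrolled on the host. The sketch below is a host-side model of that idea only. The method names follow the slide, but the signatures, the string-based instruction storage, and the `branch_target` helper are assumptions for illustration and do not reflect the actual SoftMC API.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a program builder; signatures are assumed.
class Program {
public:
    // Append a plain instruction (stored as text in this model).
    void add_inst(const std::string& inst) { code_.push_back({inst, ""}); }

    // Append a branch that jumps to the named label.
    void add_branch(const std::string& target) { code_.push_back({"BR", target}); }

    // Mark the index of the next instruction as a branch target.
    void add_label(const std::string& name) { labels_[name] = code_.size(); }

    // Hypothetical helper: resolve the branch at index i to its target index.
    // Returns size() if the entry is not a branch or the label is unknown.
    std::size_t branch_target(std::size_t i) const {
        if (i >= code_.size() || code_[i].first != "BR") return code_.size();
        auto it = labels_.find(code_[i].second);
        return it == labels_.end() ? code_.size() : it->second;
    }

    std::size_t size() const { return code_.size(); }

private:
    std::vector<std::pair<std::string, std::string>> code_;  // (opcode text, label)
    std::map<std::string, std::size_t> labels_;              // label -> instruction index
};

// A loop body that activates, reads, and precharges, then branches back.
Program buildReadLoop() {
    Program p;
    p.add_inst("LI r0, 0");       // loop counter (illustrative operand syntax)
    p.add_label("loop");          // marks index 1
    p.add_inst("ACT");
    p.add_inst("READ");
    p.add_inst("PRE");
    p.add_inst("ADD r0, r0, 1");
    p.add_branch("loop");         // jumps back to the ACT at index 1
    return p;
}
```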

SoftMC: Key Components
  1. SoftMC API
  2. PCIe Driver*: communicates raw data with the FPGA
  3. SoftMC Hardware
* Jacobsen, Matthew, et al. "RIFFA 2.1: A reusable integration framework for FPGA accelerators." TRETS, 2015

SoftMC Hardware (Old)
[Figure labels: Host Machine; Instructions; PCIe Controller; SoftMC Hardware (FPGA) containing Instruction Receiver, Instruction Queue, Instruction Dispatcher, Autorefresh Controller, Calibration Controller, Read Capture, DDR PHY; DRAM; example instructions: Activate, Wait (Ready-to-access Latency), Read; Data]

SoftMC Hardware (New)
[Figure labels: Host Machine; SoftMC Program; Instructions; PCIe Controller; SoftMC Hardware (FPGA) containing Simple Processor, Program, PC, Instruction Dispatcher, Autorefresh Controller, Calibration Controller, Read Capture, DDR PHY; DRAM]

Retention Time Distribution Study
  1. Write Reference Data to a Row
  2. Wait (Refresh Interval)
  3. Read Back
  4. Observe Errors
  5. Increase the refresh interval and repeat
Can be implemented with just ~100 lines of code
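The test loop above can be sketched against a simulated row to show its control flow without hardware. Everything here is an assumption for illustration: the `SimRow` toy model (each byte has a fixed retention time and is fully corrupted once the wait exceeds it), the all-ones reference pattern, and the function names. A real experiment would issue DRAM commands through SoftMC instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of one DRAM row: each byte has an assumed retention time in ms.
// This is not measured DRAM behavior.
struct SimRow {
    std::vector<uint32_t> retention_ms;  // per-byte retention time
    std::vector<uint8_t> data;           // stored bytes

    // Fill the whole row with the reference pattern.
    void write(uint8_t pattern) { data.assign(retention_ms.size(), pattern); }

    // Model an unrefreshed wait: bytes whose retention time is shorter
    // than the wait are corrupted (here, all bits flipped).
    void wait(uint32_t interval_ms) {
        for (std::size_t i = 0; i < data.size(); ++i)
            if (retention_ms[i] < interval_ms)
                data[i] = static_cast<uint8_t>(~data[i]);
    }
};

// Write reference data, wait, read back, and count mismatching bytes.
std::size_t countErrors(SimRow& row, uint8_t pattern, uint32_t interval_ms) {
    row.write(pattern);
    row.wait(interval_ms);
    std::size_t errors = 0;
    for (uint8_t b : row.data)
        if (b != pattern) ++errors;
    return errors;
}

// Increase the refresh interval step by step and record the error count.
std::vector<std::size_t> sweep(SimRow& row, uint32_t start_ms,
                               uint32_t step_ms, int steps) {
    std::vector<std::size_t> result;
    for (int s = 0; s < steps; ++s)
        result.push_back(countErrors(
            row, 0xFF, start_ms + static_cast<uint32_t>(s) * step_ms));
    return result;
}
```

In this model the error count can only grow as the interval increases, because the reference data is rewritten before every wait.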

Retention Time Test: Results
[Figure: Number of Erroneous Bytes (0 to 8000) vs. Refresh Interval (0 to 8 s) for Module A, Module B, and Module C at ~20°C (room temperature)]
Validates the correctness of the SoftMC Infrastructure

Accessing Highly-charged Cells Faster
  • NUAT (Shin+, HPCA 2014)
  • ChargeCache (Hassan+, HPCA 2016)
A highly-charged cell can be accessed with low latency

How Is a Highly-Charged Cell Accessed Faster?
[Figure: timeline from 0 (refresh) to 64 ms with commands Activate, Read, Precharge, Activate; labels: DRAM Cell, Sense Amplifier, Ready-to-access Latency, Activation Latency, Precharge Latency]

Ready-to-access Latency Test
  1. Write Reference Data
  2. Wait for the Wait Interval
  3. Read Back (with custom ready-to-access latency parameter)
  4. Observe Errors
  5. Change the Wait Interval and repeat
Longer wait: lower cell charge. Shorter wait: higher cell charge.
Can be implemented with just ~150 lines of code

Ready-to-access Latency: Results
[Figure: Number of Erroneous Bytes (0 to 500) vs. Wait Interval (32 to 488 ms) at 80°C, with expected and real curves for several ready-to-access latency values (in cycles)]
We do not observe the expected latency reduction effect in existing DRAM chips

Why Don’t We See the Latency Reduction Effect?
The memory controller cannot externally control when a sense amplifier gets enabled in existing DRAM chips
[Figure: cell charge over time after ACT for Data 1 and Data 0; labels: Enabling the Sense Amplifier, Ready to Access Charge Level, Potential Reduction, R/W, Fixed Latency!]

Future Research Directions
  • More Characterization of DRAM
      – How are the cell characteristics changing with different generations of technology nodes?
      – What types of usage accelerate aging?
  • Characterization of Non-volatile Memory
  • Extensions
      – Memory Scheduling
      – Workload Analysis
      – Testbed for in-memory Computation

Conclusion
  • SoftMC: First publicly-available FPGA-based DRAM testing infrastructure
  • Flexible and Easy to Use
  • Implemented two use cases
      – Retention Time Distribution Study
      – Evaluation of two recently-proposed latency reduction mechanisms
  • SoftMC can enable many other studies, ideas, and methodologies in the design of future memory systems
Download our first prototype: github.com/CMU-SAFARI/SoftMC
