Mobile System Considerations for SDRAM Interface Trends
Andrew B. Kahng†‡, Vaishnav Srinivas‡¥
June 5th, 2011
CSE† and ECE‡ Departments, University of California, San Diego
Qualcomm Inc.¥
Outline
• SDRAM Memory Interfaces: Today and Tomorrow
• Motivation
• Trends in DRAM Density and Data Rate
• Trends in Mobile Processor Requirements
• Memory Interface Calculator
• Exploration Using the Calculator
• Summary and Next Steps
(2/13)
SDRAM Memory Interfaces: Today and Tomorrow
• Various interconnect and signaling options exist:
  o Interconnect: Die stack/MCP, POP, DIMM, 3D-Stack
  o Signaling: DDR, XDR, Serial, Wide IO
• Exploring these options against the primary bounds (Capacity, Throughput, Power and Latency) is required to make the correct tradeoffs
(3/13)
Motivation
• The memory interface calculator includes:
  o IO switching, bias and termination power
  o IO/PHY and interconnect latencies
  o Input parameters for exploration:
    • Termination values
    • Loading
    • Number of data and strobe pins
    • Memory timing parameters
    • IO/PHY "retiming" power
• Predict gaps between offerings and requirements
• Integrating into CACTI can help exploration of system metrics
(4/13)
Trends in DRAM Capabilities
• DRAM densities to double every 3 years
• Projections for DRAM densities revised downwards over time
• Current densities at 4 Gb/die
• DRAM data rates to double every 4-5 years
• Projections for DRAM data rates revised upwards over time
• Current data rates at 2.2 Gb/s
(5/13)
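The doubling periods on this slide imply a simple exponential projection. The following is a minimal sketch, assuming smooth exponential growth from the slide's current values (4 Gb/die and 2.2 Gb/s) and taking the talk's year, 2011, as the base year. The function name `project` is illustrative and is not part of the calculator.

```python
def project(base_value, doubling_period_years, years_ahead):
    """Project a quantity that doubles every `doubling_period_years`."""
    return base_value * 2 ** (years_ahead / doubling_period_years)

# Current values from the slide; treating 2011 as the base year is an assumption.
density_gb_per_die = 4.0
data_rate_gbps = 2.2

for years in (3, 6):
    density = project(density_gb_per_die, 3, years)
    rate_slow = project(data_rate_gbps, 5, years)  # doubling every 5 years
    rate_fast = project(data_rate_gbps, 4, years)  # doubling every 4 years
    print(f"+{years} yr: density ~{density:.0f} Gb/die, "
          f"data rate ~{rate_slow:.1f}-{rate_fast:.1f} Gb/s")
```

Because the slide notes that projections have been revised over time in both directions, these numbers are best read as a rough guide rather than a forecast.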
Trends in Mobile Processor Requirements
• Trends for mobile processor requirements
  o Capacity to scale 3-4x every 3 years
  o Throughput to double every 3 years
• The requirements are very dynamic!
• Quick exploration and projection for compatible memories is useful

Capacity Requirements in GB (Source: IDC)
Market    2010   2011   2012   2013   2014
Desktop   3.0    4.2    5.6    7.4    10.2
Laptop    2.0    3.3    4.6    6.3    8.0
Mobile    0.3    0.5    0.8    1.0    1.3

[Figure: Mobile Handset Throughput Requirements in GB/s, 2009-2017 (Source: Qualcomm)]
(6/13)
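The IDC capacity figures can be checked against the "3-4x every 3 years" trend. The sketch below assumes constant exponential growth between the 2010 and 2014 mobile values (0.3 GB and 1.3 GB); the helper name `growth_over` is illustrative only.

```python
def growth_over(start, end, span_years, window_years):
    """Growth factor over `window_years` implied by start -> end over `span_years`."""
    return (end / start) ** (window_years / span_years)

# Mobile capacity requirement in GB (IDC figures above), 2010 -> 2014.
mobile_3yr = growth_over(0.3, 1.3, span_years=4, window_years=3)
print(f"Mobile capacity grows ~{mobile_3yr:.1f}x every 3 years")
```

Under that assumption the mobile figures work out to roughly 3x every 3 years, at the low end of the stated 3-4x range.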
Memory Interface Calculator

Primary Bound   Parameters affected
Capacity        Number of ranks and channels; memory density; capacitive loading
Throughput      Data rate, number of data lanes; timing parameters; signal integrity skew and jitter
Power           Termination scheme; supply voltage; activity factor
Latency         Number of pipeline stages; interconnect delay; memory access time
(7/13)
Memory Interface Calculator Summary
Interfaces: LPDDR2, TSS-Wide IO, DDR3, Serial, Mobile-XDR
• Clock Speed (MHz): 300-533, DDR; 200-333, SDR; 400-800, DDR; 4-8 GHz, Serial; 400-533, Octal
• Throughput (GB/s): 3-4.3; 12-24; 6-13; 12-17
• Peak IO, Peak Core and Total Peak Power Efficiency (mW/GBps): ~40 ~120 ~60 ~20 ~50 ~35 ~100 ~50 ~90 ~45 ~220 ~110 ~70
• Active Idle IO Power (mW): ~6-10; ~2-4; ~500-600; ~450; ~200
• Active Idle Core Power (mW): ~20; ~150; ~20
• Capacity (GB) (Current trends): 0.5-1 for x32 dual rank; 0.5-2 through multi-die stacking; 2-8 for dual-rank DIMM; 0.5-1 for x32 dual rank
• Latency from MC-DRAM-MC: ~50 ns; ~40 ns; ~45 ns, ~65 ns, ~60 ns (plus a DLL or PLL lock penalty, ~512 Tck, if the DLL/PLL is off)
(8/13)
Memory Interface Calculator Summary
• The spider chart highlights the design space covered
  o Wide IO covers the largest space for lower capacities
  o Large capacity systems still need DDR3/DDR4
• Alternatives to be explored outside the existing space?
[Figure: Throughput in GB/s versus year, 2009-2017; series: LPDDR2 (2x32), WideIO (4x128), Serial (x32), Mobile Req]
[Figure: Memory Interface Design Space spider chart; axes: Throughput (2, 25), Power Efficiency (0.002, 0.04), Capacity (0, 8), 1/Latency (0.01, 0.04); series: LPDDR2, DDR3, LPDDR3, DDR4, M-XDR, Serial, Wide IO]
• Before LPDDR3 came up in JEDEC, Wide-IO and Serial Memory were being explored.
• LPDDR3 was brought up as a way to fill this gap in the 2012-2014 timeframe
(9/13)
Exploration using the calculator
• How fast can LPDDR3 operate?
  o With terminations?
  o With DLL/better retiming?
  o With lower loading?
  o With better packaging?
  o POP versus MCP
• Wide IO exploration?
  o Transition to DDR for Wide IO?
  o Number of data lanes per strobe: 8, 16 or 32?
  o When do interface timing and signal/power integrity become an issue for Wide IO?
• High-capacity memory alternatives to DDR3/DDR4?
  o MCP with larger number of wire-bonded dies?
  o TSS with large number of stacks (8?)
  o TSS-MCP if stacking with processor is a thermal risk?
(10/13)
LPDDR3 Exploration

Inputs to the calculator                   Value   Units
Number of memories on data pin             1
Number of memories on add pin              1
Number of memories on clk pin              1
Frequency of clock                         1250    MHz
Retiming current                           25      mA
Number of data pins                        32
Number of DQS pairs                        8
Termination RTT on DQ & DQS                60      ohms
Termination RTT on CA                      60      ohms
Memory density for each memory core        4       Gb
TDS                                        100     ps
TDH                                        100     ps
TDQSQ                                      100     ps
TQHS                                       100     ps

Outputs of the calculator                              Value    Units
Signal Swing on DQ & DQS, VswDQ                        0.80     V
Switching Power on DQS                                 52.80    mW
Switching Power on CLK + CLK diff termination          12.78    mW
Bias and Static Power                                  30.00    mW
Signal Swing on CA, VswDQ                              0.65     V
Switching Power on CA                                  19.24    mW
Termination Power                                      225.45   mW
I/O power for CPU chip                                 393.07   mW
Throughput                                             10       GB/s
Capacity                                               0.5      GB
Latency                                                38.6     ns
Tskew                                                  41       ps
Tjitter                                                29       ps
Terror                                                 20       ps
Timing margin WRITE                                    60       ps
Timing margin READ                                     -5       ps
(11/13)
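The throughput and capacity outputs follow directly from the inputs. The sketch below is not the calculator itself; it assumes double-data-rate signaling (two transfers per clock on each data pin), and the function names are illustrative.

```python
def peak_throughput_gbytes(clock_mhz, data_pins, transfers_per_clock=2):
    """Peak throughput in GB/s; transfers_per_clock=2 assumes DDR signaling."""
    bits_per_second = clock_mhz * 1e6 * transfers_per_clock * data_pins
    return bits_per_second / 8 / 1e9

def capacity_gbytes(density_gbit, memories):
    """Total capacity in GB for `memories` dies of `density_gbit` Gb each."""
    return density_gbit * memories / 8

throughput = peak_throughput_gbytes(clock_mhz=1250, data_pins=32)
capacity = capacity_gbytes(density_gbit=4, memories=1)

# I/O power per unit of peak throughput, using the 393.07 mW output above.
io_mw_per_gbps = 393.07 / throughput

print(f"Throughput: {throughput:.1f} GB/s")
print(f"Capacity:   {capacity:.1f} GB")
print(f"I/O power efficiency: {io_mw_per_gbps:.1f} mW/GBps")
```

With 1250 MHz, 32 data pins and a single 4 Gb die, this gives 10 GB/s and 0.5 GB, matching the calculator outputs listed on the slide.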
LPDDR3 Exploration
Maximum speeds for the following configurations, with preliminary answers from the calculator:
• POP, Unterminated LPDDR3 with ~150 ps memory timing parameters (tDS/tDH/tDQSQ/tQHS)?
  o 800 MHz for single-rank
  o 800 MHz for dual-rank will need careful architecture and design
• POP, Terminated LPDDR3 with ~100 ps memory timing parameters?
  o 1250 MHz
• External (MCP), Unterminated LPDDR3
  o Even 533 MHz for dual-rank is challenging and may need sophisticated retiming
• External (MCP), Terminated LPDDR3?
  o 1066 MHz
(12/13)
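To see why the memory timing parameters limit these speeds, it helps to compare them with the bit period. The sketch below assumes DDR signaling, so the unit interval is half a clock period, and it only subtracts the memory setup and hold times (tDS + tDH). It is not the calculator's margin formula, which also accounts for skew, jitter and other terms.

```python
def unit_interval_ps(clock_mhz, transfers_per_clock=2):
    """Bit period in ps; transfers_per_clock=2 assumes DDR signaling."""
    return 1e6 / (clock_mhz * transfers_per_clock)

# (label, clock in MHz, tDS = tDH in ps), taken from the two POP cases above.
cases = [
    ("POP, unterminated", 800, 150),
    ("POP, terminated", 1250, 100),
]

for label, clock_mhz, t_setup_hold_ps in cases:
    ui = unit_interval_ps(clock_mhz)
    remaining = ui - 2 * t_setup_hold_ps  # left for skew, jitter, etc.
    print(f"{label}: UI = {ui:.0f} ps, "
          f"after tDS + tDH = {remaining:.0f} ps")
```

At 800 MHz the bit period is 625 ps, and at 1250 MHz it shrinks to 400 ps, so tighter memory timing parameters are needed to leave any budget for the interconnect and PHY.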
Summary and Next Steps
• A simple framework to model interconnect and IO/PHY timing and power for existing and upcoming SDRAM memory interfaces
• Helps explore standards and design space
• Helps identify gaps between DRAM and SOCs
• Next Steps:
  o Integrate the memory interface models within CACTI
  o Challenge the calculator with future usage cases for mobile products
  o Include more parameters, including silicon area, packaging options and number of data lanes per strobe pin
(13/13)