Threshold Voltage Distribution in MLC NAND Flash: Characterization, Analysis, and Modeling
Yu Cai (1), Erich F. Haratsch (2), Onur Mutlu (1), and Ken Mai (1)
(1) DSSC, ECE Department, Carnegie Mellon University
(2) LSI Corporation
3/20/2013
Evolution of NAND Flash Memory
- Aggressive scaling
- MLC technology
Increasing capacity, acceptably low cost, high speed, low power consumption, compact physical size
E. Grochowski et al., “Future technology challenges for NAND flash and HDD products”, Flash Memory Summit 2012
Challenges: Reliability and Endurance
- P/E cycles (required): complete write of the drive 10 times per day for 5 years (STEC), > 50k P/E cycles
- P/E cycles (provided): a few thousand
E. Grochowski et al., “Future technology challenges for NAND flash and HDD products”, Flash Memory Summit 2012
Solutions: Future NAND Flash-based Storage Architecture
Noisy memory (raw bit error rate) -> Signal processing -> Error correction (BER < 10^-15)
Signal processing:
- Read voltage adjusting
- Data scrambler
- Data recovery
- Shadow program
Error correction:
- BCH codes
- Reed-Solomon codes
- LDPC codes
- Other flash-friendly codes
Need to understand NAND flash error patterns and the channel model.
Need to design efficient DSP/ECC and smart error management.
NAND Flash Channel Modeling
Write (Tx) -> Noisy NAND -> Read (Rx)
Simplified NAND flash channel model based on dominant errors:
Write -> Additive White Gaussian Noise -> Cell-to-Cell Interference -> Time-variant Retention -> Read
- Additive white Gaussian noise: erase operation, program page operation
- Cell-to-cell interference: neighbor page program
- Time-variant retention: retention
Testing Platform
[Figure: testing platform. Labeled components: USB board, PCI-e board, HAPS-52 motherboard, Virtex-5 FPGA (NAND controllers), flash board, flash chip]
Characterizing Cell Threshold w/ Read Retry
[Figure: number of cells vs. Vth. Labels: erased state, programmed states P1, P2, P3, bit values 11, 10, 00, 01, read references REF1, REF2, REF3, 0 V, and read-retry steps i-2, i-1, i, i+1, i+2]
- Read-retry feature of new NAND flash
- Tune the read reference voltage and check which Vth region each cell falls in
- Characterize the threshold voltage distribution of flash cells in programmed states through Monte-Carlo emulation
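The read-retry procedure above can be sketched in a few lines. This is a minimal emulation, not the authors' test code: the cell population, the 2.0 V state center, and the 50 mV step size are all assumed values chosen for illustration. Each emulated read only reports how many cells sit below the reference, and differencing consecutive reads recovers the histogram.

```python
import random

def count_below(vth_cells, vref):
    """Emulate one read: count cells whose Vth is below the read reference."""
    return sum(1 for v in vth_cells if v < vref)

def vth_histogram(vth_cells, vref_steps):
    """Cells falling between each pair of consecutive read-retry references."""
    below = [count_below(vth_cells, v) for v in vref_steps]
    return [hi - lo for lo, hi in zip(below, below[1:])]

# Hypothetical programmed state: Gaussian Vth centered at 2.0 V (assumed values).
rng = random.Random(0)
cells = [rng.gauss(2.0, 0.1) for _ in range(10000)]
vref_steps = [1.5 + 0.05 * i for i in range(21)]  # swept references, 1.5 V to 2.5 V
hist = vth_histogram(cells, vref_steps)
```

A finer step gives a higher-resolution histogram at the cost of more reads per page.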
Programmed State Analysis
[Figure panels: P3 state, P2 state, P1 state]
Parametric Distribution Learning
- Parametric distribution
  - Closed-form formula; only a few parameters need to be stored
  - Exponential distribution family, described by a distribution parameter vector
- Maximum likelihood estimation (MLE) to learn the parameters
  - Likelihood function of the observed testing data
  - Goal of MLE: find the distribution parameters that maximize the likelihood function
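As an illustration of MLE for one member of the exponential family, here is a minimal sketch for the Gaussian case, where the maximizer has a closed form (sample mean and the 1/N sample standard deviation). The synthetic data and its parameters are assumptions; the slides do not show the estimation code.

```python
import math
import random

def gaussian_log_likelihood(data, mu, sigma):
    """Log of the likelihood function for i.i.d. Gaussian samples."""
    n = len(data)
    sq = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - sq / (2 * sigma ** 2)

def gaussian_mle(data):
    """Closed-form maximum likelihood estimates (mu, sigma)."""
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / n)
    return mu, sigma

# Synthetic threshold voltages (assumed: mean 2.0 V, sigma 0.1 V).
rng = random.Random(0)
samples = [rng.gauss(2.0, 0.1) for _ in range(5000)]
mu_hat, sigma_hat = gaussian_mle(samples)
```

For families without a closed-form maximizer (for example Beta or Weibull), the same log-likelihood would be maximized numerically instead.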
Selected Distributions
Distribution Exploration
[Figure panels: P1 state, P2 state, P3 state]
Distribution   RMSE
Beta           19.5%
Gamma          20.3%
Gaussian       22.1%
Log-normal     24.8%
Weibull        28.6%
The distribution can be approximately modeled as a Gaussian distribution.
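The slide does not say how the RMSE percentages are normalized. As one possible reading, the sketch below compares a fitted density against a histogram of the data and divides the RMSE by the peak histogram density. That normalization, the bin count, and the synthetic data are all assumptions.

```python
import math
import random

def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def histogram_density(data, lo, hi, bins):
    """Bin centers and normalized histogram density over [lo, hi)."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        if lo <= x < hi:
            counts[min(int((x - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    return centers, [c / (len(data) * width) for c in counts]

def relative_rmse(centers, density, pdf):
    """RMSE between histogram and model, relative to the histogram peak (assumed)."""
    mse = sum((d - pdf(c)) ** 2 for c, d in zip(centers, density)) / len(centers)
    return math.sqrt(mse) / max(density)

rng = random.Random(3)
samples = [rng.gauss(2.0, 0.1) for _ in range(20000)]
centers, density = histogram_density(samples, 1.5, 2.5, 40)
good = relative_rmse(centers, density, lambda x: gaussian_pdf(x, 2.0, 0.1))
bad = relative_rmse(centers, density, lambda x: gaussian_pdf(x, 2.1, 0.1))
```

Running the same comparison with each candidate family's fitted density would produce a ranking of the kind shown in the table.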
Noise Analysis
- Signal and additive noise decoupling
- Power spectral density analysis of P/E noise: flat in the frequency domain
- Auto-correlation analysis of P/E noise: spike at the 0-lag point in the time domain
P/E noise can be approximately modeled as white noise.
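The auto-correlation check can be sketched as follows. The noise sequence here is synthetic (assumed Gaussian with a 0.05 V standard deviation); with measured data it would be the residual left after subtracting the mean programmed level. For white noise, the normalized auto-correlation is 1 at lag 0 and close to 0 at every other lag.

```python
import random

def autocorrelation(samples, max_lag):
    """Normalized auto-correlation for lags 0..max_lag."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    energy = sum(c * c for c in centered)
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / energy
        for lag in range(max_lag + 1)
    ]

rng = random.Random(1)
noise = [rng.gauss(0.0, 0.05) for _ in range(5000)]  # synthetic stand-in for P/E noise
acf = autocorrelation(noise, 10)
```

A flat power spectral density is the frequency-domain counterpart of this single spike at lag 0.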
Independence Analysis over Space
- Correlations among cells in different locations are low (< 5%)
- P/E operation can be modeled as a memoryless channel
- Assuming ideal wear-leveling
Independence Analysis over P/E cycles
- High correlation between threshold voltages at the same location across P/E cycles
- Programming to the same location is modeled as a channel with memory
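Both independence analyses come down to a correlation coefficient between two sets of threshold voltages. The sketch below uses the Pearson coefficient on a toy model in which each cell has a persistent offset plus fresh noise on every program operation. That model and all its numbers are assumptions made only to reproduce the qualitative picture (low correlation across locations, high correlation at the same location), not a claim about the underlying device physics.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

rng = random.Random(2)
n_cells = 20000
offset = [rng.gauss(0.0, 0.10) for _ in range(n_cells)]  # assumed per-cell offset
cycle_a = [2.0 + o + rng.gauss(0.0, 0.03) for o in offset]  # one program pass
cycle_b = [2.0 + o + rng.gauss(0.0, 0.03) for o in offset]  # a later program pass

same_location = pearson(cycle_a, cycle_b)          # same cells, different passes
neighbor_cells = pearson(cycle_a[:-1], cycle_a[1:])  # adjacent cells, same pass
```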
Cycling Noise Analysis
As P/E cycles increase:
- The distribution shifts to the right
- The distribution becomes wider
Cycling Noise Modeling
- Mean value (µ) increases with P/E cycles: exponential model
- Standard deviation (σ) increases with P/E cycles: linear model
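The slide names the model families but not their exact forms. The sketch below assumes µ(n) = a·exp(b·n) and σ(n) = c0 + c1·n, fits both by least squares (the exponential one after taking logs), and uses made-up data points generated from known parameters so the fit can be checked.

```python
import math

def linear_fit(xs, ys):
    """Least-squares line y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

def exponential_fit(xs, ys):
    """Fit y = a * exp(b * x) by a linear fit on log(y); requires y > 0."""
    b, log_a = linear_fit(xs, [math.log(y) for y in ys])
    return math.exp(log_a), b

# Hypothetical measurements, generated from assumed parameters.
cycles = [0, 2000, 4000, 6000, 8000, 10000]
mean_vth = [2.0 * math.exp(1e-5 * n) for n in cycles]
sigma_vth = [0.10 + 2e-6 * n for n in cycles]

a, b = exponential_fit(cycles, mean_vth)
slope, intercept = linear_fit(cycles, sigma_vth)
```

Once fitted, only the four coefficients need to be stored to predict the distribution at any P/E cycle count.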
SNR Analysis
- SNR decreases linearly with P/E cycles
- Degrades at ~0.13 dB per 1000 P/E cycles
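A linear degradation rate makes the SNR budget easy to compute. In this sketch the 0.13 dB per 1000 cycles figure comes from the slide, while the starting SNR of 20 dB is a placeholder, since the slide text does not give the initial value.

```python
DB_PER_KCYCLE = 0.13  # degradation rate stated on the slide

def snr_db(pe_cycles, snr0_db):
    """SNR in dB after pe_cycles, given an assumed fresh-device SNR snr0_db."""
    return snr0_db - DB_PER_KCYCLE * pe_cycles / 1000.0

def cycles_to_lose(loss_db):
    """P/E cycles needed to lose loss_db of SNR under the linear model."""
    return loss_db / DB_PER_KCYCLE * 1000.0

after_3k = snr_db(3000, 20.0)    # 20 dB start is hypothetical; loss is 0.39 dB
budget = cycles_to_lose(1.3)     # cycles until 1.3 dB has been lost
```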
Conclusion & Future Work
- P/E operations modeled as a signal passing through an AWGN channel
  - Approximately Gaussian with 22% distortion
  - P/E noise is white noise
- P/E cycling noise affects threshold voltage distributions
  - The distribution shifts to the right and widens around the mean value
  - Statistics (mean/variance) can be modeled as exponentially correlated with P/E cycles, with 95% accuracy
- Future work
  - Characterization and models for retention noise
  - Characterization and models for program interference noise
Backup Slides
Hard Data Decoding
- The read reference voltage can affect the raw bit error rate
[Figure: two adjacent state distributions f(x) and g(x) along Vth, with marked points v0, vref, v1, and a second panel with a shifted reference v'ref]
- There exists an optimal read reference voltage
- The optimal read reference voltage is predictable
- The distribution's sufficient statistics are predictable (e.g. mean, variance)
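To make the optimal read reference concrete, here is a minimal sketch that assumes two adjacent states are Gaussian and equally likely, and searches a grid of candidate references for the lowest raw bit error rate. The means and standard deviations are placeholders, and the grid search is just one simple way to find the minimum.

```python
import math

def q_function(x):
    """Gaussian tail probability P(Z > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def raw_ber(vref, mu0, sigma0, mu1, sigma1):
    """Error rate between two equally likely Gaussian states split at vref."""
    miss_low = q_function((vref - mu0) / sigma0)   # lower state read above vref
    miss_high = q_function((mu1 - vref) / sigma1)  # upper state read below vref
    return 0.5 * (miss_low + miss_high)

def optimal_vref(mu0, sigma0, mu1, sigma1, steps=10000):
    """Grid search for the reference between mu0 and mu1 with the lowest BER."""
    candidates = [mu0 + (mu1 - mu0) * i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda v: raw_ber(v, mu0, sigma0, mu1, sigma1))

equal_case = optimal_vref(2.0, 0.1, 3.0, 0.1)    # equal spread: near the midpoint
unequal_case = optimal_vref(2.0, 0.1, 3.0, 0.2)  # wider upper state: shifts lower
```

If the mean and variance are predicted from a cycling model, the reference can be recomputed for any P/E cycle count without re-measuring the distribution.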
Soft Data Decoding
- Estimate soft information for soft decoding (e.g. LDPC codes)
[Figure: distributions f(x) and g(x) along Vth with marked points v0, vref, v1; log likelihood ratio (LLR) over the sensed threshold voltage range, with high-confidence and low-confidence regions]
- Closed-form soft information for the AWGN channel
- Assume the same variance to show a simple case
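The slide's formula did not survive extraction. Under the stated equal-variance assumption, and additionally assuming that f and g are Gaussian densities centered at v0 and v1 with common variance σ² (these centers and the sign convention are assumptions), the point-wise LLR reduces to a linear function of the sensed voltage x:

```latex
\mathrm{LLR}(x) = \ln\frac{f(x)}{g(x)}
  = \frac{(x - v_1)^2 - (x - v_0)^2}{2\sigma^2}
  = \frac{(v_1 - v_0)\,(v_0 + v_1 - 2x)}{2\sigma^2}
```

The LLR is zero at the midpoint (v0 + v1)/2, which is the low-confidence region, and its magnitude grows as x moves away from the midpoint, which is the high-confidence region.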
Non-Parametric Distribution Learning
- Non-parametric distribution
- Histogram estimation: count the number K of points falling within a region, where the region is a hypercube of side h in D dimensions (volume h^D)
- Kernel density estimation: smooth Gaussian kernel function
- Summary
  - Pros: accurate model with good predictive performance
  - Cons: too complex; too many parameters need to be stored
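Both estimators can be sketched in one dimension (D = 1). The histogram estimate is K / (N·h) for the bin containing x, and the kernel estimate averages a Gaussian bump of width h centered on every sample. The data and the bandwidth values are assumptions chosen for illustration.

```python
import math
import random

def histogram_density(data, x, h):
    """Histogram estimate at x: K / (N * h), K = points in the bin containing x."""
    lo = math.floor(x / h) * h
    k = sum(1 for d in data if lo <= d < lo + h)
    return k / (len(data) * h)

def gaussian_kde(data, x, h):
    """Kernel density estimate at x with a Gaussian kernel of bandwidth h."""
    norm = len(data) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - d) / h) ** 2) for d in data) / norm

rng = random.Random(4)
samples = [rng.gauss(2.0, 0.1) for _ in range(5000)]  # synthetic Vth samples
hist_est = histogram_density(samples, 2.0, 0.05)
kde_est = gaussian_kde(samples, 2.0, 0.02)
```

The kernel estimate has to keep every sample to evaluate the density, which is the storage cost the summary refers to.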
Probability Density Function (PDF)
[Figure panels: P1 state, P2 state, P3 state]
- Probability density function (PDF) of NAND flash memory, estimated using the non-parametric kernel density methodology