Combating Bit Errors From Stuck Cells in Flash Memory Using Novel Information Theory Techniques • Ravi Motwani, Zion Kwok, Poovaiah Palangappa • NVMW 2018
OUTLINE • Problem: stuck cells are bad for LDPC codes; read-time solutions are bad for QoS • Solution: write-time data encoding for handling stuck cells • Results
STUCK CELLS IMPACT HIGH CONFIDENCE BUCKET • [Figure: threshold-voltage distributions for L0 (logical bit 1) and L1 (logical bit 0), each divided into low-, medium-, and high-confidence regions] • Stuck cells read as high-confidence 0s
IMPACT OF STUCK CELLS IN 3D NAND • SBR performance degradation
EXISTING SOLUTIONS IN LITERATURE • By code construction (Heegard et al.): overhead increases as the probability of opens increases • PAYG, ECP, etc.: handle stuck cells by storing error correction pointers; dynamically allocated; QoS impact due to reads from another NAND die/memory
HANDLING OPENS DURING READ • Assume an SDD ECC fatal • Perform an SDD decode with extra sensing at Vwlrvmax • Open circuits read as very-high-confidence logical-0 (VHC0) bits • The LDPC decoder treats the VHC0 bits as erasures • [Figure: L0 and L1 threshold-voltage distributions with the Vwlrvmax read reference marked]
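The erasure step on this slide can be sketched as follows. This is a minimal illustration, not the decoder used in the talk: it assumes the decoder takes per-bit log-likelihood ratios (LLRs), that an erasure is expressed as an LLR of 0, and that the extra sensing at Vwlrvmax has already produced a boolean mask of VHC0 positions. The function name and its parameters are hypothetical.

```python
from typing import List


def llrs_with_erasures(soft_llrs: List[float], vhc0_mask: List[bool]) -> List[float]:
    """Return decoder-input LLRs with VHC0 positions erased.

    soft_llrs: per-bit LLRs from the regular soft read (assumed input format).
    vhc0_mask: True where the extra sensing flagged a very-high-confidence 0.
    An LLR of 0.0 carries no information about the bit, which is how an
    erasure is commonly presented to a message-passing decoder.
    """
    if len(soft_llrs) != len(vhc0_mask):
        raise ValueError("LLR vector and VHC0 mask must have the same length")
    return [0.0 if erased else llr for llr, erased in zip(soft_llrs, vhc0_mask)]


if __name__ == "__main__":
    # The third bit was flagged as VHC0, so its confident value is discarded.
    print(llrs_with_erasures([4.0, -2.5, 6.0], [False, False, True]))
```

Erasing the flagged positions stops a stuck cell from injecting a confident but possibly wrong value into the decoder; the cost, as the slides note, is the extra sensing at read time.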
ERRORS AND ERASURES DECODING OF LDPC CODES • Errors-and-erasures decoding gives close to regular SDD performance • Impacts QoS due to the special SDD read with extra sensing • Can we handle stuck cells during write instead?
SOLUTION PROPOSAL • Handle stuck cells in the encoder: incurs a write-time penalty; no read-time penalty • We need: (1) stuck-cell locations during encoding (prior to the write, a sensing at a suitable read reference identifies the opens in the band); (2) redundant bits to explicitly handle stuck cells
SOLUTION DISCUSSION • Total RBER has two components: (1) Vt distribution overlap, call this RBER_Vt; (2) stuck cells • Total RBER = p/2 + RBER_Vt, where p is the probability of a bit being a stuck cell • Divide and conquer: a data transformation for opens, LDPC for errors due to Vt overlap
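The factor of 1/2 in the stuck-cell term follows from the earlier observation that stuck cells read as 0: a stuck cell is only in error when the data bit written to it was a 1. The sketch below checks this numerically. It assumes uniformly random data and independent stuck cells; the names `p_stuck`, `rber_vt`, and both functions are illustrative and do not come from the slides.

```python
import random


def simulate_stuck_rber(p_stuck: float, n_bits: int, seed: int = 0) -> float:
    """Estimate the bit error rate caused by stuck cells alone.

    Assumptions: each cell is stuck independently with probability p_stuck,
    a stuck cell always reads 0, and data bits are uniformly random.
    """
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_bits):
        data_bit = rng.getrandbits(1)
        is_stuck = rng.random() < p_stuck
        # A stuck cell reads 0, so it is wrong only when a 1 was intended.
        if is_stuck and data_bit == 1:
            errors += 1
    return errors / n_bits


def total_rber(p_stuck: float, rber_vt: float) -> float:
    """Total RBER as the sum of the stuck-cell term and the Vt-overlap term."""
    return p_stuck / 2 + rber_vt


if __name__ == "__main__":
    print("simulated stuck-cell RBER:", simulate_stuck_rber(0.02, 200_000, seed=1))
    print("model total RBER:", total_rber(0.02, 0.003))
```

The simulated stuck-cell rate should land close to p/2; the additive model treats the two error sources as separate, which is a good approximation when both rates are small.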
DATA TRANSFORMATION • [Figure: encoded data aligned against a stuck cell]
FLIP-N-WRITE • Data 0 1 1 1 0 written over a stuck cell: conflict (×) • Flipped data 1 0 0 0 1 written over the same stuck cell: agreement • [1] S. Cho et al., "Flip-N-Write: A simple deterministic technique to improve PRAM write performance, energy and endurance," MICRO, 2009
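The flip decision illustrated on this slide can be sketched as below. This is a minimal version under stated assumptions: stuck cells read back as logical 0 (as in the earlier slides), the stuck locations are known before the write, and the flag bit itself is stored reliably. The function names are hypothetical, and the stuck position used in the example (index 2) is an assumption, since the slide's figure does not survive in the text.

```python
from typing import List, Tuple


def fnw_encode(data: List[int], stuck: List[bool]) -> Tuple[List[int], int]:
    """Return (bits_to_write, flag); flag is 1 when the word was inverted."""
    # A stuck cell reads 0, so it conflicts with any 1 written to it.
    conflicts = sum(1 for bit, s in zip(data, stuck) if s and bit == 1)
    conflicts_if_flipped = sum(1 for bit, s in zip(data, stuck) if s and bit == 0)
    if conflicts_if_flipped < conflicts:
        return [1 - bit for bit in data], 1
    return list(data), 0


def read_back(written: List[int], stuck: List[bool]) -> List[int]:
    """Model the read: stuck cells return 0 regardless of what was written."""
    return [0 if s else bit for bit, s in zip(written, stuck)]


def fnw_decode(read_bits: List[int], flag: int) -> List[int]:
    """Undo the inversion when the flag is set."""
    return [1 - bit for bit in read_bits] if flag else list(read_bits)


if __name__ == "__main__":
    data = [0, 1, 1, 1, 0]
    stuck = [False, False, True, False, False]  # assumed stuck position
    written, flag = fnw_encode(data, stuck)
    print(written, flag)
    print(fnw_decode(read_back(written, stuck), flag) == data)
```

With a single stuck cell in the word, one of the two polarities always agrees with it, so the word is recovered exactly. With several stuck cells, the encoder can only pick the polarity with fewer conflicts.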
SECTIONALIZED FNW • [Figure: codeword layout. Baseline: payload and parity. With FNW: FNW(payload) and parity] • The payload is divided into m sections • FNW encoding is performed for each section • The m FNW flag bits are then appended to the data and encoded by the LDPC encoder • Parity is unprotected from opens
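The per-section encoding described on this slide can be sketched as follows. It is an illustration only: it assumes stuck cells read as 0, that the payload length is a multiple of m, and that the flag bits are recovered correctly after LDPC decoding. The LDPC encoding step itself is omitted, and all function names are hypothetical.

```python
from typing import List, Tuple


def sectionalized_fnw_encode(
    payload: List[int], stuck: List[bool], m: int
) -> Tuple[List[int], List[int]]:
    """Split the payload into m sections and apply FNW to each one.

    Returns (encoded_payload, flags) with one flag bit per section.
    """
    if m <= 0 or len(payload) % m != 0 or len(payload) != len(stuck):
        raise ValueError("payload must split evenly into m sections")
    size = len(payload) // m
    encoded: List[int] = []
    flags: List[int] = []
    for i in range(m):
        section = payload[i * size:(i + 1) * size]
        section_stuck = stuck[i * size:(i + 1) * size]
        ones_on_stuck = sum(b for b, s in zip(section, section_stuck) if s)
        zeros_on_stuck = sum(1 - b for b, s in zip(section, section_stuck) if s)
        # Flip only when inversion leaves fewer 1s on stuck-at-0 cells.
        flip = zeros_on_stuck < ones_on_stuck
        encoded.extend((1 - b) if flip else b for b in section)
        flags.append(int(flip))
    return encoded, flags


def sectionalized_fnw_decode(bits: List[int], flags: List[int]) -> List[int]:
    """Invert every section whose flag bit is set."""
    size = len(bits) // len(flags)
    decoded: List[int] = []
    for i, flag in enumerate(flags):
        section = bits[i * size:(i + 1) * size]
        decoded.extend((1 - b) if flag else b for b in section)
    return decoded


if __name__ == "__main__":
    payload = [1, 0, 1, 1, 0, 0, 1, 0]
    stuck = [True, False, False, False, False, True, False, False]
    encoded, flags = sectionalized_fnw_encode(payload, stuck, m=2)
    print(encoded, flags)
```

Smaller sections make it more likely that each section holds at most one stuck cell, at the price of more flag bits; that is the trade-off behind the choice of m.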
IMPACT ON LDPC RATE • Since FNW uses flag bits, the LDPC code becomes weaker by m bits • RBER degradation due to the weaker code • Transformed data works on a lower-RBER channel • Let q be the probability of an open error after FNW encoding; q < p/2, where p is the probability of a bit being a stuck cell • Total RBER = q + RBER_Vt, where RBER_Vt is the Vt-overlap component
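The reduction in the open-error probability after FNW can be estimated with a short calculation. The sketch below is a simplified model and not the analysis from the talk: it assumes each of the n cells in a section is stuck independently with probability p, data bits are uniformly random, stuck cells read as 0, and the flag bit is never stuck. Under those assumptions, a section with k stuck cells and c conflicts ends up with min(c, k - c) errors after the flip decision.

```python
from math import comb


def post_fnw_open_error_rate(p: float, n: int) -> float:
    """Expected fraction of bits still in conflict after FNW on an n-bit section."""
    expected_errors = 0.0
    for k in range(n + 1):  # k = number of stuck cells in the section
        prob_k = comb(n, k) * p**k * (1 - p) ** (n - k)
        # c conflicts without flipping, k - c with flipping; FNW keeps the smaller.
        expected_min = sum(comb(k, c) * min(c, k - c) for c in range(k + 1)) / 2**k
        expected_errors += prob_k * expected_min
    return expected_errors / n


if __name__ == "__main__":
    p = 0.01
    for n in (16, 64, 256):
        print(n, post_fnw_open_error_rate(p, n), "baseline p/2 =", p / 2)
```

In this model the post-FNW rate stays below the p/2 baseline for every section size, and it shrinks as sections get shorter, which is the gain that has to be weighed against the parity given up to flag bits.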
SIMULATION RESULTS • Restores the SDD performance
PERFORMANCE IMPACT • Prep stage: read a page at the Vwlrvmax reference voltage; save the open locations • The open locations may need to be updated periodically • While programming: read the saved open locations for the page being programmed
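The prep-stage bookkeeping on this slide can be sketched as a small lookup structure. Everything here is hypothetical: the class, its methods, and the `sense_page` callback (standing in for a read at the Vwlrvmax reference that returns one boolean per cell) are illustrative names, not a real controller or NAND API.

```python
from typing import Callable, Dict, List, Set


class OpenLocationStore:
    """Caches the open (stuck) cell offsets for each page."""

    def __init__(self, sense_page: Callable[[int], List[bool]]) -> None:
        # sense_page(page) is assumed to return True for every open cell.
        self._sense_page = sense_page
        self._opens: Dict[int, Set[int]] = {}

    def refresh(self, page: int) -> None:
        """Prep stage: sense the page and save the open locations."""
        mask = self._sense_page(page)
        self._opens[page] = {i for i, is_open in enumerate(mask) if is_open}

    def opens_for(self, page: int) -> Set[int]:
        """Program time: look up the saved locations without a new sensing."""
        return self._opens.get(page, set())


if __name__ == "__main__":
    def fake_sense(page: int) -> List[bool]:
        return [False, True, False, False, True] if page == 7 else [False] * 5

    store = OpenLocationStore(fake_sense)
    store.refresh(7)
    print(sorted(store.opens_for(7)))
```

Calling `refresh` again models the periodic update mentioned on the slide; because the sensing runs in the background, the program path only pays for the lookup.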
SUMMARY • Handling opens during reads • Data transformation to reduce contention between data and open-reads • A few overhead bits take care of opens • Opens information gathered as a background operation • Lower QoS impact
Thank you