IVEC: Off-Chip Memory Integrity Protection for Both Security and Reliability
Ruirui Huang, G. Edward Suh
Cornell University

Motivation
[Figure: a processor with ECC and IV logic connected to off-chip memory that stores ECC parity and IV hashes. Threats shown: random transient errors and malicious attacks. Capabilities shown: random error detection, random error correction, malicious attack detection.]
§ ECC: it is easy to compute the ECC parity bits for the injected attack data
§ Integrity Verification (IV): execution is aborted when IV fails
§ IV+ECC: twice the overhead for random error detection!

IVEC – Integrity Verification with Error Correction
Can we extend the capability of IV to handle both security and reliability errors with minimal overheads?
§ Goal
• Extend IV to correct errors while ensuring a proper level of security
• Cover both single-bit and multi-bit errors
§ Challenge
• Error correction is essentially finding the erroneous bits
• The cryptographic hash in IV does not reveal error locations

Outline
§ Background
• ECC
• Integrity Verification (IV)
§ IVEC error correction
• Single-bit errors
• Multi-bit errors
§ HW Implementation
§ Evaluation

ECC (SEC-DED)
§ In general, a modern system uses (72, 64) SEC-DED ECC
§ For every 64 bits of data, 8 additional parity bits are needed
§ Memory space and bandwidth overheads of 12.5%
§ Corrects 1-bit errors
[Figure: ECC DIMM with 18 x4 DRAM chips forming a 72-bit SEC-DED ECC word; two extra DRAM chips hold the 8-bit parity.]
§ ECC can be extended to correct common multi-bit errors
§ Chip-kill correct: corrects up to one DRAM chip failure

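The 8 parity bits and the 12.5% overhead can be checked with a few lines of arithmetic. This is a minimal sketch based on the standard Hamming bound plus one overall parity bit; the function name `secded_check_bits` is illustrative and does not come from the slides.

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits for a SEC-DED code over `data_bits` bits of data."""
    r = 0
    # Single-error correction needs 2^r >= data_bits + r + 1 (Hamming bound).
    while (1 << r) < data_bits + r + 1:
        r += 1
    # One extra overall parity bit adds double-error detection.
    return r + 1


data_bits = 64
check_bits = secded_check_bits(data_bits)   # 8, giving a (72, 64) code
overhead = check_bits / data_bits           # 0.125, i.e. 12.5%
```
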
Cryptographic Hash
§ IV relies on a cryptographic hash to detect any changes to data saved in untrusted memory
• Fixed-length “fingerprint” of the data
• Collision resistance is a key property
§ A Message Authentication Code (MAC) is a keyed cryptographic hash that can also be used for IV
[Figure: data (d) stored with its hash (h); on data access, check if h == H(d).]

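The check h == H(d) can be sketched with Python's standard library. This is only an illustration: HMAC-SHA256 truncated to 64 bits stands in for whatever MAC the hardware uses, and `KEY`, `mac`, and `verify` are hypothetical names.

```python
import hashlib
import hmac

KEY = b"hypothetical-on-chip-key"   # assumption: secret held inside the processor


def mac(data: bytes) -> bytes:
    """64-bit tag: HMAC-SHA256 truncated to 8 bytes (stand-in for the real MAC)."""
    return hmac.new(KEY, data, hashlib.sha256).digest()[:8]


def verify(data: bytes, stored_tag: bytes) -> bool:
    """On data access, check h == H(d)."""
    return hmac.compare_digest(mac(data), stored_tag)


block = bytes(range(64))                          # one 64 B cache block
tag = mac(block)                                  # stored alongside the data
tampered = bytes([block[0] ^ 0x01]) + block[1:]   # same block with one bit flipped
```

Here `verify(block, tag)` succeeds, while `verify(tampered, tag)` fails unless the two inputs happen to collide on the 64-bit tag.
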
IV - Hash/MAC Trees
§ Integrity verification techniques often rely on hash/MAC trees
• Any changes in data memory would be detected
§ Previous works suggest that IV’s performance overhead is only 2-5% when using cached MAC trees
[Figure: hash tree over the protected data in memory. Each node is the size of a cache block and hashes its children, e.g. H(h1 || h2 || h3 || h4). The root hash is kept in the processor; all other hashes are in off-chip memory.]

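A small hash tree makes the verification path concrete. This is a sketch under stated assumptions: SHA-256 stands in for the hash, the arity of 4 follows the H(h1 || h2 || h3 || h4) label in the figure, and `build_tree` and `verify_block` are illustrative names.

```python
import hashlib

ARITY = 4   # assumption: four child hashes per node


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(blocks):
    """Return all levels of the tree: leaf hashes first, root level last."""
    levels = [[H(b) for b in blocks]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([H(b"".join(prev[i:i + ARITY]))
                       for i in range(0, len(prev), ARITY)])
    return levels


def verify_block(index, block, levels, trusted_root):
    """Recompute the path from one data block up to the on-chip root."""
    h = H(block)
    for level in levels[:-1]:
        start = (index // ARITY) * ARITY
        group = list(level[start:start + ARITY])   # sibling hashes from untrusted memory
        group[index - start] = h                   # use the recomputed hash for our own slot
        h = H(b"".join(group))
        index //= ARITY
    return h == trusted_root


blocks = [bytes([i]) * 64 for i in range(16)]      # 16 cache blocks of 64 B
levels = build_tree(blocks)
root = levels[-1][0]                               # the only value kept in the processor
```

Changing either a data block or any stored hash on its path changes the recomputed root, so the comparison against the trusted root fails.
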
Single-bit Error Model
§ A single-bit error in a cache block (64 B)
§ The error is detected by comparing the computed hash value with the stored hash value on-chip
[Figure: DIMM 1 to DIMM 4, each with DRAM 1 to DRAM 16, supplying the 1st read-block (256 bits) and the 2nd read-block (256 bits).]
§ 64 B cache block, 256 bits per read-block (2 read-blocks required to fill 1 cache block)

Single-bit Error Correction
§ Correction as a search problem
• Flip one bit at a time over all possible positions, and check if the new value passes the integrity verification
[Figure: DIMM 1 to DIMM 4, each with DRAM 1 to DRAM 16; bits of the 1st and 2nd read-blocks (256 bits each) are flipped one at a time until the block is marked "Corrected!".]
§ 64 B cache block, 256 bits per read-block (2 reads required to fill 1 cache block)

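The search can be sketched as a loop over all 512 bit positions. This is a minimal illustration, assuming a truncated HMAC-SHA256 as a stand-in for the MAC; `KEY`, `mac`, and `correct_single_bit` are hypothetical names.

```python
import hashlib
import hmac

KEY = b"hypothetical-on-chip-key"   # assumption: secret held inside the processor


def mac(block: bytes) -> bytes:
    return hmac.new(KEY, block, hashlib.sha256).digest()[:8]


def correct_single_bit(block: bytes, stored_mac: bytes):
    """Return (corrected_block, attempts), or (None, attempts) if no single flip verifies."""
    candidate = bytearray(block)
    for attempts, bit in enumerate(range(len(block) * 8), start=1):
        byte, mask = bit // 8, 1 << (bit % 8)
        candidate[byte] ^= mask                    # flip one bit
        if hmac.compare_digest(mac(bytes(candidate)), stored_mac):
            return bytes(candidate), attempts
        candidate[byte] ^= mask                    # undo and try the next bit
    return None, len(block) * 8


good = bytes(range(64))                            # 64 B cache block
stored = mac(good)
bad = bytearray(good)
bad[17] ^= 0x08                                    # inject a single-bit error
fixed, attempts = correct_single_bit(bytes(bad), stored)
```

The worst case is one MAC computation per bit, which is 512 attempts for a 64 B cache block.
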
Multi-bit Error Model
§ Any bits in one DRAM chip can fail in each read-block
• Similar to chip-kill correct
[Figure: DIMM 1 to DIMM 4, each with DRAM 1 to DRAM 16; one failing chip in the 1st read-block (256 bits) and one in the 2nd read-block (256 bits).]
§ 64 B cache block, 256 bits per read-block (2 reads required to fill 1 cache block)

IVEC Error Correction with Parity
§ Each parity bit covers one bit from every DRAM chip in a read-block
• x4 DRAM: 4 parity bits per read-block
[Figure: DIMM 1 to DIMM 4, each with DRAM 1 to DRAM 16; parity bits P1 to P8 computed across the chips of the 1st and 2nd read-blocks (256 bits each).]
§ 64 B cache block, 256 bits per read-block (2 reads required to fill 1 cache block), 8 parity bits

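With x4 chips, the parity of a read-block is the XOR of the 4-bit values supplied by its 64 chips. The sketch below assumes a bit layout that the slides do not specify: chip c contributes bits 4c to 4c+3 of its read-block, and the block is read as a little-endian integer. The function names are illustrative.

```python
CHIPS_PER_READ_BLOCK = 64   # 256-bit read-block built from x4 chips
CHIP_WIDTH = 4
READ_BLOCK_BITS = 256


def read_block_parity(read_block: int) -> int:
    """Parity bit j is the XOR of bit j from every chip in the read-block."""
    p = 0
    for chip in range(CHIPS_PER_READ_BLOCK):
        p ^= (read_block >> (CHIP_WIDTH * chip)) & ((1 << CHIP_WIDTH) - 1)
    return p


def cache_block_parity(block: bytes) -> tuple:
    """Return (parity of 1st read-block, parity of 2nd read-block): 8 bits in total."""
    value = int.from_bytes(block, "little")
    mask = (1 << READ_BLOCK_BITS) - 1
    return tuple(read_block_parity((value >> (READ_BLOCK_BITS * r)) & mask)
                 for r in range(2))


base = bytes(range(64))
corrupted = bytearray(base)
corrupted[5] ^= 0x60        # one chip in the 1st read-block flips bit pattern 0b0110
```

Because only one chip fails per read-block, the XOR of stored and recomputed parity equals the flipped bit pattern, which leaves only the chip position unknown.
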
IVEC Correction with Parity
§ Use parity bits to guide the correction search
• The correction scheme can be extended with more or fewer parity bits
• For hard faults, start searching from recent error locations
[Figure: DIMM 1 to DIMM 4, each with DRAM 1 to DRAM 16; the parity bits (P1 to P8) of the 1st and 2nd read-blocks (256 bits each) guide the bit flips until the block is marked "Corrected!".]
§ 64 B cache block, 256 bits per read-block (2 reads required to fill 1 cache block), 8 parity bits

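A parity-guided search for x4 chips with 8 parity bits might look like the sketch below. It is not the authors' implementation: the bit layout (chip c holds bits 4c to 4c+3 of its read-block), the truncated HMAC-SHA256 stand-in for the MAC, the assumption that the stored parity itself is error-free, and all names are illustrative.

```python
import hashlib
import hmac
from itertools import product

KEY = b"hypothetical-on-chip-key"        # assumption: secret held inside the processor
CHIPS, WIDTH, READ_BLOCKS = 64, 4, 2     # 64 x4 chips per 256-bit read-block


def mac(block: int) -> bytes:
    return hmac.new(KEY, block.to_bytes(64, "little"), hashlib.sha256).digest()[:8]


def parity(block: int, r: int) -> int:
    """XOR of the 4-bit values from all chips in read-block r."""
    p = 0
    for c in range(CHIPS):
        p ^= (block >> (256 * r + WIDTH * c)) & 0xF
    return p


def correct(block: int, stored_parity, stored_mac: bytes):
    """Apply each read-block's parity syndrome to every candidate chip."""
    syndromes = [parity(block, r) ^ stored_parity[r] for r in range(READ_BLOCKS)]
    # A zero syndrome means that read-block needs no flip under this error model.
    choices = [range(CHIPS) if s else [None] for s in syndromes]
    attempts = 0
    for chips in product(*choices):
        candidate = block
        for r, c in enumerate(chips):
            if c is not None:
                candidate ^= syndromes[r] << (256 * r + WIDTH * c)
        attempts += 1
        if hmac.compare_digest(mac(candidate), stored_mac):
            return candidate, attempts
    return None, attempts


good = int.from_bytes(bytes(range(64)), "little")
stored_parity = [parity(good, 0), parity(good, 1)]
stored_mac = mac(good)
# One failing chip in each read-block: chip 17 in the 1st, chip 40 in the 2nd.
bad = good ^ (0b1011 << (WIDTH * 17)) ^ (0b0110 << (256 + WIDTH * 40))
fixed, attempts = correct(bad, stored_parity, stored_mac)
```

The syndrome fixes which bits flipped inside the failing chip, so only the chip position is searched: at most 64 x 64 = 4096 attempts when both read-blocks are affected.
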
Parity Handling
§ Parity bits are stored in regular memory space
§ Parity bits are not needed for reads unless there is an error
• They are only updated on write-back operations
• Decoupled error detection and correction
§ A parity cache can be used to load and store parity bits when necessary

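The decoupling of detection and correction can be illustrated with a toy memory model. This is a simplified sketch, not the hardware design: it keeps a single 4-bit parity per cache block instead of per read-block, uses a truncated HMAC-SHA256 as a stand-in MAC, and every name in it is hypothetical.

```python
import hashlib
import hmac

KEY = b"hypothetical-on-chip-key"   # assumption: secret held inside the processor


def mac(block: bytes) -> bytes:
    return hmac.new(KEY, block, hashlib.sha256).digest()[:8]


def parity(block: bytes) -> int:
    """Simplified parity: XOR of all 4-bit nibbles in the block."""
    p = 0
    for byte in block:
        p ^= (byte >> 4) ^ (byte & 0xF)
    return p


def correct(block: bytes, stored_parity: int, stored_mac: bytes):
    """Try the parity syndrome on each nibble until the MAC verifies."""
    syndrome = parity(block) ^ stored_parity
    value = int.from_bytes(block, "little")
    for nibble in range(2 * len(block)):
        candidate = (value ^ (syndrome << (4 * nibble))).to_bytes(len(block), "little")
        if hmac.compare_digest(mac(candidate), stored_mac):
            return candidate
    return None


class ProtectedMemory:
    def __init__(self):
        self.data, self.macs, self.parities = {}, {}, {}
        self.parity_loads = 0

    def write_back(self, addr: int, block: bytes) -> None:
        self.data[addr] = block
        self.macs[addr] = mac(block)
        self.parities[addr] = parity(block)        # parity is updated only here

    def read(self, addr: int) -> bytes:
        block = self.data[addr]
        if hmac.compare_digest(mac(block), self.macs[addr]):
            return block                           # detection only; parity untouched
        self.parity_loads += 1                     # parity is fetched only on an error
        fixed = correct(block, self.parities[addr], self.macs[addr])
        if fixed is None:
            raise RuntimeError("integrity violation: uncorrectable block")
        return fixed
```

Error-free reads never touch the parity, so `parity_loads` stays at zero until a MAC check fails.
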
IVEC Hardware Implementation
[Figure: block diagram between memory and the L2 cache. Blocks: Counter Cache, AES, MACQ, LDQ, GF Multiply, Check, IV Queue, Data Queue, Parity Cache, Correction Buffer, IVEC Control. Signals: to memory, from memory, parent MAC from cache, to L2, result to control.]
§ Blue: new blocks for IVEC
§ Yellow: blocks that already exist in a system with IV

Error Detection
§ IV detects any error pattern unless there is a hash/MAC collision
§ Error detection probability depends on the length of the hash/MAC
• ↑ hash/MAC length, ↓ collision rate
• For example, a 64-bit MAC has a 1/2^64 collision rate

Error Correction
§ Mis-correction happens if there is a hash/MAC collision on a correction attempt
• Every time a hash is recomputed for a possible correction (correction attempt), there is a chance of a collision
• ↑ number of correction attempts, ↑ mis-correction rate
§ Security is weakened by correction attempts
• An integrity violation is not detected on a mis-correction
• ↑ number of correction attempts, ↓ security
§ Correction latency
• GMAC: 4-8 cycles per correction attempt

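The trade-off can be put into numbers with a union bound: N correction attempts against a b-bit MAC give a mis-correction probability of at most N/2^b, which costs roughly log2(N) bits of security. The helpers below are an illustrative sketch of that approximation; the names and the default of 8 cycles per attempt (the upper end of the GMAC range above) are assumptions.

```python
import math


def miscorrection_bound(attempts: int, mac_bits: int = 64) -> float:
    """Union bound on the chance that some wrong candidate passes the MAC check."""
    return attempts / 2 ** mac_bits


def effective_security_bits(attempts: int, mac_bits: int = 64) -> float:
    """Approximate security level left after allowing `attempts` tries."""
    return mac_bits - math.log2(attempts)


def worst_case_latency(attempts: int, cycles_per_attempt: int = 8) -> int:
    """Cycles spent if every attempt is tried sequentially."""
    return attempts * cycles_per_attempt
```

For example, 4096 attempts leave about 52 bits of security and take at most 32768 cycles at 8 cycles per attempt.
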
Worst-Case Numbers
§ Maximum number of correction attempts

Parity     Single-bit Error   Multi-bit Error
                              x4 DRAM Chip   x8 DRAM Chip   x16 DRAM Chip
None       512                2^20           2^26           2^40
4 bits     128                2^16           2^22           2^36
8 bits     64                 4096           2^18           2^32
16 bits    32                 1024           1024           2^24
32 bits    16                 256            256            256

§ Security is reduced by ~8 bits (64 bits -> 56 bits); max correction latency: 4096 cycles
§ Security is reduced by ~12 bits (64 bits -> 52 bits); max correction latency: 32768 cycles
§ 512-bit cache block, 256-bit read-block

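The worst-case attempt counts can be approximated with a simple counting model. The rule below is inferred from the parity layout described earlier and is not stated in the slides: parity either pins down bits of the failing chip or, once it exceeds the chip width, splits the chips into smaller groups. Function names are illustrative.

```python
CACHE_BLOCK_BITS = 512
READ_BLOCK_BITS = 256


def single_bit_attempts(parity_bits: int) -> int:
    """Candidate bit positions left after parity narrows the search."""
    return CACHE_BLOCK_BITS // max(parity_bits, 1)


def multi_bit_attempts(parity_bits: int, chip_width: int) -> int:
    """One failed chip per read-block, two read-blocks per cache block."""
    p = parity_bits // 2                      # parity bits per read-block
    chips = READ_BLOCK_BITS // chip_width
    if p <= chip_width:
        # Parity pins down p of the chip's bits; the remaining bits are guessed.
        per_read_block = chips * 2 ** (chip_width - p)
    else:
        # Extra parity splits the chips into groups, shrinking the chip search.
        per_read_block = chips // (p // chip_width)
    return per_read_block ** 2
```

For instance, `multi_bit_attempts(8, 4)` gives 64 x 64 = 4096 for x4 chips with 8 parity bits.
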
Memory Space Overhead
[Figure: bar chart of memory space overhead (0% to 35%) for ECC, IV, IV+ECC, IVEC-NP, IVEC-P4, IVEC-P8, IVEC-P16, and IVEC-P32.]
§ ECC: 64 parity bits per cache block (512 bits)
§ IV: 64-bit MAC per cache block (512 bits) in a MAC tree structure plus meta-data

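The main storage terms can be estimated with back-of-the-envelope arithmetic. This sketch assumes an 8-ary MAC tree (eight 64-bit MACs per 512-bit node) and leaves out the meta-data, which the slide does not quantify, so it does not reproduce the chart values.

```python
CACHE_BLOCK_BITS = 512
MAC_BITS = 64

# ECC: 64 parity bits per 512-bit cache block.
ecc_overhead = 64 / CACHE_BLOCK_BITS                   # 0.125

# MAC tree: level i of the tree costs (1/8)^i of the data size.
ratio = MAC_BITS / CACHE_BLOCK_BITS
mac_tree_overhead = ratio / (1 - ratio)                # geometric series, 1/7


def ivec_overhead(parity_bits: int) -> float:
    """MAC tree plus `parity_bits` of parity per cache block (meta-data excluded)."""
    return mac_tree_overhead + parity_bits / CACHE_BLOCK_BITS
```

Under these assumptions, 8 parity bits per cache block add only 8/512, about 1.6%, on top of the MAC tree.
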
Performance Evaluation
§ Run-time overheads
• Error correction latency: negligible with a typical soft error rate (SER)
• Performance overhead due to off-chip bandwidth usage from updating parity bits
§ Tools
• Pin instrumentation tool and TAXI performance simulator
§ Parameters
• Core 2-like single processor: 4-issue OoO core
§ Baseline is chosen to have IV implemented
• 64-bit GMAC tree with split counter mode (< 5% overhead)

Memory Bandwidth Overhead
[Figure: bar chart of memory bandwidth overhead (0% to 10%) for P4, P8, P16, and P32 across bzip2, equake, gap, gzip, mcf, mesa, twolf, and geomean; labeled values: 9% and 3.2%.]
§ Traditional ECC bandwidth overhead is 12.5%
§ IVEC memory bandwidth overhead is <= 9% in the worst case
§ Performance overhead is negligible (0.5% in the worst case)

Related Work
§ Memory integrity verification
§ Off-chip DRAM ECC
• SEC-DED ECC
• Chip-kill correct
§ Tiered ECC
§ Reliability and Security Engine (RSE)

Conclusion
§ IVEC enables efficient protection of off-chip memory from both security attacks and random errors
• Can handle both single-bit errors and multi-bit errors
• Minimal impact on security
§ IVEC is able to eliminate the use of traditional ECC for off-chip memory when a system requires IV for security