Content Addressable Memories Cell Design and Peripheral Circuits
CAM: Introduction
- CAM vs. RAM
  [Figure: a RAM takes an address (Address In) and returns the word stored there (Data Out); a CAM takes a search word (Data In), compares it with every stored word, and returns the location of the match (Address Out).]
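
The contrast between the two lookups can be sketched in a few lines of Python. This is only a behavioural model: the stored words are made-up values, the function names are hypothetical, and the loop stands in for a comparison that the hardware performs on all rows at once.

```python
from typing import List, Optional


def ram_read(memory: List[str], address: int) -> str:
    """RAM: address in, data out."""
    return memory[address]


def cam_search(memory: List[str], key: str) -> Optional[int]:
    """CAM: data in, address out (lowest matching address, or None)."""
    # A real CAM compares every row in parallel; the loop is only a model.
    for address, word in enumerate(memory):
        if word == key:
            return address
    return None


# Hypothetical table contents, chosen only for illustration.
words = ["01010110", "11001101", "10110001", "11100011"]

print(ram_read(words, 2))            # word stored at address 2
print(cam_search(words, "10110001")) # address that holds this word
```
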
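CAM: Introduction
- Binary CAM cell
  - ML is pre-charged to VDD
  - Match: ML remains at VDD
  - Mismatch: ML discharges
  [Figure: binary CAM cell schematic with transistors N1-N8, P1 and P2, storage nodes BL1_cell and BL1c_cell, bit lines BL1 and BL1c, word line WL, search lines SL1 and SL1c, and match line ML.]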
CAM: Introduction
- Ternary CAM (TCAM)
  [Figure: search examples in which the input keyword and the stored words may contain don't-care (X) bits; more than one stored word can report a match for the same keyword.]
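
A ternary comparison can be sketched as follows. The table contents and function names are hypothetical; the only behaviour taken from the slide is that an X on either side (stored word or keyword) matches anything and that several words may match at once.

```python
from typing import List


def ternary_match(stored: str, key: str) -> bool:
    """True if every bit position matches, treating 'X' as don't-care."""
    if len(stored) != len(key):
        raise ValueError("stored word and key must have the same width")
    return all(s == "X" or k == "X" or s == k for s, k in zip(stored, key))


def tcam_search(table: List[str], key: str) -> List[int]:
    """Return the addresses of all matching words (modelled sequentially)."""
    return [addr for addr, word in enumerate(table) if ternary_match(word, key)]


# Hypothetical table: X in a stored word models local masking.
table = ["1011XXXX", "110011XX", "10110001", "1110XXXX"]

print(tcam_search(table, "10110001"))  # fully specified keyword
print(tcam_search(table, "10110XXX"))  # X in the keyword models global masking
```
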
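CAM: Introduction
- TCAM cell
  - Global masking: SLs
  - Local masking: BLs
  [Table: encoding of the stored value (logic 0, 1, X) by the bit pair BL1, BL2; one bit combination is not allowed (N.A.).]
  [Figure: TCAM cell built from RAM cells and comparison logic, with search lines SL1 and SL2, match line ML, bit lines BL1, BL1c, BL2 and BL2c, and word line WL.]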
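CAM: Introduction
- DRAM-based TCAM cell
  - Advantage: higher bit density
  - Drawbacks: slower table update, expensive process, refreshing circuitry, scaling issues (leakage)
  [Figure: DRAM-based TCAM cell schematic with transistors N3-N8, storage nodes BL1_cell and BL2_cell, bit lines BL1 and BL2, word line WL, search lines SL1 and SL2, and match line ML.]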
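CAM: Introduction
- SRAM-based TCAM cell
  - Advantages: standard CMOS process, fast table update
  - Drawback: large area (16T)
  [Figure: SRAM-based TCAM cell schematic with storage nodes BL1c_cell and BL2c_cell, bit lines BL1, BL1c, BL2 and BL2c, word line WL, search lines SL1 and SL2, and match line ML.]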
CAM: Introduction
- Block diagram of a 256 x 144 TCAM
  [Figure: SL drivers drive the search lines SL1(0)..SL1(143) and SL2(0)..SL2(143) through the array. Each of the 256 words consists of CAM cells (0)..(143), with cell nodes BL1c and BL2c, and shares one match line, ML0..ML255. Each match line feeds an ML sense amplifier (MLSA) that produces the output MLSO(0)..MLSO(255).]
CAM: Introduction
- Why low-power TCAMs?
  - Parallel search: very high power (2 Mb Sibercore TCAM, 66 MHz, 66 Msps: 3.4 W)
  - IPv6, OC-768: larger word size and larger number of entries, hence high power
  - Embedded applications (SoC)
CAM: Introduction
- Why high-performance TCAMs?
  - OC-768: 135 M packets/s (7.4 ns/packet)
  - Application complexity: multiple searches
  - IPv6: larger word size, hence larger search time
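
The 7.4 ns figure is simply the reciprocal of the packet rate, and several searches per packet divide that budget further. A small sketch, in which the function names and the choice of four searches per packet are assumptions made only for illustration:

```python
def time_per_packet_ns(packets_per_second: float) -> float:
    """Time available per packet, in nanoseconds."""
    return 1e9 / packets_per_second


def time_per_search_ns(packets_per_second: float, searches_per_packet: int) -> float:
    """Time available per search if each packet needs several searches."""
    return time_per_packet_ns(packets_per_second) / searches_per_packet


oc768_rate = 135e6  # packets per second, from the slide

print(round(time_per_packet_ns(oc768_rate), 2))     # budget per packet
print(round(time_per_search_ns(oc768_rate, 4), 2))  # 4 searches/packet is hypothetical
```
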
CAM: Design Techniques
- Cell Design: 12T static TCAM cell*
  - '0' is retained by leakage (VWL ~ 200 mV)
  - Advantage: high density
  - Drawbacks: leakage (3 orders of magnitude), noise margin, soft errors (node S), unsuitable for READ
  * I. Arsovski, T. Chandler, A. Sheikholeslami, IEEE JSSC, vol. 38, no. 1, pp. 155-158, Jan. 2003
CAM: Design Techniques
- Cell Design: NAND vs. NOR type CAM
  - NAND-type CAM
    - Advantage: low power
    - Drawbacks: charge sharing, slow
  [Figure: NOR-type CAM word, with CAM cells (0)..(N) connected in parallel to ML_NOR, and NAND-type CAM word, with CAM cells connected in series along ML_NAND; each match line ends in a sense amplifier (SA).]
CAM: Design Techniques
- MLSA Design: Conventional
  - Pre-charge ML to VDD
  - Match: VML = VDD
  - Mismatch: VML = 0
  [Figure: conventional MLSA with pre-charge signal PRE, match line ML and output MLSO.]
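
The three steps above can be sketched as a purely behavioural model. The VDD value of 1.8 V and the VDD/2 decision threshold are assumptions for the sketch, not values given on this slide, and the function names are hypothetical.

```python
def conventional_ml_voltage(stored: str, key: str, vdd: float = 1.8) -> float:
    """Final ML voltage after evaluation in a pre-charge-high scheme.

    The ML starts at VDD; any mismatching bit opens a pull-down path
    and discharges it to 0 V. Timing is not modelled.
    """
    mismatches = sum(1 for s, k in zip(stored, key) if s != k)
    return vdd if mismatches == 0 else 0.0


def sense(v_ml: float, vdd: float = 1.8) -> int:
    """Idealised sense amplifier: 1 for match, 0 for mismatch (VDD/2 threshold assumed)."""
    return 1 if v_ml > vdd / 2 else 0


print(sense(conventional_ml_voltage("1011", "1011")))  # match
print(sense(conventional_ml_voltage("1011", "1001")))  # single-bit mismatch
```
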
CAM: Design Techniques
- MLSA Design: Current Race Sensing*
  [Figure: current-race MLSA schematic. Signals shown: RST, RSTc, MLOFF, ML, MLSO and MATCH, plus a dummy ML and a delay element on the MLOFF path.]
  * I. Arsovski, T. Chandler, A. Sheikholeslami, IEEE JSSC, vol. 38, no. 1, pp. 155-158, Jan. 2003
CAM: Design Techniques
- MLSA Design: Current Race Sensing
  - Advantage: no need to reset SLs in every clock cycle
  - Advantage: lower ML voltage swing, (Vth + ∆V) ≈ ½ VDD
  - Drawback: speed scales with the ML charging current
  [Figure: waveforms of MLSO[0], ML[0] and ML[1], with the voltage margin marked.]
CAM: Design Techniques
- MLSA Design: Charge Redistribution*
  - Fast pre-charge of ML through MREF
  - Mismatch: SP = '0', MLSO = '1'
  - IML > IREF > Ileakage
  - Advantage: ML swing limited to about (VREF – Vth)
  - Drawback: high power
  [Figure: charge-redistribution MLSA with signals RST, FAST_PRE and REF, reference transistor MREF, node SP with capacitance CSP, match line ML with capacitance CML, and output MLSO.]
  * P. Vlasenko, D. Perry, MOSAID Technologies Inc., US Patent 6717876, April 6, 2004
CAM: Design Techniques
- MLSA Design: Charge Injection*
  - Reset ML and pre-charge CINJ
  - Charge share CINJ and CML
  - Match: VML = CINJ × VDD / (CINJ + CML)
  - Mismatch: VML = 0 V
  - Advantage: small ∆VML
  - Drawbacks: poor noise margin, area penalty (CINJ)
  [Figure: charge-injection MLSA with signals CHARGE_IN, PRE and RST, an offset sense amplifier (OFFSET SA), match line ML and output MLSO.]
  * G. Kasai, Y. Takarabe, K. Furumi, and M. Yoneda, SONY Corp., Proc. IEEE CICC, pp. 387-390, Sep. 2003
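
The charge-sharing expression can be evaluated directly. In the sketch below the capacitor values (10 fF and 90 fF) and VDD = 1.8 V are hypothetical numbers picked to make the arithmetic easy to follow; only the formula comes from the slide.

```python
def charge_injection_vml(c_inj: float, c_ml: float, vdd: float, match: bool) -> float:
    """ML voltage after charge sharing between CINJ and CML.

    Match:    VML = CINJ * VDD / (CINJ + CML)
    Mismatch: VML = 0 V (the ML is pulled down)
    """
    if not match:
        return 0.0
    return c_inj * vdd / (c_inj + c_ml)


# Hypothetical values: CINJ = 10 fF, CML = 90 fF, VDD = 1.8 V.
v_match = charge_injection_vml(10e-15, 90e-15, 1.8, match=True)
v_miss = charge_injection_vml(10e-15, 90e-15, 1.8, match=False)
print(f"match: {v_match:.3f} V, mismatch: {v_miss:.3f} V")
```

With these numbers the match level is a tenth of VDD, which illustrates why the slide lists a small ∆VML together with a poor noise margin.
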
CAM: Design Techniques
- Low Power: Selective Pre-charge*
  - MLs: two segments
  - If MATCH in pre-search, then main-search
  - Number of bits in pre-search depends on data statistics
  [Figure: ML split into a PRE-SEARCH segment (MLSA1, MLSO1) and a MAIN-SEARCH segment (ML2, MLSA2, MLSO2).]
  * C. Zukowski and S. Wang, Proc. IEEE ISCAS, pp. 745-770, Jun. 9-12, 1997
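
A rough way to see why the number of pre-search bits matters is the following sketch. It rests on two loudly stated assumptions that are not from the slide: ML energy is taken as proportional to the number of bits evaluated, and every pre-search bit of a non-matching word matches independently with probability 0.5. Real data statistics will differ, which is the point the slide makes.

```python
def relative_ml_energy(total_bits: int, pre_bits: int, p_bit_match: float = 0.5) -> float:
    """Expected ML energy per word relative to a single-segment ML.

    Assumes energy is proportional to the number of bits evaluated and that
    each pre-search bit matches independently with probability p_bit_match.
    """
    pass_fraction = p_bit_match ** pre_bits  # words that go on to main-search
    evaluated_bits = pre_bits + pass_fraction * (total_bits - pre_bits)
    return evaluated_bits / total_bits


for k in (0, 4, 8, 16):
    print(k, round(relative_ml_energy(144, k), 4))
```
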
CAM: Design Techniques
- Low Power: Dual-ML TCAM*
  - MLSA1 is enabled first
  - MLSA2 is enabled if MLSO1 = '1'
  [Figure: TCAM word with two match lines. CAM cells (0)..(N), with search lines SL1 and SL2 and cell nodes BL1c and BL2c, connect to ML1 (sensed by MLSA1, output MLSO1) and to ML2 (sensed by MLSA2, output MLSO2).]
  * N. Mohan, M. Sachdev, Proc. IEEE ISCAS, pp. 633-636, May 23-26, 2004
CAM: Design Techniques
- Low Power: Dual-ML TCAM
  - Cap(ML1) = Cap(ML2) = ½ C(ML)
  - Same speed, 50% less energy (ideally)
  - Parasitic interconnects degrade both speed and energy
  - Additional ML increases coupling capacitance
CAM: Design Techniques
- Low Power: Dual-ML TCAM
  - Simulation results (144 bits)*
    - Interconnect cap. = 27 fF
    - W/L = 0.6 µm / 0.18 µm

              Old     New     Difference
    TS (ns)   8.14    8.46    4%
    E1 (fJ)   769     426     45%
    E2 (fJ)   769     973     26%

  * N. Mohan, M. Sachdev, Proc. IEEE ISCAS, pp. 633-636, May 23-26, 2004
CAM: Design Techniques
- Low Power: Dual-ML TCAM*
  - EAVG = PML1 × E1 + (1 – PML1) × E2
  - SA1 cannot detect Type I mismatches
  - For M mismatches, PML1 = 1 – (0.5)^M

    Mismatch   Type I   Type II
    SL1        0        1
    SL2        1        0
    BL1        1        0
    BL2        0        1

  * N. Mohan, M. Sachdev, Proc. IEEE ISCAS, pp. 633-636, May 23-26, 2004
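
The average-energy expression can be evaluated with the simulated energies reported for the 144-bit word (E1 = 426 fJ, E2 = 973 fJ, against 769 fJ for the single-ML design). The sketch reads E1 as the energy when ML1 alone catches the mismatch and E2 as the energy when both sense amplifiers run, which is how the formula weights them; the function names are hypothetical.

```python
def p_ml1(mismatches: int) -> float:
    """Probability that ML1 detects a mismatch: 1 - 0.5**M."""
    return 1.0 - 0.5 ** mismatches


def e_avg(mismatches: int, e1: float = 426.0, e2: float = 973.0) -> float:
    """Average search energy per word in fJ: P_ML1*E1 + (1 - P_ML1)*E2."""
    p = p_ml1(mismatches)
    return p * e1 + (1.0 - p) * e2


E_SINGLE_ML = 769.0  # fJ, conventional single-ML word from the same results

for m in (1, 2, 4, 8):
    energy = e_avg(m)
    saving = 100.0 * (E_SINGLE_ML - energy) / E_SINGLE_ML
    print(f"M={m}: E_AVG={energy:.2f} fJ, saving={saving:.1f}%")
```

Under this reading the saving grows with the number of mismatching bits, because ML1 alone is then increasingly likely to settle the search.
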
CAM: Design Techniques
- Low Power: Dual-ML TCAM*
  * N. Mohan, M. Sachdev, Proc. IEEE ISCAS, pp. 633-636, May 23-26, 2004
CAM: Design Techniques
- Low Power: Hierarchical SLs*
  - 144 bits (5 segments: 8, 34, 34)
  - SLs: multiple blocks (64 words each)
  - Advantage: ∆VGSL of 0.45 V (VDD = 1.8 V)
  - Drawbacks: logic complexity, search time/latency, 64-bit OR gates
  * Pagiamtzis et al., Proc. IEEE CICC, pp. 383-386, Sep. 2003
CAM: Design Techniques
- Static Power Reduction
  - 16T TCAM: leakage paths*
  [Figure: 16T TCAM cell schematic (transistors N1-N12 and P1-P4) annotated with the stored '0'/'1' node values and the resulting leakage paths; lines shown are SL1, SL2, ML, BL1, BL1c, BL2, BL2c and WL.]
  * N. Mohan, M. Sachdev, Proc. IEEE CCECE, pp. 711-714, May 2-5, 2004
CAM: Design Techniques
- Static Power Reduction
  - Technology scaling [1]
    - Dimensions: 30% smaller
    - Dynamic power: 50% lower
    - Leakage current: 5x higher
  - Architectural-level techniques [2, 3]
    - A small portion is enabled
  1. S. Borkar, IEEE Micro, pp. 23-29, Jul.-Aug. 1999
  2. K. Pagiamtzis, A. Sheikholeslami, Proc. IEEE CICC, pp. 383-386, Sep. 2003
  3. G. Kasai, Y. Takarabe, K. Furumi, M. Yoneda, Proc. IEEE CICC, pp. 387-390, Sep. 2003
CAM: Design Techniques
- Static Power Reduction
  - Leakage current*: dependence of the subthreshold current ISUB on VDD
  * R. X. Gu, M. I. Elmasry, IEEE JSSC, vol. 31, no. 5, pp. 707-713, May 1996
CAM: Design Techniques
- Static Power Reduction
  - Side effects of VDD reduction in TCAM cells
    - Advantage: speed, no change
    - Advantage: dynamic power, no change
    - Drawback: robustness
  - VDD affects the voltage margin (current-race sensing)
  [Figure: waveforms of MLSO[0], ML[0] and ML[1], with the voltage margin marked.]
CAM: Design Techniques
- Static Power Reduction
  - Voltage margin of a 144-bit TCAM word in 0.18 µm CMOS*
  * N. Mohan, M. Sachdev, Proc. IEEE CCECE, pp. 711-714, May 2-5, 2004
CAM: Design Techniques
- Static Power Reduction
  - Effects of technology scaling*
    - Berkeley predictive technology model (BPTM)
  * N. Mohan, M. Sachdev, Proc. IEEE CCECE, pp. 711-714, May 2-5, 2004