Intel IXP2XXX Network Processor Architecture Overview

Intel® IXP2XXX Network Processor Architecture Overview
John Morgan, Infrastructure Processor Division
September 2004

Agenda
• IXP2400 External Features
• IXP2800 External Features
• Comparison of IXP2400 and IXP2800
• IXP2XXX Resource Overviews
  – MEv2 Overview
  – QDR SRAM Overview
  – DDR Overview
  – RDRAM Overview
  – PCI Overview
  – MSF Overview
  – Miscellaneous

IXP2400 External Features
[Figure: IXP2400 external interfaces. Ingress and egress IXP2400 devices with microengine clusters; optional host CPU on PCI 64-bit / 66 MHz; IXA SW; classification accelerator and customer ASICs on the coprocessor bus; flash on the Slow Port; QDR SRAM (1.6 GB/s, 64 MB); DDR DRAM (2 GB); switch fabric port interface (Utopia 1/2/3, SPI-3 (POS-PL3), CSIX); Utopia 1/2/3 or POS-PL2/3 interface to an ATM/POS PHY or Ethernet MAC; flow control bus between egress and ingress.]
External Interfaces
• MSF interface supports UTOPIA 1/2/3, SPI-3 (POS-PL3), and CSIX.
  – Four independent, configurable 8-bit channels with the ability to aggregate channels for wider interfaces.
  – The media interface can support channelized media on RX and a 32-bit connection to the switch fabric over SPI-3 on TX (and vice versa) to support the switch fabric option.
• 2 Quad Data Rate (QDR) SRAM channels. A QDR SRAM channel can interface to coprocessors.
• 1 DDR SDRAM channel.
• PCI 64/66 host CPU interface.
• Flash and PHY management interface.
• Dedicated inter-IXP channel to communicate fabric flow control information from egress to ingress for a dual-chip solution.

IXP2400 Full-Duplex OC-48 System Implementation
[Figure: an IXF6048 framer (1 x OC-48 or 4 x OC-12) connected to an IXP2400 ingress processor and an IXP2400 egress processor, with a switch fabric gasket on the fabric side. Each processor has QDR SRAM (queues and tables), DDR SDRAM (packet memory), and a TCAM classification accelerator; a host CPU (IOP or iA) is attached.]
• Ingress processor: SAR'ing, classification, metering, policing, initial congestion management
• Egress processor: traffic shaping; flexible choices (DiffServ, TM 4.0, …)

IXP2400 Chaining
• Glueless interface between IXP2400 devices using CSIX-L1
[Figure: two chained IXP2400 processors, each with QDR SRAM (queues and tables) and DDR packet memory; link labels are 2.5 Gb/s SPI-3 and 2.5 Gb/s CSIX-L1; a control plane processor is attached over PCI 64/66.]

IXP2400
[Figure: IXP2400 block diagram. MEv2 microengines; Intel® XScale™ core with 32 KB instruction cache and 32 KB data cache; PCI (64-bit, 66 MHz); gasket; DDR DRAM interface (labeled 72); QDR SRAM 1 and QDR SRAM 2 (labeled 18 / 18, E/D Q); Rbuf (64 @ 128 B) and Tbuf (64 @ 128 B) with 32-bit SPI-3 or CSIX in each direction; hash unit (64/48/128); 16 KB scratchpad; CSRs (Fast_wr, UART, timers, GPIO, BootROM/Slow Port).]

IXP2400 Bandwidths
• 600 MHz operation, 4.8+ GOPs
• 2.5 Gb/s full-duplex media interface
  – POS-PHY
  – Utopia
  – CSIX-L1
• 2.4 GB/s DDR memory bandwidth at 300 MT/s
• 1.6 GB/s QDR memory bandwidth with 200 MHz QDRII devices

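The headline numbers on this slide follow from simple arithmetic. The sketch below is illustrative only and rests on stated assumptions: one operation per microengine per clock, eight microengines (per the resources summary), a 64-bit DDR data bus (per the DDR overview), and a QDR channel with separate read and write ports that each carry 16 data bits on both clock edges. The function names are hypothetical.

```python
def microengine_ops_per_second(num_engines: int, clock_hz: int) -> int:
    """Aggregate operations per second, assuming one operation per engine per clock."""
    return num_engines * clock_hz


def ddr_bytes_per_second(transfers_per_second: int, bus_bytes: int = 8) -> int:
    """Peak DDR bandwidth: transfers per second times bus width (assumed 64-bit data bus)."""
    return transfers_per_second * bus_bytes


def qdr_bytes_per_second(clock_hz: int, data_bytes_per_port: int = 2) -> int:
    """Peak QDR bandwidth per channel.

    Assumption: independent read and write ports (2 ports), each moving
    data on both clock edges (2 transfers per clock).
    """
    ports = 2
    transfers_per_clock = 2
    return clock_hz * ports * transfers_per_clock * data_bytes_per_port


if __name__ == "__main__":
    print(microengine_ops_per_second(8, 600_000_000))  # operations per second
    print(ddr_bytes_per_second(300_000_000))           # bytes per second
    print(qdr_bytes_per_second(200_000_000))           # bytes per second
```

Under these assumptions the three results match the 4.8 GOPs, 2.4 GB/s, and 1.6 GB/s figures quoted on the slide.
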
IXP2400 Resources Summary
• Half-duplex OC-48 / 2.5 Gb/s network processor
• (8) multi-threaded microengines
• Intel® XScale™ core
• Media / switch fabric interface
• PCI interface
• 2 QDR SRAM interface controllers
• 1 DDR SDRAM interface controller
• 8-bit asynchronous port
  – Flash and CPU bus
• Additional integrated features
  – Hardware hash unit
  – 16 KB scratchpad memory
  – Serial UART port
  – 8 general-purpose I/O pins
  – Four 32-bit timers
  – JTAG support

IXP2800 External Features
[Figure: IXP2800 external interfaces. Ingress and egress IXP2800 devices with microengine clusters; optional host CPU on PCI 64-bit / 66 MHz; IXA SW; classification accelerator and customer ASICs on the coprocessor bus; flash on the Slow Port; QDR SRAM (12.8 Gb/s x 4, 64 MB x 4 channels); RDRAM (50+ Gb/s, 2 GB total for 3 channels); SPI-4 or CSIX-L1 switch fabric port interface; ATM/POS PHY or Ethernet MAC; flow control bus between egress and ingress.]
External Interfaces
• Media interface supports both SPI-4 and CSIX
• 4 Quad Data Rate (QDR) SRAM channels
  – Each channel can interface to coprocessors
• 3 RDRAM channels
• PCI 64/66 host CPU interface
• Flash and PHY management interface
• Dedicated inter-IXP channel to communicate fabric flow control information from egress to ingress for a dual-chip solution

10 Gb/s SONET Line Card
[Figure: IXP2800 ingress and egress processors, each with QDR SRAM (queues and tables) and RDRAM (packet memory). An IXF18101 with CDR/DEMUX provides the 10 GbE / OC-192c SPI interface at 10 Gb/s toward 10 GbE WAN / PPP / ATM / OTN / SONET / SDH. A fabric interface chip (FIC) provides the CSIX fabric interface, with 15 Gb/s links and flow control between the processors. A control plane processor is attached over PCI 64/66.]
• Ingress processor: SAR'ing, classification, metering, policing, initial congestion management
• Egress processor: traffic shaping; flexible choices (DiffServ, TM 4.1, …)

IXP2800 System with SPI Gasket
[Figure: IXP2800 ingress processor with QDR SRAM (queues and tables) and RDRAM (packet memory), a control plane processor over PCI 64/66, and an SPI gasket. Link labels include 10 Gb/s Utopia 3, 10 Gb/s SPI-4, and dual CSIX.]

IXP2800 Chaining
• Glueless interface between IXP2800 devices using SPI-4.2
[Figure: chained IXP2800 processors linked by 10 Gb/s SPI-4, each with QDR SRAM (queues and tables) and RDRAM (packet memory); a control plane processor is attached over PCI 64/66.]

IXP2800
[Figure: IXP2800 block diagram. MEv2 microengines 1–16; RDRAM 1–3 (labeled "Stripe", 18 / 18 / 18); QDR SRAM 1–4 (labeled 18 per path, E/D Q); Intel® XScale™ core with 32 KB instruction cache and 32 KB data cache; PCI (64-bit, 66 MHz); gasket; Rbuf (64 @ 128 B) and Tbuf (64 @ 128 B) with 16-bit SPI-4 or CSIX in each direction; hash unit (48/64/128); 16 KB scratchpad; CSRs (Fast_wr, UART, timers, GPIO, BootROM/Slow Port).]

IXP2800 Bandwidths
• 1.4 GHz operation, 20+ GOPs
• 10 Gb/s full-duplex media interface
  – SPI-4.2
  – CSIX-L1
• 1.9 GB/s QDR SRAM memory bandwidth per channel
• 2.1 GB/s RDRAM memory bandwidth per channel

IXP2800 Resources Summary
• Half-duplex OC-192 / 10 Gb/s network processor
• (16) multi-threaded microengines
• Intel® XScale™ core
• Media / switch fabric interface
• PCI interface
• 4 QDR SRAM interface controllers
• 3 Rambus* DRAM interface controllers
• 8-bit asynchronous port
  – Flash and CPU bus
• Additional integrated features
  – Hardware hash unit for generating 48-, 64-, or 128-bit adaptive polynomial hash keys
  – 16 KB scratchpad memory
  – Serial UART port for debug
  – 8 general-purpose I/O pins
  – Four 32-bit timers
  – JTAG support

IXP2800 and IXP2400 Comparison
Feature | IXP2800 | IXP2400
Frequency | 1.4/1.0 GHz / 650 MHz | 600/400 MHz
DRAM Memory | 3 channels RDRAM 800/1066 MHz; up to 2 GB | 1 channel DDR DRAM 150 MHz; up to 2 GB
SRAM Memory | 4 channels QDR (or coprocessor) | 2 channels QDR (or coprocessor)
Media Interface | Separate 16-bit Tx & Rx configurable to SPI-4 P2 or CSIX_L1 | Separate 32-bit Tx & Rx configurable to SPI-3, UTOPIA 3 or CSIX_L1
Number of Microengines | 16 (MEv2) | 8 (MEv2)
Performance | Dual-chip full-duplex OC-192 | Dual-chip full-duplex OC-48

Microengine v2
[Figure: Microengine v2 datapath. D-Push and S-Push buses feed the 128 D Xfer In and 128 S Xfer In registers; 128 next-neighbor registers are written from the previous neighbor; GPRs; 640-word local memory addressed through LM Addr 0 and LM Addr 1 (2 per context); control store of 4K/8K instructions; A and B operand paths into a 32-bit execution datapath (add, shift, logical, multiply, find first bit); CRC unit with CRC remainder; 16-entry CAM (tags 0–15, status and LRU logic, lock 0–15, entry number); local CSRs including pseudo-random number, timers, and timestamp; ALU_Out to the next neighbor; 128 D Xfer Out and 128 S Xfer Out registers drive the D-Pull and S-Pull buses.]

Microengine v2 Features – Part 1
• Clock rates
  – IXP2400: 600/400 MHz
  – IXP2800: 1.4/1.0 GHz / 650 MHz
• Control store
  – IXP2400: 4K instruction store
  – IXP2800: 8K instruction store
• Configurable to 4 or 8 threads
  – Each thread has its own program counter, registers, signal and wakeup events
  – Generalized thread signaling (15 signals per thread)
• Local storage options
  – 256 GPRs
  – 256 transfer registers
  – 128 next-neighbor registers
  – 640 32-bit words of local memory

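Because each thread has its own registers, the thread-count setting determines how many registers each thread sees. A minimal sketch, assuming each register file listed on the slide is split evenly across the active threads; the dictionary and function names are hypothetical, and local memory is left out because the slide does not say whether it is partitioned.

```python
# Register file sizes as listed on the slide.
REGISTER_FILES = {"gpr": 256, "transfer": 256, "next_neighbor": 128}


def registers_per_thread(active_threads: int) -> dict:
    """Per-thread share of each register file, assuming an even split."""
    if active_threads not in (4, 8):
        raise ValueError("the slide lists only 4- or 8-thread configurations")
    return {name: total // active_threads for name, total in REGISTER_FILES.items()}


if __name__ == "__main__":
    for threads in (4, 8):
        print(threads, registers_per_thread(threads))
```

Under this assumption, running 4 threads instead of 8 doubles the registers available to each thread, which is the usual reason to choose the 4-thread mode.
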
Microengine v2 Features – Part 2
• CAM (Content Addressable Memory)
  – Performs a parallel lookup on 16 32-bit entries
  – Reports a 9-bit lookup result
    – 4 state bits (software controlled, no impact on hardware)
    – Hit/miss indication
    – 4-bit index of the CAM entry that hit (hit) or of the LRU entry (miss)
  – Improves usage of multiple threads on the same data
• CRC hardware
  – IXP2400: provides CRC_16, CRC_32
  – IXP2800: provides CRC_16, CRC_32, iSCSI, CRC_10 and CRC_5
  – Accelerates CRC computation for ATM AAL/SAR, ATM OAM and storage applications
• Multiply hardware
  – Supports 8 x 24, 16 x 16 and 32 x 32
  – Accelerates metering in QoS algorithms (DiffServ, MPLS)
• Pseudo-random number generation
  – Accelerates RED, WRED algorithms
• 64-bit timestamp and 16-bit profile count

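The CAM behavior described on this slide (16 entries, hit returns the matching entry, miss returns the LRU entry, 4 software-defined state bits) can be modeled in software. This is a behavioral sketch, not the hardware interface: the class and function names are hypothetical, the bit layout used by `pack_result` is an illustrative choice rather than the documented register format, and returning state 0 on a miss is an assumption.

```python
NUM_ENTRIES = 16


class CamModel:
    """Software model of a 16-entry, 32-bit CAM with LRU tracking."""

    def __init__(self):
        self.tags = [None] * NUM_ENTRIES
        self.state = [0] * NUM_ENTRIES
        self.lru_order = list(range(NUM_ENTRIES))  # front = least recently used

    def _mark_used(self, entry):
        self.lru_order.remove(entry)
        self.lru_order.append(entry)

    def lookup(self, tag):
        """Return (hit, entry, state). On a miss, entry is the LRU entry."""
        tag &= 0xFFFFFFFF
        for entry, stored in enumerate(self.tags):
            if stored == tag:
                self._mark_used(entry)
                return True, entry, self.state[entry]
        return False, self.lru_order[0], 0  # assumption: state reads as 0 on a miss

    def write(self, entry, tag, state=0):
        """Load a tag and its 4 state bits into an entry and mark it most recently used."""
        self.tags[entry] = tag & 0xFFFFFFFF
        self.state[entry] = state & 0xF
        self._mark_used(entry)


def pack_result(hit, entry, state):
    """Pack into 9 bits: state[8:5], hit[4], entry[3:0] (illustrative layout only)."""
    return ((state & 0xF) << 5) | (int(hit) << 4) | (entry & 0xF)


if __name__ == "__main__":
    cam = CamModel()
    hit, entry, _ = cam.lookup(0x1234)      # miss: entry is the LRU slot to reuse
    cam.write(entry, 0x1234, state=3)
    print(cam.lookup(0x1234))               # subsequent lookup hits the same entry
```

A typical use is caching: a thread looks up a flow or queue ID, and on a miss it loads the data and writes the tag into the LRU entry the lookup returned, so other threads working on the same data get a hit.
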
Intel® XScale™ Core Overview
• High-performance, low-power, 32-bit embedded RISC processor
• Clock rate
  – IXP2400: 600 MHz
  – IXP2800: 700/500/325 MHz
• 32 KB instruction cache
• 32 KB data cache
• 2 KB mini-data cache
• Write buffer
• Memory management unit

QDR SRAM Overview
• Controller configuration
  – IXP2400: 2 channels
  – IXP2800: 4 channels
  – Optional parity (support for x16 or x18 parts)
• Addresses up to 64 MB of SRAM per channel
• Pin design supports up to 4 SRAM loads
• Supports burst-of-2 QDR devices
• Supports byte parity (bits [8] and [17] for bytes 0 and 1)
• Parity can be enabled or disabled per channel in the SRAM_control CSR

QDR SRAM Overview
• Peak bandwidth of 1.6 GB/s per channel
  – Using 200 MHz SRAMs
• Specialized SRAM operations
  – Atomic swap, bit set, bit clear, add, subtract
  – Hardware support for ring, queue and journal operations
  – 64 Q_Array registers per channel
• Interface to QDR-compatible TCAMs and coprocessors
  – Network Processor Forum LA-1 coprocessor standard compliant
  – "Clamshell" topology enables both memory and coprocessor to share the same channel

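The ring and journal operations named on this slide can be illustrated with a small software model. This is a sketch of the general semantics only, assuming a ring is a bounded FIFO whose put fails when full and a journal is a circular log that overwrites its oldest word; the class and method names are hypothetical, and the real controller performs these operations in hardware.

```python
class SramRing:
    """Bounded FIFO of 32-bit words with an optional journal-style write."""

    def __init__(self, size_words: int):
        self.buf = [0] * size_words
        self.head = 0    # index of the next word to read
        self.tail = 0    # index of the next slot to write
        self.count = 0   # number of valid words

    def put(self, word: int) -> bool:
        """Append a word; return False (and store nothing) if the ring is full."""
        if self.count == len(self.buf):
            return False
        self.buf[self.tail] = word & 0xFFFFFFFF
        self.tail = (self.tail + 1) % len(self.buf)
        self.count += 1
        return True

    def get(self):
        """Remove and return the oldest word, or None if the ring is empty."""
        if self.count == 0:
            return None
        word = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        self.count -= 1
        return word

    def journal(self, word: int) -> None:
        """Append a word, discarding the oldest one first if the ring is full."""
        if self.count == len(self.buf):
            self.head = (self.head + 1) % len(self.buf)
            self.count -= 1
        self.put(word)


if __name__ == "__main__":
    ring = SramRing(2)
    ring.put(1)
    ring.put(2)
    print(ring.put(3))   # ring is full, so the put is rejected
    ring.journal(3)      # journal write replaces the oldest word instead
    print(ring.get(), ring.get())
```

Rings suit producer/consumer handoff between threads, where losing data is unacceptable; journals suit debug logging, where only the most recent entries matter.
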
IXP2400 DDR DRAM Overview
• One 64-bit (72-bit with ECC) SDRAM channel
• DRAM sizes of 64 Mb, 128 Mb, 256 Mb, 512 Mb, or 1 Gb
  – Maximum capacity is 2 GB (using 1 Gb parts)
  – Supports x8 or x16 devices, DIMM or direct soldered
  – Supports devices with 4 banks
  – Supports 1- or 2-sided DIMMs
  – Optional ECC
• 200/300 MT/s (100 MHz/150 MHz, respectively)
• Hardware interleaving spreads contiguous addresses across multiple banks

IXP2800 RDRAM Overview
• 3 independent Rambus* DRAM channels which operate concurrently
• 1.6 GB/s (12.8 Gb/s) per channel at 800 MHz
• Maximum total of 2 GB
  – 768 MB each if 3 channels are populated
  – 1 GB each if only 2 channels are populated
  – 2 GB if only 1 channel is populated
• Supports 64 Mb, 128 Mb, 256 Mb, 512 Mb and 1 Gb devices
• Supports RDRAMs with 1 x 16, 2 x 16 dependent, and 4 independent banks
• Optional ECC and parity support
• Interleaving implemented in hardware provides balanced access across all channels
  – Interleave size is 128 bytes

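Two figures on this slide lend themselves to a quick check: the per-channel bandwidth and the 128-byte interleave. The sketch below assumes a 16-bit (2-byte) data path per RDRAM channel and models interleaving as a plain round-robin of 128-byte blocks across channels. The round-robin mapping is a simplification chosen for illustration; the slide does not specify the actual address-to-channel function, and the function names are hypothetical.

```python
INTERLEAVE_BYTES = 128  # interleave size stated on the slide


def channel_bandwidth_bytes(transfer_rate_hz: int, data_bytes: int = 2) -> int:
    """Peak bytes per second for one channel (assumed 2-byte data path)."""
    return transfer_rate_hz * data_bytes


def channel_for(address: int, channels: int = 3) -> int:
    """Channel holding a byte address under a simple round-robin block mapping."""
    return (address // INTERLEAVE_BYTES) % channels


if __name__ == "__main__":
    bw = channel_bandwidth_bytes(800_000_000)
    print(bw, bw * 8)  # bytes per second and bits per second for one channel
    # Consecutive 128-byte blocks rotate through the channels.
    print([channel_for(block * INTERLEAVE_BYTES) for block in range(6)])
```

With this mapping, a large sequential transfer touches all channels in turn, which is the balancing effect the slide attributes to hardware interleaving.
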
PCI Interface Overview
• PCI 2.2 compliant
• PCI bus target
  – SRAM
  – DRAM
  – Control and Status Registers
• PCI bus master to other devices
• DMA channels
  – IXP2400: 3 channels
  – IXP2800: 2 channels
• Doorbell and mailbox registers
• Loads
  – 4 loads at 66 MHz
  – 8 loads at 33 MHz

IXP2400 Media Switch Fabric Interface
• Protocols
  – POS-PHY Levels 2 and 3
  – Utopia Levels 1, 2 and 3
  – CSIX-L1 for switch fabric interface
• LVTTL I/O (3.3 V)
• 32-bit receive, 32-bit transmit
• 25–133 MHz
• 8 KB receive buffer and 8 KB transmit buffer

IXP2800 Media Switch Fabric Interface
• Protocols
  – SPI-4 Phase 2 for network device
  – CSIX-L1 for switch fabric interface
• LVDS I/O (IEEE 1596.3, ANSI/TIA/EIA-644)
• 16-bit receive, 16-bit transmit
• 311–500 MHz
• 8 KB receive buffer and 8 KB transmit buffer

Miscellaneous
• UART
  – Standard RS-232, primarily for debugging
• Timer
  – Four 32-bit timers
  – Timer 4 can be used as a watchdog timer
• GPIO
  – 8 general-purpose I/O pins
  – Can be used as an interrupt source to the XScale core or as a clock to the timers
• Interrupt controller
  – Provides the ability to enable or mask interrupts from a number of chip-wide sources such as timers, PCI devices, and DRAM ECC errors
• Slow Port
  – Used for flash ROM access and 8-, 16-, or 32-bit asynchronous device access
  – Allows the XScale core to perform read/write data transfers to these slave devices

Backup
IXP2400 Target Application
• The IXP2400 will provide IXP1200 customers a performance upgrade for OC-12 applications and enable multi-Gigabit Ethernet platforms up to OC-48
• WAN edge/access aggregation
  – Includes IP service switches, multiservice switches, DSLAM, cable head end
• Wireless infrastructure
• Layer 4-7 switches
  – Includes firewall, server offload, content-based load balancing

IXP2800 Target Application
• Metropolitan Area Network (MAN) switches and routers
• Internet core access switches and routers
• Multi-service switches
• 10 Gb/s enterprise switches and routers supporting tomorrow's data centers, storage area networks (SAN)
• Content-aware server off-load/web switches
• Security/VPN solutions
• Wireless base stations
• Digital Subscriber Line Access Multiplexers (DSLAMs)

Edge Multi-Service Switch - WAN/LAN Solutions
[Figure: two IXP2400-based paths into a CSIX switch fabric (80 Gb/s to 1+ Tb/s). LAN side: Oahu quad gigabit PHY and Vallejo 4 x 1G Ethernet MAC toward a 1 Gb/s LAN backbone or server farm, attached to an IXP2400 over SPI-3 (Utopia 3 packet) PHY interface. WAN side: IXP2400 acting as OC-48c ATM SAR and traffic manager, with the Amazon IXF6048 OC-48c ATM and POS framer toward a WAN backbone (ATM, SONET) or optical ring.]

Edge Server Offload
[Figure: Oahu quad gigabit PHY and Vallejo 4 x 1G Ethernet MAC connect a 1 Gb/s LAN backbone or server farm to an IXP2400 over SPI-3 (Utopia 3 packet) PHY interface; a host CPU (IOP or iA) attaches over the PCI bus; IXP2400 devices connect over CSIX Level 1 to a CSIX switch fabric (80 Gb/s to 1+ Tb/s).]

IXP2400 Media Configurations
[Figure: several IXP2400 configurations, each with XScale core, QDR SRAM, TCAM, and DDR, showing Rx and Tx paths as 32-bit, 16-bit, and 8-bit buses running Utopia 1/2/3, SPI-3 (POS-PHY 2/3), or CSIX_L1.]
• Rx and Tx paths each have 2 separate clock domains for asynchronous traffic
• Each Rx and Tx path may be configured as a single 32-bit bus, quad 8-bit buses, dual 16-bit buses, or a combination of 8-bit and 16-bit buses

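The bus-width options listed for each Rx or Tx path amount to the ways of splitting 32 bits into 8-, 16-, and 32-bit channels. A small sketch that enumerates those splits, ignoring the order in which channels are placed; the function name is hypothetical, and the slide itself is the authority on which combinations the device supports.

```python
def channel_configs(total_bits: int = 32, widths=(32, 16, 8)):
    """List channel-width combinations (widest first) that use exactly total_bits."""
    results = []

    def extend(remaining, max_width, chosen):
        if remaining == 0:
            results.append(tuple(chosen))
            return
        for width in widths:
            # Non-increasing widths, so each combination is listed once.
            if width <= max_width and width <= remaining:
                extend(remaining - width, width, chosen + [width])

    extend(total_bits, max(widths), [])
    return results


if __name__ == "__main__":
    for config in channel_configs():
        print(" + ".join(f"{w}-bit" for w in config))
```

The four combinations produced correspond to the single 32-bit, dual 16-bit, mixed 16-bit plus two 8-bit, and quad 8-bit options named on the slide.
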
10 Port 1 Gb/s Ethernet Line Card
[Figure: IXP2800 ingress and egress processors, each with QDR SRAM (queues and tables) and RDRAM (packet memory). Ben Nevis provides the 10 x 1 GbE SPI interface at 10 Gb/s toward a 10 x 1 GbE LAN. A fabric interface chip (FIC) provides the CSIX fabric interface, with 15 Gb/s links and flow control between the processors. A control plane processor is attached over PCI 64/66.]
• Ingress processor: SAR'ing, classification, metering, policing, initial congestion management
• Egress processor: traffic shaping; flexible choices (DiffServ, TM 4.1, …)

10 Gb/s to Infiniband
[Figure: IXP2800 ingress and egress processors, each with QDR SRAM (queues and tables) and RDRAM (packet memory). Calypso, Ben Nevis, and Loch Lomond devices provide the 10 GbE, 10 x 1 Gb, and OC-192c SPI interfaces at 10 Gb/s. The CSIX fabric interface connects to an Infiniband fabric over 2.5 Gb/s links, with 15 Gb/s links and flow control between the processors. A control plane processor is attached over PCI 64/66.]

10 Gb/s Ethernet to SONET
[Figure: IXP2800 ingress and egress processors, each with QDR SRAM (queues and tables) and RDRAM (packet memory). On one side, Calypso and Ben Nevis provide the 10 GbE and 10 x 1 Gb SPI interfaces toward servers or disk farms; on the other, Loch Lomond provides the OC-192 / 4 x OC-48 SPI interface toward a metro network or WAN. Links run at 10 Gb/s with flow control between the processors. A control plane processor is attached over PCI 64/66.]

Media / Fabric Receive Logic
[Figure: MSF receive path. A media device (SPI-4.2 frames carrying packet control words, packet payload, and ATM cell payload on ports A and B) feeds the receive state machine in the Media Switch Fabric unit. The Rbuf holds 64 or 128 elements (128 B or 64 B each) with a status word per element; an Rbuf freelist and a thread freelist (bit vector) track free resources.]
Steps labeled in the figure:
• Data arrives; idle data is discarded
• Receive state machine gets a free element number
• A thread number is assigned
• Status is created
• Status is auto-pushed to the thread
• Thread moves the data
• Thread pushes its ID onto the freelist
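
The receive sequence labeled in this figure can be walked through with a small software model. This is a behavioral sketch under stated assumptions, not the hardware interface: the class, method, and field names are hypothetical, the status word is reduced to a dictionary, arrivals are simply dropped when no element or thread is free, and returning the Rbuf element to its freelist when the thread finishes is assumed rather than shown in the figure.

```python
from collections import deque


class ReceiveModel:
    """Toy model of the Rbuf receive flow: freelists, status, thread hand-off."""

    def __init__(self, num_elements=64, element_bytes=128, num_threads=8):
        self.element_bytes = element_bytes
        self.rbuf = [b""] * num_elements
        self.rbuf_freelist = deque(range(num_elements))
        self.thread_freelist = deque(range(num_threads))
        self.dram = []  # stands in for packet memory

    def data_arrives(self, payload: bytes, idle: bool = False):
        """Accept data from the media device and return a status, or None."""
        if idle:
            return None  # idle data is discarded
        if not self.rbuf_freelist or not self.thread_freelist:
            return None  # assumption: the model drops data when resources run out
        element = self.rbuf_freelist.popleft()   # get a free element number
        thread = self.thread_freelist.popleft()  # assign a thread number
        self.rbuf[element] = payload[: self.element_bytes]
        # Create the status that would be auto-pushed to the assigned thread.
        return {"element": element, "thread": thread, "length": len(self.rbuf[element])}

    def thread_service(self, status: dict) -> None:
        """The thread moves the data out, then returns the element and its own ID."""
        element = status["element"]
        self.dram.append(self.rbuf[element])
        self.rbuf[element] = b""
        self.rbuf_freelist.append(element)
        self.thread_freelist.append(status["thread"])


if __name__ == "__main__":
    model = ReceiveModel(num_elements=2, num_threads=1)
    status = model.data_arrives(b"abc")
    print(status)
    model.thread_service(status)
    print(model.dram)
```

The defaults follow the figure's 64 elements of 128 bytes; constructing the model with 128 elements of 64 bytes gives the other Rbuf arrangement shown.
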