QoS in an Ethernet world, Bill Lynch

QoS in an Ethernet world
Bill Lynch, Founder & CTO
www.procket.com

QoS
• Why is it needed? (Or is it?)
• What does it do? (Or not do?)
• Gotchas…
• Why is it hard to deploy?
CONFIDENTIAL © 2004 Procket Networks, Inc. All rights reserved.

Triple play data networks
• Video, voice, data over Ethernet
• QoS across thousands of subscribers
• SLAs and differential pricing
High-speed Ethernet edge:
• Assured QoS
• DoS prevention
Diagram labels: VOD, CONF, and data services; interface content mirroring for security requirements; Edge, Distribution, and IP or MPLS or λ Core; PE and CE routers for VPN A and VPN B; Broadband Home; Centralized Headend; Computational Particle Physicist.

Triple play data characteristics
• Voice
  • Many connections
  • Low BW/connection
  • Latency/jitter requirements
• Video
  • Few sources
  • Higher BW
  • Latency
• Data
  • Many connections
  • Unpredictable BW
  • BE generally okay
• Computational particle physicist
  • Very high peak BW & duration
  • Very few connections

Router QoS
Diagram labels: Physical Port, Physical Port.

Router QoS == which packet goes first
Only matters under congestion.
Diagram labels: Physical Port, Physical Port.
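To make "which packet goes first" concrete, here is a minimal sketch of a strict-priority output port. The class name, the two-queue layout, and the packet labels are assumptions made for illustration; the slide does not describe a specific scheduler.

```python
from collections import deque


class StrictPriorityPort:
    """Hypothetical output port with a HI and a LO queue; HI is always served first."""

    def __init__(self):
        self.queues = {"HI": deque(), "LO": deque()}

    def enqueue(self, packet, priority):
        self.queues[priority].append(packet)

    def dequeue(self):
        # The choice below only matters when more than one packet is waiting,
        # that is, when the port is congested.
        for priority in ("HI", "LO"):
            if self.queues[priority]:
                return self.queues[priority].popleft()
        return None  # port is idle


port = StrictPriorityPort()
port.enqueue("data-1", "LO")
port.enqueue("voice-1", "HI")
port.enqueue("data-2", "LO")
order = [port.dequeue() for _ in range(3)]
print(order)
```

With an empty port every packet leaves as soon as it arrives, so the priority label has no effect; it changes the departure order only once packets queue up.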

Router QoS: inherent packet jitter
• Bad: per hop!
• Worse: N simultaneous arrivals
• Worse: bigger MTU
Diagram labels: Physical Port, Physical Port.
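The jitter named on this slide can be estimated with simple arithmetic. The sketch below assumes that N MTU-sized packets arrive at the same instant for one output port and that the packet of interest is served last; the function names and the example values (8 arrivals, a 1 Gbps port, 1500-byte and 9000-byte MTUs) are illustrative assumptions.

```python
def serialization_delay_us(packet_bytes, link_bps):
    """Time to clock one packet onto the wire, in microseconds."""
    return packet_bytes * 8 * 1e6 / link_bps


def worst_case_wait_us(n_arrivals, mtu_bytes, link_bps):
    """Wait seen by the last of n_arrivals MTU-sized packets that arrive together."""
    return (n_arrivals - 1) * serialization_delay_us(mtu_bytes, link_bps)


GE_BPS = 1_000_000_000  # assumed 1 Gbps output port

wait_standard_mtu = worst_case_wait_us(8, 1500, GE_BPS)
wait_jumbo_mtu = worst_case_wait_us(8, 9000, GE_BPS)
print(wait_standard_mtu, wait_jumbo_mtu)
```

Under these assumptions the last packet waits about 84 µs with a 1500-byte MTU and about 504 µs with a 9000-byte MTU, and that wait can recur at every hop.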

Inherent jitter (per hop!)
Fundamental conclusion: QoS is more important at the edge.
The edge is also more likely to congest.
Diagram labels: FE, OC-12, GE, OC-12, OC-192.
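A rough way to see why the edge matters most is to compare how long one packet occupies each link type shown on the slide. This sketch assumes a 1500-byte packet and uses nominal line rates (not payload rates); both choices are assumptions for illustration.

```python
# Nominal line rates in bits per second.
LINK_RATE_BPS = {
    "FE": 100e6,
    "GE": 1e9,
    "OC-12": 622.08e6,
    "OC-192": 9953.28e6,
}

PACKET_BYTES = 1500  # assumed packet size


def serialization_delay_us(packet_bytes, link_bps):
    """Time to clock one packet onto the wire, in microseconds."""
    return packet_bytes * 8 * 1e6 / link_bps


delays_us = {
    name: serialization_delay_us(PACKET_BYTES, bps)
    for name, bps in LINK_RATE_BPS.items()
}
slowest = max(delays_us, key=delays_us.get)

for name, delay in sorted(delays_us.items(), key=lambda kv: -kv[1]):
    print(f"{name:7s} {delay:8.2f} us per packet")
```

Each packet queued ahead of yours costs one such delay, so the same queue depth hurts far more on a slow edge link than on a core link.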

Gotchas…
• Already no guarantees from simultaneous arrival…
  … but hope the total worst case is < 10 ms?
• And what if your router wasn’t perfect?

What is Queue Sharing?
Queue Sharing is when multiple physical or switch fabric connections must share queues.
Example: Each input linecard has two queues for each output linecard. All packets in a shared queue are treated equally.
Diagram labels: Physical Port, HI Queue, LO Queue, Physical Port.

What is Head of Line Blocking?
When an output linecard becomes congested, traffic backs up on the input linecard.
Traffic control (W/RED) must be performed at the input virtual output queue (VOQ).
Diagram labels: Physical Port, HI Queue, LO Queue, Physical Port.

What is Head of Line Blocking?
The output linecard cannot process all of the output traffic.
Because all traffic in a shared queue (VOQ) is treated equally, traffic destined for the uncongested port is affected as well.
Diagram labels: Physical Port, HI Queue, LO Queue, Physical Port.
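The effect can be reproduced with a toy tick-based model. Everything in it is an assumption made for illustration: two output ports X and Y that each send one packet per tick, port X offered twice its capacity, port Y offered exactly its capacity, unbounded queues (no drops), and strict FIFO service from each queue. It is not a model of any particular router.

```python
from collections import deque


def run(ticks, shared_queue):
    capacity = {"X": 1, "Y": 1}   # packets each output port can send per tick
    offered = ["X", "X", "Y"]     # per tick: X is oversubscribed 2:1, Y is not
    if shared_queue:
        queues = {"shared": deque()}
    else:
        queues = {"X": deque(), "Y": deque()}
    sent = {"X": 0, "Y": 0}

    for _ in range(ticks):
        for port in offered:
            queues["shared" if shared_queue else port].append(port)
        budget = dict(capacity)
        for q in queues.values():
            # FIFO service: stop at the first packet whose port has no budget left.
            # In a shared queue that packet blocks everything queued behind it.
            while q and budget[q[0]] > 0:
                port = q.popleft()
                budget[port] -= 1
                sent[port] += 1
    return sent


with_sharing = run(100, shared_queue=True)
without_sharing = run(100, shared_queue=False)
print(with_sharing, without_sharing)
```

In this model the congested port X stays at full rate either way, while the uncongested port Y falls to half rate once it has to share a queue with X.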

Queue Sharing Test Results
• Congested port (Flows C, D, E) remained at 100% throughput
• Uncongested port (Flows A, B) was penalized because of Queue Sharing

The effects of Queue Sharing
With Queue Sharing present, congestion can severely affect the performance of noncongested ports.
Congestion is caused by:
• Topology changes
• Routing instability
• Denial of service attacks
• High service demand
• Misconfiguration of systems or devices

Output Queued Architectures - PRO/8000
• Only one queuing location exists in the entire system
• 36,000 unique hardware queues
• Protected bandwidth on a queue
• Incoming packets are immediately placed into a unique output queue
Diagram labels: Physical Port, Physical Port, Centralized Shared Memory Switch Fabric, Physical Port.

Output Queued Architectures - PRO/8000
• Only one queuing location exists in the entire system
• Over 36,000 unique hardware queues
• Bandwidth is protected on a per-queue basis
• Incoming packets are immediately placed into a unique output queue
Diagram labels: Physical Port, Physical Port, Centralized Shared Memory Switch Fabric, Physical Port.

Output Queued Architectures - PRO/8000
• Traffic control (W/RED) is performed on each output queue individually
• Protected bandwidth for every single queue
Diagram labels: Physical Port, Physical Port, Centralized Shared Memory Switch Fabric, Physical Port.
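For readers unfamiliar with W/RED, the sketch below shows the textbook RED drop curve with per-class thresholds, evaluated against the average depth of a single queue. The class names, thresholds, and probabilities are hypothetical values chosen for illustration; the slides do not give the parameters or the implementation used in the PRO/8000.

```python
def red_drop_probability(avg_depth, min_th, max_th, max_p):
    """Textbook RED curve: no drops below min_th, linear ramp to max_p, drop all at max_th."""
    if avg_depth < min_th:
        return 0.0
    if avg_depth >= max_th:
        return 1.0
    return max_p * (avg_depth - min_th) / (max_th - min_th)


# Hypothetical per-class profiles: (min_th, max_th, max_p), thresholds in packets.
PROFILES = {
    "voice": (40, 60, 0.02),
    "data": (20, 40, 0.10),
}


def wred_drop_probability(traffic_class, avg_depth):
    """The 'weighted' part: each class gets its own thresholds on the same queue."""
    min_th, max_th, max_p = PROFILES[traffic_class]
    return red_drop_probability(avg_depth, min_th, max_th, max_p)


for depth in (10, 30, 45):
    print(depth, wred_drop_probability("voice", depth), wred_drop_probability("data", depth))
```

Because the curve is evaluated per queue, a deep queue on a congested port raises drop probability only for traffic in that queue, which is the property the slide emphasizes.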

PRO/8812 Test Results
• Congested port (Flows C, D, E) remained at 100% throughput
• Uncongested port (Flows A, B) remained at 100% throughput


Network QoS architectures
Network    | Predictability         | QoS
PSTN       | 50 years fixed BW      | TDM
Cable MSO  | 50 years transmit only | Provision and broadcast
Data       | Evolving               | Over-provision

QoS Deployment Issues
• Political
• Commercial
• Peers
• Easier short term solutions to problems
• Cheaper alternatives
• Equipment
• QoS is end to end
• Many queues/port
• Many shapers/port
• Fast diffserv/remarking
• Computation expense
• Applications
• Not tuned or aware
• QoS not ‘required’ for the application
• Geographical
• Operational
• Must deploy everywhere
• Must police at the edge
• Last mile technologies
• Single provider network
• Green field deployments

Summary
• Triple play requires QoS
• Services drive quality
• Most routers aren’t perfect
• Shared queues mean you can’t provision a port independently
• Political and deployment problems remain
• Some geographic areas better suited


Never underestimate the power of Moore’s Law
• SC, LCU:
  • 297 sq mm (17.26 mm x 17.26 mm), 30.5 M transistors, 47 M contacts, 50 KBytes of memory
  • 425 sq mm (20.17 mm x 21.07 mm), 137 M transistors, 188 M contacts, 950 KBytes of memory
• Striper:
  • 429 sq mm (20.17 mm x 21.29 mm), 156 M transistors, 265 M contacts, 1.2 MBytes of memory
• NPU:
  • 429 sq mm (20.17 mm x 21.29 mm), 214 M transistors, 400 M contacts, 2.6 MBytes of memory
• MCU, GA:
  • 389 sq mm (19.05 mm x 20.4 mm), 106 M transistors, 188 M contacts, 1.2 MBytes of memory
  • 225 sq mm (15.02 mm x 15.02 mm), 83 M transistors, 136 M contacts, 900 KBytes of memory

NPU – 40 G QoS lookups
VLIW systolic array
• Packet advances every cycle
• Named bypassing
• > 200 processors
• 4 ops/cycle/processor
• 12 loads every cycle
• (1 Tb memory BW)
Diagram labels: PxU, LxU, FTSRAM, PBU, pacman, QxU, IPA.

NPU
VLIW systolic array
• Normal instruction set
  • Arithmetic
  • Logical
  • Branch
  • Load
• Simple programming model
• Deterministic performance
Diagram labels: PxU, LxU, FTSRAM, IPA, PBU, pacman, QxU.

Memory Controller – Service Level Queueing
• High BW
• 16 DRAM chips
• Independent memory banks
• BW dist. across banks
• 36K queues
• Memory management
• Write-once multicast
• Preserve ordering

Basic Router Architecture Elements
Diagram labels: Linecard, Switch Fabric, Linecard.
Three classes of switch fabric architecture:
- Input Queued (IQ)
- Output Queued (OQ)
- Combined Input/Output Queued (CIOQ)

Input Queued (IQ) Fabrics
Diagram labels: Input Linecard, Switch Fabric, Output Linecard.
Input Queued switch fabrics:
• Inefficient use of memory
• Require complex scheduling

Combined Input/Output Queued (CIOQ) Fabrics
Diagram labels: Input Linecard, Switch Fabric, Output Linecard.
CIOQ switch fabrics:
• Generally with a point-to-point fabric in the middle (crossbar, multi-stage (Clos), torus)
• Require complex scheduling
• Queues shared to reduce complexity

Output Queued Fabrics
Diagram labels: Input Linecard, Switch Fabric, Output Linecard.
OQ switch fabrics:
• Require extremely high speed memory access
• Do not share queues
• Efficient multicast replication
• Protected bandwidth per queue

Terabit Centralized Shared Memory Routers
April 20, 2004
Bill Lynch, CTO
www.procket.com

Whither QoS?
April 20, 2004
Bill Lynch, CTO
www.procket.com

Concurrent Services
• Video, voice, data over Ethernet
• QoS across thousands of subscribers
• SLAs and differential pricing
High-speed Ethernet edge:
• Assured QoS
• DoS prevention
Diagram labels: VOD, CONF, and data services; interface content mirroring for security requirements; Edge; Distribution; IP, MPLS, λ; PE and CE routers for VPN A and VPN B; Broadband Home; Centralized Headend; Research, Education, Grid, Supercomputing.


(More Bill’s Slides Here)
• (As much detail on the switch fabric and chips as you are comfortable saying in a multi-vendor environment!)
• No scheduling
• 36K service level queues
• NPU for fast lookup, policing, shaping
• SW abstraction based on service performed, not provided knobs
• Many, many DRAM banks. However, ½ as many as CIOQ architectures.
• 40 G NPU for line rate
• Policing
• Remarking
• DA, AS, other lookup
• SW interface focus on service, not knobs.

(Insert Bill’s Slides Here)
• Self Introduction
• Problem Statement (Bill)
• "Layer 3 QoS at the right scale price is elusive"
  Throwing more bandwidth at lower layers only makes networking researchers commodity bandwidth brokers. That is fine for R&E, but commercially it is too expensive, so there appears to be a growing disconnect between R&E and commercial. It will be important not to slam the current L2/L1 vogue lest we upset the locals :)
• Numerous commercial implementations starting now
• Single network country
• High BW to home
• Triple play
• Assertion (Bill)
• "System Architecture greatly contributes to the proper operation of network wide QoS"
  Current system architectures are completely unfocused on network-wide QoS and focused instead on per-hop behaviors. This forces networkers to tweak 100 knobs to get the desired behavior. Why not architect the system to protect a flow through the router, so that behaviors are predictable in every circumstance?
• End to end. Any problem exacerbated by TCP.

Abilene Network Map
Source: http://abilene.internet2.edu/new/upgrade.html

Internet Growth Predictions
“117% YEARLY GROWTH THROUGH 2006”
“VIDEO WILL DRIVE TRAFFIC GROWTH OVER THE NEXT 10 YEARS”
Source: Yankee Group, April 2004

Network Reference Design
Diagram labels: Single Element Core (Cluster), Interdomain QoS, Peers, Concurrent Services Edge, Intradomain QoS.

PRO/8000™ Concurrent Services Routers
• Highest performance and density: 960 Gbps, 2 per rack
• Ultra-compact: 80 Gbps, 8 per rack

PRO/8000 Series Logical Architecture
• Fully redundant Switch Cards and Route Processors
• All components hot-swappable in service
• No single point of failure
• Strictly non-blocking
Diagram labels: Control Plane (CP), Route Processors (1+1), Forwarding Plane, Switch Cards (2+1), Line Cards, Media Adapters; numbered markers keyed as "# Procket VLSI".








Queue Sharing Test Results
• Congested port (Flows C, D, E) remained at 100% throughput
• Uncongested port (Flows A, B) was penalized because of Queue Sharing
• Traffic on adjacent ports was dropped!

Output Queued Architectures - PRO/8000
• Only one queuing location exists in the entire system
• Over 36,000 unique hardware queues
• Protected bandwidth down to DS3 granularity
• Incoming packets are immediately placed into a unique output queue
Diagram labels: Physical Port, Physical Port, Centralized Shared Memory Switch Fabric, Physical Port.


Output Queued Architectures - PRO/8000
• Traffic control (W/RED) can be performed on each output queue individually!
• Protected bandwidth for every single queue
Diagram labels: Physical Port, Physical Port, Centralized Shared Memory Switch Fabric, Physical Port.

Multicast Scaling and Performance by Design
1. One copy of packet written into memory
2. Output Line Cards read copy of packet out of memory
3. Copy packet to each outgoing interface
Diagram labels: Content, Incoming Line Card, Media Adapters, Centralized Shared Memory Switch Fabric, Outgoing Line Cards.
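The three steps above can be sketched in software as a write-once buffer with a reader count. The class, method names, and the reference-counting scheme are assumptions made for illustration; the slides do not describe how the hardware tracks readers or frees buffers.

```python
class SharedPacketMemory:
    """Hypothetical model of a centralized packet memory with write-once multicast."""

    def __init__(self):
        self.buffers = {}      # handle -> [payload, remaining_readers]
        self.writes = 0
        self.next_handle = 0

    def write_once(self, payload, fanout):
        # Step 1: the incoming line card writes a single copy.
        handle = self.next_handle
        self.next_handle += 1
        self.buffers[handle] = [payload, fanout]
        self.writes += 1
        return handle

    def read(self, handle):
        # Step 2: each outgoing line card reads the same copy.
        entry = self.buffers[handle]
        payload = entry[0]
        entry[1] -= 1
        if entry[1] == 0:
            del self.buffers[handle]   # last reader frees the buffer
        return payload


def multicast(memory, payload, output_cards):
    """output_cards maps a line card name to its list of outgoing interfaces."""
    handle = memory.write_once(payload, fanout=len(output_cards))
    delivered = []
    for card, interfaces in output_cards.items():
        copy = memory.read(handle)
        for interface in interfaces:
            # Step 3: replication to interfaces happens on the output line card.
            delivered.append((card, interface, copy))
    return delivered


memory = SharedPacketMemory()
delivered = multicast(memory, b"frame", {"lc1": ["ge0", "ge1"], "lc2": ["ge0"]})
print(memory.writes, len(delivered))
```

The point of the design is that memory write bandwidth does not grow with the number of receivers: one write serves every outgoing line card.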

State-of-the-art Networking Software
• Lightweight kernel
• Fully modular
• Memory protection
• Inherent fault isolation
• Restartable processes
• Rapid recovery from failures
• Fast messaging between processes
• Embedded self-diagnostics
• In-service upgrades
• Automatic image rollback
• Simple to extend
• Modular forwarding code
• Built-in portability

Portable, Lightweight Kernel
Portability ensures longevity and consistency of system software; a lightweight operating system maximizes system stability.
Lightweight kernel to handle scheduling and memory allocation
• Portability of PRO/1 MSE is built in
• System software features can easily be moved to new platforms
• Stripped down to essential functions to maximize stability
• No networking functions or services can crash the system
• Designed for mission critical applications

Modular versus Monolithic Alternatives
• Fully modular: BGP, OSPF, …, IGMP, PIM, …, SNMP, CLI, Intelligent Service Agent, …, Other; System Manager; Lightweight Kernel
• Semi-monolithic: BGP; CLI; Interfaces; IS-IS, OSPF; PIM, MSDP, SSM, IGMP; Kernel
• Monolithic: CLI; Interface Mgr; BGP; IS-IS, OSPF; PIM, MSDP, SSM, IGMP; Kernel

Improve Network Availability, Simplify Operations
In-service software upgrades increase network availability and simplify network operations.
1. Procket Package Manager checks compatibility of base release and SNMP package
2. While SNMP package is installed, all protocols continue to operate
3. Once installed, SNMP can be restarted using the new package
Diagram labels: New SNMP Package; Base Release (BGP, OSPF, …, IGMP, PIM, …, SNMP, CLI, …, Intelligent Service Agent, Other); Package installed and running (BGP, OSPF, …, IGMP, PIM, …, New SNMP Package, …, CLI, Intelligent Service Agent, Other).

Software Architecture
• Each protocol runs as a separate process
• Uses multiple POSIX threads for scheduling tasks
• Uses private memory for local data structures
• Uses well documented APIs to service other processes
• Uses shared memory when offering read-only API service to other processes
• Run-to-completion thread scheduling
• Table managers run as separate processes

IPC Example
• OSPF (via URIB API, IPC message)
  - Learns route
  - Adds to URIB
  - Uses mq IPC
• URIB (R/W)
  - URIB writes to memory
• IP (via URIB API, direct read)
  - Packet arrives
  - Route lookup
  - Uses direct read
Diagram labels: OSPF, URIB, IP, URIB API, IPC message, Direct read, R/W, IP routing table, Shared-Memory.
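The split between message-based writes and direct reads can be sketched in a single process. This is only an analogy: `queue.Queue` stands in for the message-queue IPC, a read-only mapping view stands in for the shared-memory routing table, and all class and function names are hypothetical rather than the actual URIB API.

```python
import ipaddress
import queue
from types import MappingProxyType


class Urib:
    """Hypothetical unicast RIB: the only writer of the routing table."""

    def __init__(self):
        self._table = {}                                  # private, read/write
        self.inbox = queue.Queue()                        # stands in for mq IPC
        self.read_only = MappingProxyType(self._table)    # stands in for shared memory

    def process_messages(self):
        # The URIB applies queued route updates to the table itself.
        while not self.inbox.empty():
            prefix, next_hop = self.inbox.get()
            self._table[ipaddress.ip_network(prefix)] = next_hop


def ospf_learn_route(urib, prefix, next_hop):
    """OSPF side: send a message instead of touching the table."""
    urib.inbox.put((prefix, next_hop))


def ip_route_lookup(table, destination):
    """IP side: longest-prefix match by reading the table directly."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]


urib = Urib()
ospf_learn_route(urib, "10.0.0.0/8", "192.0.2.1")
ospf_learn_route(urib, "10.1.0.0/16", "192.0.2.2")
urib.process_messages()
next_hop = ip_route_lookup(urib.read_only, "10.1.2.3")
print(next_hop)
```

Keeping a single writer means readers on the forwarding path never wait on a message round trip, while route producers such as OSPF cannot corrupt the table directly.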

Programmable VLSI Forwarding Engine
• Facilitates line-rate forwarding of IPv4, IPv6, and MPLS traffic
• New services with software downloads rather than hardware upgrades
  • Support for IPv6
  • TTL checking in hardware (for ACLs, GTSM …)
• Multiple priority queues for RP destined traffic
  • Capable of modifying queue priority for various types of control traffic
• Multicast (PIM-SM) support does not need special media adapters

CLI Highlights
• Familiar ‘look and feel’ reduces OpEx
• Operations mode examples
    show ip bgp summary
    show ip interface brief
    show ip ospf neighbors
    show isis database
    show ip mroute
• Configuration mode examples
    router bgp 100
    log-neighbor-changes
    neighbor 10.1.1.1 remote-as 200
    dont-capability-negotiate
    address-family ipv4 unicast
    policy remove-martians in
• Support for deferred configuration “commits”

Other salient software features
• Powerful policy specification framework
  • Intuitive syntax, similar to structured programming languages
  • Support for “chaining” actions
• Service oriented, modular QoS configuration
• Dynamic RP queue prioritization for known BGP peers
• Ability to constrain debug output using “debug-filters”
• Conservative defaults
  • Unnecessary services are disabled (only ‘ssh’ is on)
• Digitally signed software packages for verifying source and integrity of contents
• Intelligent service-agent for pro-active health monitoring

Procket PRO/Silicon Technology
• Highest performance and density: 960 Gbps, 2 per rack
• Ultra-compact: 80 Gbps, 8 per rack


Procket PRO/Silicon Technology
• World’s fastest packet processors
• First 40 Gbps network processor (2002)
• Record bandwidth density
• 6-chip family
• Most flexible platform
• Unmatched programmability enables new services
• Long lifetime
• Enhanced reliability
• Highest level of silicon integration
Figure caption: NPU, 214 million transistors.


40 Gbps NPU
• VLIW systolic array
  • 375 MHz
  • 125 Mpps
  • 2856 min ops/packet
  • 37 min loads/packet
  • 255 meters
  • 256 K GPCID
• Programmable features
  • Parsing, Lookup, PCL QoS, Accounting, IPv6
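The figures on this slide can be cross-checked against each other with a little arithmetic. The sketch below takes the clock rate, packet rate, and ops/packet from this slide, and the 4 ops/cycle/processor and 12 loads/cycle figures from the VLIW systolic array slides elsewhere in the deck. Treating ops/packet as processors × ops per cycle × cycles per packet is an assumption about how the numbers relate, not something the slides state.

```python
CLOCK_HZ = 375e6                 # from the slide
PACKET_RATE_PPS = 125e6          # from the slide
LINE_RATE_BPS = 40e9             # "40 Gbps NPU"
MIN_OPS_PER_PACKET = 2856        # from the slide
OPS_PER_CYCLE_PER_PROCESSOR = 4  # from the systolic array slide
LOADS_PER_CYCLE = 12             # from the systolic array slide

# Clock cycles available to each packet at full packet rate.
cycles_per_packet = CLOCK_HZ / PACKET_RATE_PPS

# Packet size at which 125 Mpps exactly fills a 40 Gbps link.
bytes_per_packet = LINE_RATE_BPS / PACKET_RATE_PPS / 8

# Assumed relation: ops/packet = processors * ops/cycle * cycles/packet.
implied_processors = MIN_OPS_PER_PACKET / (
    OPS_PER_CYCLE_PER_PROCESSOR * cycles_per_packet
)

# Loads available to each packet while it occupies the array's load ports.
loads_per_packet = LOADS_PER_CYCLE * cycles_per_packet

print(cycles_per_packet, bytes_per_packet, implied_processors, loads_per_packet)
```

Under that assumption the numbers line up with the "> 200 processors" claim (238 implied) and come close to the stated 37 loads per packet (36 computed); 125 Mpps at 40 Gbps corresponds to 40-byte packets.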

40 Gbps NPU
VLIW systolic array
• Packet advances every cycle
• Named bypassing
• > 200 processors
• 4 ops/cycle/processor
• 12 loads every cycle
• (1 Tb memory BW)
Diagram labels: PxU, LxU, FTSRAM, PBU, pacman, QxU, IPA.

40 Gbps NPU
VLIW systolic array
• Normal instruction set
  • Arithmetic
  • Logical
  • Branch
  • Load
• Simple programming model
• Deterministic performance
Diagram labels: PxU, LxU, FTSRAM, IPA, PBU, pacman, QxU.

Memory Controller – Service Level Queueing
• High bandwidth
• 16 DRAM chips
• Independent banks
• BW across banks
• 36K queues
• Memory management
• Write-once multicast
• Preserve ordering
Diagram labels: TQ, HQ, MLT, MQCC, COHB, SCIB.

PRO/Silicon Technology: Advanced Silicon Development
• Max density
• Max speed
• Max reliability
• Min power
• Min cost
Procket facilities provide complete control over chip design.
Diagram labels: Architecture, Logic, ASIC designer, Gates, Layout, Fab, Custom, Package, ASIC vendor.

Procket PRO/Silicon Technology
• Reliable: high integration
• Space efficient: highest density architecture
• OPEX savings: highest power efficiency
• Future proof: full programmability
