EE384x: Packet Switch Architectures, Handout 1


EE384x: Packet Switch Architectures
Handout 1: Logistics and Introduction
Professor Balaji Prabhakar, balaji@isl.stanford.edu
Professor Nick McKeown, nickm@stanford.edu

Outline
This two-course sequence is about the theory and practice of designing packet switches and Internet routers.
1. Introduction: What is a packet switch? The evolution of Internet routers, their basic architectural components, and some example architectures.
2. Part I: Output Queued Switches (Emphasis on Deterministic Analysis). OQ as the simplest and ideal architecture. Output queueing and shared-memory switches. Packet arrival processes: (σ, ρ)-constrained arrivals, leaky buckets, Bernoulli arrivals, bursty arrivals, adversaries. Providing bandwidth and delay guarantees, scheduling, fairness, Fair Queueing, Generalized Processor Sharing and Deficit Round Robin. Practical difficulties: when output queued switches are impractical. Memory bandwidth and capacity scaling. Some approaches: emulating output queued switches. Parallel packet buffers as standalone shared memory, with design examples. Routers with a single stage of buffering and constraint sets, Parallel Shared Memory Routers, Distributed Shared Memory Routers, and Parallel Packet Switches. Output link scheduling in a Distributed Shared Memory router. Combined input and output queued (CIOQ) switches, stable marriage matchings.
Winter 2004, EE384x Handout 1

Outline
3. Part II: Input Queued Switches (Emphasis on Probabilistic Analysis). What is an input-queued (IQ) switch? Definition of IQ switch with single FCFS queue. Switching fabrics, crossbars. Head of line blocking. The balls and bins model. Proof of Karol's 2 − √2 (58%) result. Virtual output queues and crossbar schedulers. Bipartite matchings: maximum size matchings, maximum weight matchings, maximal matchings. Definitions of 100% throughput. When traffic is uniform: simple RR and random matchings. When the traffic matrix is known: Birkhoff-von Neumann decomposition. When traffic is not known: heuristics. PIM, iSLIP, WFA.
4. Fundamentals (Review Sessions): Introduction to probability, Poisson process, discrete and continuous-time Markov chains. Basic queueing theory: M/M/1, M/G/1, Little’s result, PASTA.

Outline: EE384y
1. Part II: Input Queued Switches (Continued). Intro to Lyapunov functions, proof that max weight matching gives 100% throughput. Some case studies: the Tiny Tera architecture, the Cisco GSR 12000.
2. Part III: Other Switch Architectures. Buffered crossbars. Scaling crossbars and parallelism. Multistage switches: Clos networks, 2-stage switches (random and deterministic).
3. Part IV: Other Switch Functions. Address lookup: exact matches, longest prefix matches, performance metrics, hardware and software solutions. Packet classification: for firewalls, QoS, and policy-based routing; graphical description and examples of 2-D classification, examples of classifiers, theoretical and practical considerations.
4. Special topics.
5. Project presentations.

Some Logistics
Web page: http://www.stanford.edu/class/ee384x
Office hours:
• Balaji Prabhakar – TBA on web page
• Nick McKeown – Wednesday 4.00 pm – 5.00 pm
Course assistant: Theresa Wan – twan@csl.stanford.edu; Gates 351; Tel: (650) 725 9077
TAs:
• Rui Zhang-Shen – rzhang@stanford.edu
• Abtin Keshavarzian – abtink@stanford.edu
• Office hours: See class web page.
Grades: You need to sign up with “eeclass” on the EE384x web page.

More Logistics
Prerequisites
• EE284/CS244A and familiarity with probability.
• Stats 116 (or EE178, EE278) and CS161.
Useful Papers
• URLs to all the papers are on the eeclass web page.
Grading
• (40%) 5 problem sets
• (10%) Several surprise quizzes
• (20%) In-class midterm exam (Tuesday, February 17)
• (30%) Final exam (Wednesday, March 17)
SITN Students
• Same schedule as in-class students.
• Fax your assignment to us at: (702) 977 0556.
All deadlines are hard!

An Introduction
The class starts here!
Background
• What is a router?
• Why do we need faster routers?
• Why are they hard to build?
Architectures and techniques
• The evolution of router architecture.
• IP address lookup.
• Packet buffering.
• Switching.

What is Routing?
[Figure: hosts A to F attached to a network of routers R1 to R5, shown with the routing table below.]
Destination   Next Hop
D             R3
E             R3
F             R5

What is Routing?
[Figure: the same network of hosts A to F and routers R1 to R5, with the IP header of a packet addressed to D drawn in 32-bit rows, 20 bytes before any options.]
IP header fields, row by row:
• Ver, HLen, T.Service, Total Packet Length
• Fragment ID, Flags, Fragment Offset
• TTL, Protocol, Header Checksum
• Source Address
• Destination Address
• Options (if any)
• Data
Destination   Next Hop
D             R3
E             R3
F             R5

What is Routing?
[Figure: hosts A to F attached to a network of routers R1 to R5.]

Points of Presence (POPs)
[Figure: hosts A to F attached to a network of points of presence, POP1 to POP8.]

Where High Performance Routers are Used
[Figure: a backbone of routers R1 to R16 interconnected by 2.5 Gb/s links.]

What a Router Looks Like
• Cisco GSR 12416: Capacity: 160 Gb/s, Power: 4.2 kW
• Juniper M160: Capacity: 80 Gb/s, Power: 2.6 kW
[Figure: photographs of the two chassis with dimension labels 19”, 6 ft, 3 ft and 2.5 ft.]

Basic Architectural Components of an IP Router
• Control plane: routing protocols, routing table.
• Datapath (per-packet processing): forwarding table, switching.

Per-packet processing in an IP Router
1. Accept packet arriving on an incoming link.
2. Look up packet destination address in the forwarding table, to identify outgoing port(s).
3. Manipulate packet header: e.g., decrement TTL, update header checksum.
4. Send packet to the outgoing port(s).
5. Buffer packet in the queue.
6. Transmit packet onto outgoing link.
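The six steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the forwarding table contents, the dictionary-based packet, and the names `lookup` and `process` are all hypothetical, the header checksum update is not modeled, and the linear-scan lookup stands in for the much faster structures a real router needs.

```python
import ipaddress
from collections import deque

# Hypothetical forwarding table: prefix -> outgoing port (illustrative values only).
FORWARDING_TABLE = {
    ipaddress.ip_network("0.0.0.0/0"): 0,    # default route
    ipaddress.ip_network("10.0.0.0/8"): 1,
    ipaddress.ip_network("10.1.0.0/16"): 2,
}


def lookup(dst):
    """Step 2: find the outgoing port by longest-prefix match (linear scan)."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in FORWARDING_TABLE if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return FORWARDING_TABLE[best]


def process(packet, output_queues):
    """Steps 2 to 5 for one accepted packet (a dict with 'dst' and 'ttl')."""
    port = lookup(packet["dst"])
    if packet["ttl"] <= 1:
        return None                      # TTL expired: drop instead of forwarding
    packet["ttl"] -= 1                   # step 3: manipulate the header
    output_queues[port].append(packet)   # steps 4 and 5: send to the port and buffer
    return port


def transmit(output_queues, port):
    """Step 6: send the packet at the head of the queue onto the outgoing link."""
    return output_queues[port].popleft() if output_queues[port] else None


queues = {port: deque() for port in set(FORWARDING_TABLE.values())}
print(process({"dst": "10.1.2.3", "ttl": 64}, queues))
```

Each step maps onto one of the potential bottlenecks the handout goes on to discuss: `lookup` is the address lookup, the per-port deques are the packet buffers, and moving a packet to `output_queues[port]` is the switching step.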

Generic Router Architecture
[Figure: a packet (data plus header) passes through header processing (lookup IP address, update header) and is then queued. The address table maps IP address to next hop, holds ~1M prefixes and sits in off-chip DRAM. The buffer memory holds ~1M packets and also sits in off-chip DRAM.]

Generic Router Architecture
[Figure: three parallel linecards. On each, a packet (data plus header) passes through header processing (lookup IP address against the address table, update header) and then a buffer manager with its buffer memory.]

Why do we Need Faster Routers?
1. To prevent routers becoming the bottleneck in the Internet.
2. To increase POP capacity, and to reduce cost, size and power.

Why we Need Faster Routers 1: To prevent routers from being the bottleneck
[Figure: log-scale plot, 1985 to 2000, comparing packet processing power (2x / 18 months) with link speed, shown as fiber capacity in Gbit/s (2x / 7 months) across the TDM and DWDM eras. Source: SPEC95Int & David Miller, Stanford.]

Why we Need Faster Routers 2: To reduce cost, power & complexity of POPs
[Figure: a POP built with smaller routers next to a POP built with large routers.]
• Ports: Price > $100k, Power > 400 W.
• It is common for 50-60% of ports to be for interconnection.

Why are Fast Routers Difficult to Make?
1. It’s hard to keep up with Moore’s Law:
   • The bottleneck is memory speed.
   • Memory speed is not keeping up with Moore’s Law.

Why are Fast Routers Difficult to Make?
1. It’s hard to keep up with Moore’s Law:
   • The bottleneck is memory speed.
   • Memory speed is not keeping up with Moore’s Law.
[Figure: speed of commercial DRAM (1.1x / 18 months) plotted against Moore’s Law (2x / 18 months).]

Why are Fast Routers Difficult to Make?
1. It’s hard to keep up with Moore’s Law:
   • The bottleneck is memory speed.
   • Memory speed is not keeping up with Moore’s Law.
2. Moore’s Law is too slow:
   • Routers need to improve faster than Moore’s Law.

Router Performance Exceeds Moore’s Law
Growth in capacity of commercial routers:
• Capacity 1992 ~ 2 Gb/s
• Capacity 1995 ~ 10 Gb/s
• Capacity 1998 ~ 40 Gb/s
• Capacity 2001 ~ 160 Gb/s
• Capacity 2003 ~ 640 Gb/s
Average growth rate: 2x / 18 months.
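As a rough check on the growth figures above, the sketch below computes the growth factor per 18 months and the doubling time implied by the first and last data points alone. The function names are hypothetical and the result is only as good as the approximate capacities listed; with these numbers the end-to-end rate comes out slightly above 2x per 18 months.

```python
import math

# Approximate capacities from the list above, in Gb/s.
CAPACITY_GBPS = {1992: 2, 1995: 10, 1998: 40, 2001: 160, 2003: 640}


def growth_per_18_months(data):
    """Growth factor per 18 months implied by the first and last data points."""
    first, last = min(data), max(data)
    months = (last - first) * 12
    return (data[last] / data[first]) ** (18 / months)


def doubling_time_months(data):
    """Months needed to double capacity at the same end-to-end rate."""
    first, last = min(data), max(data)
    months = (last - first) * 12
    return months / math.log2(data[last] / data[first])


print(f"{growth_per_18_months(CAPACITY_GBPS):.2f}x per 18 months")
print(f"doubling every {doubling_time_months(CAPACITY_GBPS):.1f} months")
```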

Outline
Background
• What is a router?
• Why do we need faster routers?
• Why are they hard to build?
Architectures and techniques
• The evolution of router architecture.
• IP address lookup.
• Packet buffering.
• Switching.

First Generation Routers
[Figure: a shared backplane connecting a CPU, with its route table and buffer memory, to line interfaces, each with a MAC.]
Typically <0.5 Gb/s aggregate capacity

Second Generation Routers
[Figure: a CPU with route table and buffer memory, connected to line cards that each have their own buffer memory, forwarding cache and MAC.]
Typically <5 Gb/s aggregate capacity

Third Generation Routers
[Figure: a switched backplane connecting line cards (local buffer memory, forwarding table, MAC) and a CPU card (routing table).]
Typically <50 Gb/s aggregate capacity

Fourth Generation Routers/Switches
Optics inside a router for the first time
[Figure: linecards connected to a switch core by optical links 100s of metres long.]
0.3 - 10 Tb/s routers in development.


Generic Router Architecture
[Figure: three parallel linecards, each with header processing (lookup IP address against the address table, update header) followed by a buffer manager and buffer memory.]

IP Address Lookup
Why it’s thought to be hard:
1. It’s not an exact match: it’s a longest prefix match.
2. The table is large: about 150,000 entries today, and growing.
3. The lookup must be fast: about 30 ns for a 10 Gb/s line.

IP Lookups find Longest Prefixes
[Figure: the address space from 0 to 2^32 − 1 drawn as a line, with the prefixes 65.0.0.0/8, 128.9.0.0/16, 128.9.16.0/21, 128.9.172.0/21, 128.9.176.0/24 and 142.12.0.0/19 marked as ranges, and the destination address 128.9.16.14 marked on the line.]
Routing lookup: Find the longest matching prefix (aka the most specific route) among all prefixes that match the destination address.
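The lookup rule can be tried on the prefixes from the slide with Python's standard `ipaddress` module. This is a minimal sketch: the function name `longest_prefix_match` is hypothetical, and the linear scan is chosen for clarity, not speed. Note that 128.9.172.0/21 has host bits set, so the sketch passes `strict=False`, which makes the module round it down to the enclosing /21 network instead of raising an error.

```python
import ipaddress

# Prefixes as they appear on the slide.
PREFIXES = [
    "65.0.0.0/8",
    "128.9.0.0/16",
    "128.9.16.0/21",
    "128.9.172.0/21",
    "128.9.176.0/24",
    "142.12.0.0/19",
]


def longest_prefix_match(address, prefixes):
    """Return the most specific prefix containing address, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for text in prefixes:
        # strict=False tolerates prefixes written with host bits set.
        net = ipaddress.ip_network(text, strict=False)
        if addr in net and (best is None or net.prefixlen > best.prefixlen):
            best = net
    return best


print(longest_prefix_match("128.9.16.14", PREFIXES))  # 128.9.16.0/21
```

The address 128.9.16.14 matches both 128.9.0.0/16 and 128.9.16.0/21; the /21 wins because it is the longer, more specific prefix.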


Address Tables are Large
[Figure: growth of the routing table over time. Source: http://www.cidr-report.org/]


Lookups Must be Fast
Year   Line       40 B packets (Mpkt/s)
1997   622 Mb/s   1.94
1999   2.5 Gb/s   7.81
2001   10 Gb/s    31.25
2003   40 Gb/s    125
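The packet rates in the table follow directly from the line rate and the 40-byte minimum packet size, and their reciprocal is the time available per lookup. A small sketch, with hypothetical function names, that reproduces the arithmetic:

```python
def packets_per_second(line_rate_bps, packet_bytes=40):
    """Back-to-back minimum-size packets per second at a given line rate."""
    return line_rate_bps / (packet_bytes * 8)


def lookup_budget_ns(line_rate_bps, packet_bytes=40):
    """Time available per packet, in nanoseconds, if every packet needs one lookup."""
    return 1e9 / packets_per_second(line_rate_bps, packet_bytes)


# Line rates from the table above, in bits per second.
LINE_RATES = {1997: 622e6, 1999: 2.5e9, 2001: 10e9, 2003: 40e9}

for year, rate in LINE_RATES.items():
    mpps = packets_per_second(rate) / 1e6
    print(f"{year}: {mpps:.2f} Mpkt/s, {lookup_budget_ns(rate):.1f} ns per lookup")
```

At 10 Gb/s this gives 32 ns per packet, which is the "about 30 ns" quoted for IP address lookup.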


Generic Router Architecture
[Figure: two parallel linecards, each with header processing (lookup IP address against the address table, update header) followed by a buffer manager that queues packets in buffer memory.]

Fast Packet Buffers
Example: 40 Gb/s packet buffer
Size = RTT*BW = 10 Gb; 40 byte packets
[Figure: a buffer manager in front of buffer memory. Write rate R: 1 packet every 8 ns. Read rate R: 1 packet every 8 ns.]
Use SRAM?
+ Fast enough random access time, but
- too low density to store 10 Gb of data.
Use DRAM?
+ High density means we can store data, but
- too slow (50 ns random access time).
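The numbers in this example can be checked with a few lines of arithmetic. The sketch below uses a hypothetical helper, `buffer_requirements`, and assumes that writes and reads share a single memory port, so the memory must complete two random accesses in every packet time; that assumption is not stated on the slide.

```python
def buffer_requirements(line_rate_bps, buffer_bits, packet_bytes):
    """Derive timing figures for a packet buffer from its headline parameters."""
    packet_time_ns = packet_bytes * 8 / line_rate_bps * 1e9
    return {
        # Round-trip time implied by Size = RTT * BW.
        "rtt_s": buffer_bits / line_rate_bps,
        # One packet arrives (and one departs) every packet time.
        "packet_time_ns": packet_time_ns,
        # Assumption: one write plus one read per packet time on a shared port.
        "access_time_ns": packet_time_ns / 2,
    }


req = buffer_requirements(line_rate_bps=40e9, buffer_bits=10e9, packet_bytes=40)
DRAM_ACCESS_NS = 50  # random access time quoted on the slide

print(req)
print("DRAM fast enough:", DRAM_ACCESS_NS <= req["access_time_ns"])
```

Under these assumptions the buffer covers a 0.25 s round-trip time and needs an access roughly every 4 ns, more than ten times faster than the 50 ns DRAM quoted above.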


Generic Router Architecture
[Figure: N linecards, numbered 1, 2, ..., N. On each, a packet (data plus header) passes through header processing (lookup IP address against the address table, update header) and is queued in buffer memory at its output. The path into each output's buffer memory is annotated "N times line rate".]

Generic Router Architecture
[Figure: N linecards, numbered 1, 2, ..., N. On each, a packet (data plus header) passes through header processing (lookup IP address against the address table, update header) and is queued in buffer memory at its input; a scheduler decides which queued packets cross to the outputs.]

A Router with Input Queues
[Figure: performance plot with one curve annotated "The best that any queueing system can achieve."]

A Router with Input Queues
[Figure: performance plot with one curve annotated "Head of Line Blocking" and another annotated "The best that any queueing system can achieve."]

Head of Line Blocking
[Figure: illustration of head of line blocking.]
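Head of line blocking can be seen in a small simulation. The sketch below is a simplified model under stated assumptions, not the analysis covered later in the course: every input is saturated (it always has a packet at the head of its single FCFS queue), destinations are uniform and independent, and each output serves one contending input chosen at random per time slot. The function name `hol_throughput` is hypothetical.

```python
import random


def hol_throughput(n_ports, n_slots, seed=0):
    """Estimate saturation throughput per port of an FCFS input-queued switch."""
    rng = random.Random(seed)
    # hol[i] is the output wanted by the head-of-line packet at input i.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    delivered = 0
    for _ in range(n_slots):
        # Group inputs by the output their head-of-line packet wants.
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        # Each requested output serves exactly one contender; the rest stay
        # blocked, along with every packet queued behind them.
        for inputs in contenders.values():
            winner = rng.choice(inputs)
            hol[winner] = rng.randrange(n_ports)  # next packet moves to the head
            delivered += 1
    return delivered / (n_slots * n_ports)


for n in (2, 8, 32):
    print(n, round(hol_throughput(n, 5000), 3))
```

For two ports the estimate is close to 0.75, and it falls toward the 2 − √2 (about 58%) limit from the course outline as the number of ports grows.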

Virtual Output Queues
[Figure: illustration of virtual output queues.]

A Router with Virtual Output Queues
[Figure: performance plot with one curve annotated "The best that any queueing system can achieve."]

Maximum Weight Matching
[Figure: an N x N switch with arrivals A_1(n), ..., A_N(n) split into per-queue arrivals A_11(n), ..., A_NN(n), queues L_11(n), ..., L_NN(n), crossbar configuration S*(n), and departures D_1(n), ..., D_N(n). The "request" graph carries edge weights L_11(n), ..., L_N1(n); the bipartite match selected is the maximum weight match.]
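For a small switch, a maximum weight match can be found by brute force. The sketch below assumes that the weight of edge (i, j) is the occupancy L[i][j] of the virtual output queue from input i to output j, and it tries every permutation, which is only practical for small N; the function name `max_weight_matching` and the example occupancies are hypothetical.

```python
from itertools import permutations


def max_weight_matching(L):
    """Return (match, weight), where match[i] is the output paired with input i.

    L is an N x N matrix of queue occupancies. Every input is paired with a
    distinct output, and the pairing with the largest total weight wins.
    """
    n = len(L)
    best_match, best_weight = None, -1
    for match in permutations(range(n)):
        weight = sum(L[i][match[i]] for i in range(n))
        if weight > best_weight:
            best_match, best_weight = match, weight
    return best_match, best_weight


# Hypothetical occupancies for a 2 x 2 switch.
occupancy = [
    [5, 1],
    [2, 0],
]
print(max_weight_matching(occupancy))  # ((0, 1), 5)
```

Here pairing input 0 with output 0 serves the longest queue even though it leaves input 1 matched to an empty queue; the alternative pairing has total weight 1 + 2 = 3.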

Outline of Proof
[Figure: outline of the proof.]

There are now many ways to achieve 100% throughput…

The Evolution of Switching
Theory:
• Input Queueing (IQ): 58% [Karol, 1987]
• IQ + VOQ, maximum weight matching: 100% [M et al., 1995]
• Different weight functions, incomplete information, pipelining: 100% [Various]
• Randomized algorithms: 100% [Tassiulas, 1998]
• IQ + VOQ, maximal size matching, speedup of two: 100% [Dai & Prabhakar, 2000]
Practice:
• Input Queueing (IQ)
• IQ + VOQ, sub-maximal size matching, e.g. PIM, iSLIP.
• Various heuristics, distributed algorithms, and amounts of speedup

Current Internet Router Technology: Summary
• There are three potential bottlenecks:
  • Address lookup,
  • Packet buffering, and
  • Switching.
• Techniques exist today for:
  • 10+ Tb/s Internet routers, with
  • 40 Gb/s linecards.