Analysis of a Packet Switch with Memories Running Slower than the Line Rate

- Slides: 26
Analysis of a Packet Switch with Memories Running Slower than the Line Rate. Sundar Iyer, Amr Awadallah, Nick McKeown, (sundaes, aaa, nickm)@stanford.edu. Departments of Electrical Engineering & Computer Science, Stanford University. http://klamath.stanford.edu/pps
Problem Statement Motivation: To design an extremely high speed packet switch.
A Multi-Terabit OQ Switch [Figure: output queued switch with ports 1 to 64; line rate OC3072 (4 x OC768), 160 Gb/s, 1 cell per 2 ns.]
Main Problem [Figure: conventional SRAM buffer memory with write rate R and read rate R; 1 cell in 2 ns.] How to buffer cells in a memory at a rate of one operation per 1 ns?
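The 1 ns figure on the slide above follows from simple arithmetic, sketched below. The line rate and cell time are taken from the slides; the cell size is derived from them and is not stated in the deck, and counting exactly one write plus one read per cell time is an assumption about the single-port buffer.

```python
LINE_RATE_BPS = 160e9   # line rate R from the slide (160 Gb/s)
CELL_TIME_S = 2e-9      # one cell every 2 ns, from the slide

# Cell size implied by the two figures above (derived, not stated on the slide).
cell_bits = LINE_RATE_BPS * CELL_TIME_S

# Assumption: the buffer must absorb one write and one read per cell time.
OPS_PER_CELL_TIME = 2
access_time_s = CELL_TIME_S / OPS_PER_CELL_TIME

print(f"implied cell size: {cell_bits:.0f} bits")
print(f"required memory access time: {access_time_s * 1e9:.1f} ns")
```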
Problem Statement Redefined Motivation: To design an extremely high speed packet switch with memories running slower than the line rate. This talk is about the analysis of an obvious approach.
Architecture of a PPS Definition: A PPS is comprised of multiple identical lower-speed packet-switches operating independently and in parallel. An incoming stream of packets is spread, packet-by-packet, by a demultiplexor across the slower packet-switches, then recombined by a multiplexor at the output.
Architecture of a PPS [Figure: N = 4 external ports and k = 3 OQ switch layers; each demultiplexor (input rate R) connects to the layers over links of rate R/k, and the layers connect to each multiplexor (output rate R) over links of rate R/k.]
Parallel Packet Switch Questions • Can it behave like a single big output queued switch? • Can it provide delay guarantees, strict priorities, WFQ, …?
Precise Emulation of an OQ Switch [Figure: an OQ switch and a PPS, each with ports at rate R; are their departures identical (=?), yes or no?]
Emulation Scenario [Figure: N = 4 ports at rate R and layers 1 to 3 connected by internal links of rate R/3; numbered cells are spread across the layers.]
Why is there no Choice at the Input? [Figure: N = 4 ports at rate R and layers 1 to 3; numbered cells destined for output j.]
Result of no Choice [Figure: N = 4 ports at rate R and layers 1 to 3; numbered cells destined for output j.]
How does one Increase Choice? Speedup [Figure: N = 4 ports at rate R and layers 1 to 3; the internal links run at 2R/3.]
Effect of Speedup on Choice [Figure: layers 1 to 10; external rate R, internal links at 2R/k.] A speedup of S = 2, with k = 10 links.
Definition • Available Input Link Set (AIL): AIL(i, n) is the set of layers to which external input port i can start writing a cell at time slot n.
Definition • Departure Time of a Cell (n’): The departure time of a cell, n’, is the time at which it would have departed from an equivalent FIFO OQ switch.
Definition • Available Output Link Set (AOL): AOL(j, n’) is the set of layers from which output j can start reading a cell at time slot n’.
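A minimal sketch of how one input could track its AIL set, under a stated assumption: an internal link running at rate S·R/k is taken to stay busy for ceil(k/S) time slots once a cell starts on it. The class name `InputLinkTracker` and its methods are hypothetical and not part of the talk; an AOL tracker for an output would look the same with reads in place of writes.

```python
import math


class InputLinkTracker:
    """Tracks which layers one external input may start writing to."""

    def __init__(self, k, speedup):
        self.k = k
        # Assumption: a cell occupies an internal link for ceil(k/S) time slots.
        self.busy_slots = math.ceil(k / speedup)
        # free_at[l] is the first time slot at which the link to layer l is idle.
        self.free_at = [0] * k

    def ail(self, n):
        """Return AIL(i, n): layers whose link from this input is idle at slot n."""
        return {layer for layer in range(self.k) if self.free_at[layer] <= n}

    def send(self, layer, n):
        """Start writing a cell to `layer` at time slot n."""
        if layer not in self.ail(n):
            raise ValueError(f"link to layer {layer} is busy at slot {n}")
        self.free_at[layer] = n + self.busy_slots


tracker = InputLinkTracker(k=3, speedup=1.0)
tracker.send(layer=0, n=0)
print(tracker.ail(1))  # layer 0 is still busy at slot 1
```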
Main Observation • Inputs can only send to the AIL set. • Outputs can only read from the AOL set. [Figure: N = 4 ports at rate R and layers 1 to 3; the internal links run at 2R/3.]
Lower Bounds on Choice Sets Minimum size of AIL, AOL: |AIL|, |AOL| ≥ (total links) − (maximum number of links which can have cells in progress) = k − (k/S − 1)
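As a quick numeric check of the bound k − (k/S − 1), the sketch below evaluates it for the k = 10, S = 2 configuration shown on the speedup slide. The function name is hypothetical; the formula is used exactly as written on the slide, without rounding.

```python
def min_choice_set_size(k, speedup):
    """Lower bound on |AIL| and |AOL| from the slide: k - (k/S - 1)."""
    return k - (k / speedup - 1)


# k = 10 links with a speedup of S = 2, as on the speedup slide.
print(min_choice_set_size(10, 2))  # 6.0
```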
Assurance of Choice • A cell must be sent to a link which belongs to both the AIL and the AOL set.
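The rule above amounts to picking a layer from the intersection of the two sets. The sketch below shows one way to express that; the function name and the lowest-index tie-break are assumptions for illustration, since the slides do not say how a layer is chosen when several qualify.

```python
def choose_layer(ail, aol):
    """Pick a layer that is in both the AIL and the AOL set."""
    candidates = ail & aol
    if not candidates:
        # With too little speedup the two sets may not overlap.
        raise RuntimeError("no layer belongs to both AIL and AOL")
    # Hypothetical tie-break: take the lowest-numbered common layer.
    return min(candidates)


print(choose_layer({0, 1, 2}, {2, 3}))  # 2
```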
Parallel Packet Switch Results • If S > 2k/(k+2) ≈ 2 then each cell is guaranteed to find a layer that belongs to both the AIL and AOL sets. • If S > 2k/(k+2) ≈ 2 then a PPS can precisely emulate a FIFO output queued switch for all traffic patterns.
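One way to see where the threshold comes from, as a sketch rather than the authors' proof: AIL and AOL are both subsets of the k layers, so by a counting argument they must share a layer whenever |AIL| + |AOL| > k. Substituting the lower bound k − (k/S − 1) from the "Lower Bounds on Choice Sets" slide for both sets gives:

```latex
\begin{align*}
2\left(k - \left(\frac{k}{S} - 1\right)\right) &> k \\
k + 2 &> \frac{2k}{S} \\
S &> \frac{2k}{k+2}
\end{align*}
```

Since 2k/(k+2) is below 2 for every k and approaches 2 as k grows, a speedup of 2 is always enough under this argument.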
Precise Emulation of an OQ Switch [Figure: an OQ switch and a PPS, each with ports at rate R; are their departures identical (=?), yes or no?]
Parallel Packet Switch Results • If S > 3k/(k+3) ≈ 3 then a PPS can precisely emulate an OQ switch with WFQ or strict priorities for all traffic patterns.
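To give a feel for how close the two thresholds are to 2 and 3, the sketch below evaluates 2k/(k+2) and 3k/(k+3) for a few layer counts. The function names and the chosen values of k are illustrative assumptions; only the two formulas come from the slides.

```python
def fifo_speedup_threshold(k):
    """Speedup above which the slides state FIFO OQ emulation holds: 2k/(k+2)."""
    return 2 * k / (k + 2)


def priority_speedup_threshold(k):
    """Speedup above which the slides state WFQ/strict-priority emulation holds: 3k/(k+3)."""
    return 3 * k / (k + 3)


for k in (3, 10, 100):  # illustrative layer counts
    print(k, round(fifo_speedup_threshold(k), 3), round(priority_speedup_threshold(k), 3))
```

Both thresholds stay strictly below their limits (2 and 3) and approach them as k increases.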
Is this Practical? • No. • There are two reasons: 1) Maintaining the choice sets: the AIL is easy to maintain; the AOL is not. 2) Packet order is decided by the output.
A Practical Distributed Algorithm • If S > 2k/(k+2) ≈ 2 then a PPS with distributed AOL can precisely emulate a FIFO output queued switch for all traffic patterns. The PPS will have a fixed latency of Nk/S time slots.
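For a rough sense of scale, the sketch below evaluates the fixed latency Nk/S. The combination of values is hypothetical: N = 64 is read off the multi-terabit switch figure, k = 10 and S = 2 come from the speedup slide, and equating one time slot with the 2 ns cell time is an assumption.

```python
def pps_latency_slots(n_ports, k, speedup):
    """Fixed latency of the distributed scheme in time slots: N*k/S."""
    return n_ports * k / speedup


TIME_SLOT_NS = 2  # assumption: one time slot equals the 2 ns cell time

slots = pps_latency_slots(n_ports=64, k=10, speedup=2)
print(slots)                 # 320.0 time slots
print(slots * TIME_SLOT_NS)  # 640.0 ns
```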
Conclusions – It is possible to design a high speed single stage packet switch from multiple slower speed packet switches. – Such a switch can emulate an OQ switch. – There remain a couple of open questions: • Making QoS practical. • Making multicasting practical. – This is just the first step towards scalable switch fabrics.