CSE 291-a Interconnection Networks
Lecture 15: Router (cont’d)
March 5, 2007
Prof. Chung-Kuan Cheng, CSE Dept, UC San Diego, Winter 2007
Transcribed by Ling Zhang

Topics
o Router (cont’d)
  - Output States
o Router Pipelines and Stalls
o Router Datapath Components
  - Input Buffer
  - Switches
  - Output Buffer

Router
o Output virtual channel state fields:
  - G: Global state
    * I: idle
    * A: active
    * C: waiting for credit
  - I: Input VC
    * Input port and virtual channel that are forwarding flits to this output virtual channel.
  - C: Credit count
    * Number of free buffers available to hold flits from this virtual channel at the downstream node.
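The three state fields can be pictured as a small record per output virtual channel. The following is a minimal sketch, not the lecture's implementation: the class and method names are hypothetical, and the transition rules (in particular releasing the channel as soon as the tail flit is sent) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class GlobalState(Enum):
    IDLE = "I"
    ACTIVE = "A"
    WAITING_FOR_CREDIT = "C"


@dataclass
class OutputVCState:
    """One output virtual channel: fields G, I and C from the slide."""
    g: GlobalState = GlobalState.IDLE
    input_vc: Optional[Tuple[int, int]] = None  # (input port, input VC)
    credits: int = 0  # free downstream buffers

    def allocate(self, port: int, vc: int) -> None:
        # Assumed rule: only an idle output VC can be given to a new packet.
        if self.g is not GlobalState.IDLE:
            raise RuntimeError("output VC is already held by a packet")
        self.input_vc = (port, vc)
        self.g = (GlobalState.ACTIVE if self.credits > 0
                  else GlobalState.WAITING_FOR_CREDIT)

    def send_flit(self, is_tail: bool = False) -> None:
        # Each forwarded flit consumes one downstream buffer.
        if self.g is not GlobalState.ACTIVE:
            raise RuntimeError("cannot send: VC is idle or out of credits")
        self.credits -= 1
        if is_tail:
            # Assumed rule: release the VC once the tail flit has been sent.
            self.g = GlobalState.IDLE
            self.input_vc = None
        elif self.credits == 0:
            self.g = GlobalState.WAITING_FOR_CREDIT

    def receive_credit(self) -> None:
        # A returning credit means one downstream buffer was freed.
        self.credits += 1
        if self.g is GlobalState.WAITING_FOR_CREDIT:
            self.g = GlobalState.ACTIVE
```

With two credits, for example, the channel goes active on allocation, drops to waiting-for-credit after two flits, and becomes active again when a credit returns.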

Router Pipeline
o Each head flit must proceed through:
  - RC: routing computation
  - VA: virtual channel allocation
  - SA: switch allocation
  - ST: switch traversal
o For body flits and tail flits:
  - Only SA and ST are needed.

An example of router pipeline

Cycle  1   2   3   4   5   6   7
HF     RC  VA  SA  ST
B1                 SA  ST
B2                     SA  ST
TF                         SA  ST
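The stall-free timing in the example above follows a simple pattern: the head flit takes one cycle per stage, and each later flit enters SA one cycle after its predecessor. A minimal sketch of that pattern, assuming one flit per cycle and no stalls (the function name `pipeline_schedule` is hypothetical):

```python
def pipeline_schedule(num_flits: int) -> dict:
    """Return {flit name: {cycle: stage}} for a packet that never stalls."""
    if num_flits < 2:
        raise ValueError("this sketch assumes a head flit and a tail flit")
    names = ["HF"] + [f"B{i}" for i in range(1, num_flits - 1)] + ["TF"]
    # The head flit passes through all four stages in cycles 1-4.
    schedule = {"HF": {1: "RC", 2: "VA", 3: "SA", 4: "ST"}}
    # Every following flit needs only SA and ST, one cycle behind its predecessor.
    for i, name in enumerate(names[1:], start=1):
        schedule[name] = {3 + i: "SA", 4 + i: "ST"}
    return schedule


if __name__ == "__main__":
    for flit, stages in pipeline_schedule(4).items():
        print(flit, stages)
```

For a four-flit packet this reproduces the seven-cycle example: the tail flit performs SA in cycle 6 and ST in cycle 7.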

Possible stalls in router pipeline
o Packet stalls
  - VC busy: The head flit for one packet arrives before the tail flit of the previous packet has completed switch allocation.
  - Route: Routing not completed.
  - VA: VA not successful.
o Flit stalls
  - Switch busy: Switch allocation attempted but unsuccessful.
  - Buffer empty: No flit available. Input buffer is empty.
  - Credit: No credit available.

VC busy stall example
Virtual channel 0 is busy. Packet B holds it, so HF(A) stalls (x) in cycles 2 and 3 and completes VA in cycle 4.

Cycle  1   2   3   4   5   6   7
HF(A)  RC  x   x   VA  SA  ST
B1(A)                      SA  ST

The diagram also has a TF(B) row; its cell positions did not survive transcription.

Switch busy stall example
B2 fails to allocate the switch in cycle 5 (x), which also delays B3 by one cycle.

Cycle  1   2   3   4   5   6   7   8
HF     RC  VA  SA  ST
B1                 SA  ST
B2                     x   SA  ST
B3                         x   SA  ST

Buffer empty stall example
B2 arrives late and introduces a one-cycle stall (x).

Cycle  1   2   3   4   5   6   7   8
HF     RC  VA  SA  ST
B1                 SA  ST
B2                     x   SA  ST
B3                             SA  ST
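Both flit-stall examples above (switch busy and buffer empty) delay the SA stage of one flit, and every later flit of the packet is pushed back behind it. The sketch below illustrates this under simplifying assumptions: flits of a packet perform SA in order, at most one per cycle, and ST always follows SA by one cycle. The function and parameter names are hypothetical.

```python
def flit_schedule(ready_cycles, switch_busy=()):
    """Return a list of (SA cycle, ST cycle), one pair per flit.

    ready_cycles: first cycle in which each flit could attempt SA
                  (for the head flit, the cycle after VA).
    switch_busy:  cycles in which switch allocation fails for this input.
    """
    schedule = []
    prev_sa = 0
    for ready in ready_cycles:
        # A flit cannot pass the flit ahead of it in the same packet.
        sa = max(ready, prev_sa + 1)
        # Switch busy stall: retry in the next cycle.
        while sa in switch_busy:
            sa += 1
        schedule.append((sa, sa + 1))
        prev_sa = sa
    return schedule


if __name__ == "__main__":
    # Switch busy: all flits are ready on time, but SA fails in cycle 5.
    print(flit_schedule([3, 4, 5, 6], switch_busy={5}))
    # Buffer empty: B2 arrives one cycle late (ready in cycle 6, not 5).
    print(flit_schedule([3, 4, 6, 7]))
```

In both cases the packet finishes switch traversal in cycle 8 instead of cycle 7, which is the one-cycle penalty described in the examples.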

Credit stall example
[Timing diagram, cycles 1-18. Rows: credit count, HF, B1-B5, credit of HF (C of HF), credit of B1 (C of B1). Flit stages: SA, ST, W1, W2, RC, VA. Credit stages: CT, W1, W2, CU. An x marks a stalled cycle. The credit count row counts down 4, 3, 2, 1, 0.]

Credit stall example
o W1 and W2 are the two cycles of flight time on the wire between two routers.
o A buffer is allocated to the head flit when it is in the upstream SA stage in cycle 1. This buffer cannot be reassigned to another flit until the head flit leaves the downstream SA stage, freeing the buffer, and a credit reflecting the free buffer propagates back to the credit update stage in cycle 11. Body flit 4 uses this credit to enter the SA stage in cycle 12.
o t_crt = t_f + t_c + 2 T_w + 1
  - t_f: flit pipeline delay, which is 4 cycles.
  - t_c: credit pipeline delay, which is 2 cycles.
  - T_w: one-way wire delay, which is 2 cycles.
  - The total delay is 4 + 2 + 2*2 + 1 = 11 cycles.
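The credit round-trip formula is easy to check numerically. The sketch below evaluates it for the values on the slide and, as an added illustration, estimates the resulting throughput limit: if a virtual channel has F buffers and each buffer is tied up for t_crt cycles, at most F flits can be sent every t_crt cycles. The function names are hypothetical.

```python
def credit_round_trip(t_f: int, t_c: int, t_w: int) -> int:
    """t_crt = t_f + t_c + 2*T_w + 1, all values in cycles."""
    return t_f + t_c + 2 * t_w + 1


def credit_limited_throughput(num_buffers: int, t_crt: int) -> float:
    """Upper bound on flits per cycle for one VC, capped at one flit per cycle."""
    return min(1.0, num_buffers / t_crt)


if __name__ == "__main__":
    t_crt = credit_round_trip(t_f=4, t_c=2, t_w=2)
    print(t_crt)  # 11
    # With 4 buffers, as in the example, the channel sends 4 flits per 11 cycles.
    print(credit_limited_throughput(4, t_crt))
```

Under this estimate, the virtual channel would need at least 11 buffers to avoid credit stalls entirely with these delays.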

Usage of output virtual channel
[Timing diagram, cycles 1-17. Rows: HF(A), TF(A), credit of TF (C of TF), HF(B) under the conservative approach, credit of HF (C of HF), and HF(B) under the approach with fewer stalls. Flit stages: RC, VA, SA, ST, W1, W2. Credit stages: CT, W1, W2, CU. An x marks a stalled cycle. Under the conservative approach, HF(B) stalls after RC.]

Usage of output virtual channel
o The conservative approach is to wait until the downstream flit buffer for the virtual channel is completely empty, as indicated by the arrival of the credit from the tail flit. This avoids creating a dependency between the current packet and a packet occupying the downstream buffer.
o If that dependency is acceptable, the virtual channel can be reallocated as soon as the tail flit of the previous packet completes the SA stage.
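The two reallocation policies differ only in the condition tested before an output virtual channel is handed to the next packet. The following is a sketch under stated assumptions: "completely empty" is modeled as the credit count having returned to the full buffer depth, and all names (`ReallocPolicy`, `can_reallocate`, `buffer_depth`) are hypothetical.

```python
from enum import Enum


class ReallocPolicy(Enum):
    CONSERVATIVE = "conservative"  # wait for the credit from the tail flit
    AGGRESSIVE = "aggressive"      # reallocate once the tail flit passes SA


def can_reallocate(policy: ReallocPolicy, tail_passed_sa: bool,
                   credits: int, buffer_depth: int) -> bool:
    """Decide whether an output VC may be given to a new packet."""
    if not tail_passed_sa:
        # Under either policy the previous packet must have finished SA.
        return False
    if policy is ReallocPolicy.AGGRESSIVE:
        return True
    # Conservative: every downstream buffer must have been returned.
    return credits == buffer_depth


if __name__ == "__main__":
    # Tail flit has left, but its credit has not come back yet.
    print(can_reallocate(ReallocPolicy.CONSERVATIVE, True, 3, 4))
    print(can_reallocate(ReallocPolicy.AGGRESSIVE, True, 3, 4))
```

The conservative check costs roughly one credit round trip per packet, which is the price paid for avoiding the inter-packet dependency.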

Router Datapath Components
o Input buffer
  - Central memory
    * Good buffer utilization, but long latency and low bandwidth
  - Separate buffer
    * Inefficient buffer utilization, but good latency and bandwidth
  - Separate buffer for each channel
  - Multi-port memory

Router Datapath Components
o Switch
  - Input speedup by splitting the input

Switch
o If the k inputs are split into sk switch inputs (input speedup s), the throughput of the switch is: [equation not preserved in the transcript]
o If a switch has both input and output speedup, the throughput can be larger than one: [equation not preserved in the transcript]
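To give a feel for how speedup affects throughput, here is a sketch based on a common simplified model; it is not necessarily the expression shown on the original slide. Assumptions: every switch input always has a flit, each picks one of the switch outputs uniformly at random, and each output grants one request per cycle in a single allocation pass. An output is then idle only if no input chose it. The names `switch_throughput`, `s_in` and `s_out` are hypothetical.

```python
def switch_throughput(k: int, s_in: int = 1, s_out: int = 1) -> float:
    """Estimated flits per cycle per router port under the random-request model."""
    n_in = s_in * k    # switch inputs after input speedup
    n_out = s_out * k  # switch outputs after output speedup
    # Probability that a given switch output receives no request at all.
    p_idle = ((n_out - 1) / n_out) ** n_in
    # Each router output port is served by s_out switch outputs.
    return s_out * (1.0 - p_idle)


if __name__ == "__main__":
    print(switch_throughput(4))            # no speedup: 0.68359375
    print(switch_throughput(4, s_in=2))    # input speedup of 2
    print(switch_throughput(4, 2, 2))      # input and output speedup of 2
```

In this model input speedup raises the estimate toward one, and adding output speedup lets it exceed one, consistent with the second bullet above.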

Router Datapath Components
o Output buffers
  - A FIFO buffer with a length of 2-4 flits is often sufficient to match the speed between the switch and the channel.