TCP Details: Congestion Control

  • Slides: 49
TCP Details: Roadmap
• Congestion Control: causes, symptoms, approaches to dealing with it
• Slow Start / Congestion Avoidance
• TCP Fairness
• TCP Performance
• Transport Layer Wrap-up
3: Transport Layer 1

Principles of Congestion Control
Congestion:
• informally: "too many sources sending too much data too fast for the network to handle"
• different from flow control!
• a top-10 problem!

Congestion Signals
• Lost packets: if there are more packets than resources (e.g., buffer space) along some path, there is no choice but to drop some
• Delayed packets: router queues fill up and packets wait longer for service
• Explicit notification: routers can alter packet headers to notify end hosts

Congestion Collapse
• As the number of packets entering the network increases, the number of packets arriving at the destination increases, but only up to a point
• A packet dropped in the network => all the resources it used along the way are wasted => no forward progress
• Happened in the Internet, 1987

Congestion Prevention?
• In a connection-oriented network: prevent congestion by requiring resources to be reserved in advance
• In a connectionless network: no prevention of congestion, just reaction to congestion (congestion control)

Causes/costs of congestion: scenario 1
• two senders, two receivers
• one router, infinite buffers
• no retransmission
• large delays when congested
• maximum achievable throughput

Causes/costs of congestion: scenario 2
• one router, finite buffers
• sender retransmission of lost packets

Causes/costs of congestion: scenario 2
• always: λ_in = λ_out (goodput)
• "perfect" retransmission, only when loss: λ'_in > λ_out
• retransmission of delayed (not lost) packets makes λ'_in larger (than the perfect case) for the same λ_out
"Costs" of congestion:
• more work (retransmissions) for a given "goodput"
• unneeded retransmissions: the link carries multiple copies of a packet

Causes/costs of congestion: scenario 3
• four senders
• multihop paths
• timeout/retransmit
Q: what happens as λ_in and λ'_in increase?

Causes/costs of congestion: scenario 3
Another "cost" of congestion:
• when a packet is dropped, any "upstream" transmission capacity used for that packet is wasted!

Approaches towards congestion control
Two broad approaches towards congestion control:
End-end congestion control:
• no explicit feedback from the network
• congestion inferred from end-system observed loss and delay
• approach taken by TCP
Network-assisted congestion control:
• routers provide feedback to end systems
  - a single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - an explicit rate at which the sender should send

Window Size Revised
• Limit the window size by both the receiver's advertised window *and* a "congestion window"
  - MaxWindow = min(AdvertisedWindow, CongestionWindow)
  - EffectiveWindow = MaxWindow - (LastByteSent - LastByteAcked)
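The two formulas above translate directly into code; this is a minimal sketch, with variable names following the slide:

```python
# Sketch of the window computation above (names follow the slide's variables).
def effective_window(advertised_window, congestion_window,
                     last_byte_sent, last_byte_acked):
    """How many more bytes the sender may put on the wire right now."""
    max_window = min(advertised_window, congestion_window)
    return max_window - (last_byte_sent - last_byte_acked)
```

For example, with an 8000-byte advertised window but a 4000-byte congestion window and 1000 bytes already in flight, the sender may send only 3000 more bytes.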

TCP Congestion Control
• end-end control (no network assistance)
• transmission rate limited by the congestion window size, CongWin, measured in segments

Original: With Just Flow Control
[figure: segment timeline between Source and Destination, with only flow control limiting the window]

TCP Congestion Control: Two Phases
• two "phases":
  - slow start
  - congestion avoidance
• important variables:
  - CongWin: the current congestion window
  - Threshold: the boundary between the slow start phase and the congestion avoidance phase

TCP congestion control:
• "probing" for usable bandwidth:
  - ideally: transmit as fast as possible (CongWin as large as possible) without loss
  - increase CongWin until loss (congestion)
  - on loss: decrease CongWin, then begin probing (increasing) again
• Don't just send the entire receiver's advertised window worth of data right away
• Start with a congestion window of 1 or 2 packets
• Slow start: increase the window for each ACK received, doubling it every RTT, up to a threshold; past the threshold, increase by just 1 per RTT
• Congestion avoidance: on each timeout, start back at a window of 1 and halve the threshold
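The probing behavior above can be sketched as a single window-update rule (a Tahoe-style simplification: a timeout halves the threshold and restarts slow start; the event names and segment-unit windows are illustrative, not from the slides):

```python
def update_window(cwnd, ssthresh, event):
    """One RTT step of the simplified window update described above.

    event is 'window_acked' (a full window of ACKs arrived) or 'timeout'.
    Returns the new (cwnd, ssthresh), both in segments.
    """
    if event == 'timeout':
        return 1, max(cwnd // 2, 1)                # multiplicative decrease, restart
    if cwnd < ssthresh:
        return min(2 * cwnd, ssthresh), ssthresh   # slow start: double per RTT
    return cwnd + 1, ssthresh                      # congestion avoidance: +1 per RTT
```

Starting from cwnd = 1 with ssthresh = 8, the window grows 2, 4, 8 (doubling), then 9, 10, 11 (additive); a timeout at 11 resets it to 1 with a threshold of 5.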

"Slow" Start: Multiplicative Increase
• multiplicative increase up to the Threshold
• "slower" than sending the full receiver's advertised window at once
• faster than additive increase
[figure: Source/Destination segment timeline]

TCP Congestion Avoidance: Additive Increase
• additive increase past the Threshold
[figure: Source/Destination segment timeline]

TCP Congestion Avoidance: Multiplicative Decrease too

Congestion avoidance (slow start is over; CongWin > Threshold):

    Until (loss event) {
        every W segments ACKed:
            CongWin++
    }
    Threshold = CongWin / 2
    CongWin = 1
    perform slow start [1]

[1] TCP Reno skips slow start (fast recovery) after three duplicate ACKs.

Fast Retransmit
• Interpret 3 duplicate ACKs as an early warning of loss (other causes? reordering or duplication in the network)
• As with a timeout: retransmit the packet and set the slow-start threshold to half the amount of unACKed data
• Unlike a timeout: set the congestion window to the threshold (not back to 1 as in normal slow start)

Fast Recovery
• After a fast retransmit, do congestion avoidance but not slow start.
• After the third duplicate ACK is received:
  - Threshold = ½ × CongestionWindow
  - CongestionWindow = Threshold + 3 × MSS (one MSS per duplicate ACK seen)
• If more duplicate ACKs arrive:
  - CongestionWindow += MSS
• When an ACK arrives for new data, deflate the congestion window:
  - CongestionWindow = Threshold
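The fast retransmit / fast recovery steps can be sketched as three event handlers, with the window measured in MSS units; `flight_size` (the amount of unACKed data) is an illustrative name, and the inflation by 3 MSS follows the standard Reno rule of one MSS per duplicate ACK:

```python
# Sketch of Reno fast retransmit / fast recovery (window in MSS units).
def on_third_dup_ack(flight_size):
    """Enter fast recovery: halve the threshold, inflate by the 3 dup ACKs."""
    ssthresh = max(flight_size // 2, 2)
    return ssthresh + 3, ssthresh        # (new cwnd, new ssthresh)

def on_extra_dup_ack(cwnd):
    return cwnd + 1                      # each further dup ACK frees one segment

def on_new_ack(ssthresh):
    return ssthresh                      # deflate the window, resume avoidance
```

With 20 segments in flight, the third duplicate ACK yields ssthresh = 10 and an inflated window of 13, which deflates back to 10 once new data is ACKed.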

Connection Timeline
• blue line = value of the congestion window, in KB
• short hash marks = segment transmissions
• long hash lines = the time when a packet that was eventually retransmitted was first transmitted
• dot at the top of the graph = a timeout
• 0-0.4 s: slow start; 2.0 s: timeout, start back at 1; 2.0-4.0 s: linear increase

AIMD
TCP congestion avoidance:
• AIMD: additive increase, multiplicative decrease
  - increase the window by 1 per RTT
  - decrease the window by a factor of 2 on a loss event
TCP fairness goal: if N TCP sessions share the same bottleneck link, each should get 1/N of the link capacity
[figure: TCP connections 1 and 2 through a bottleneck router of capacity R]

Why is TCP fair?
Two competing sessions:
• additive increase gives a slope of 1 as throughput increases
• multiplicative decrease cuts throughput proportionally
On a plot of connection 1's throughput vs. connection 2's throughput (both axes 0 to R):
• congestion avoidance: additive increase moves the operating point along a 45° line
• loss: both windows decrease by a factor of 2, moving the point halfway toward the origin
• repeating this converges toward the equal-bandwidth-share line
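The convergence argument above can be checked with a tiny simulation: additive increase preserves the gap between the two flows, while each shared loss halves it, so the rates approach R/2 each. All numbers here are illustrative:

```python
# Sketch: why AIMD converges to a fair share. Two flows each add 1 unit per
# round; when their combined demand exceeds the bottleneck capacity R,
# both see a loss and halve their rate.
def aimd(x1, x2, R=100.0, rounds=200):
    for _ in range(rounds):
        x1 += 1.0                        # additive increase, both flows
        x2 += 1.0
        if x1 + x2 > R:                  # shared loss event at the bottleneck
            x1 /= 2.0                    # multiplicative decrease
            x2 /= 2.0
    return x1, x2
```

Starting far apart (e.g., rates 5 and 80), after 200 rounds the two rates differ by less than 1 unit while still sharing the capacity R.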

TCP Congestion Control History
• Before 1988, only flow control!
• TCP Tahoe (1988)
  - congestion control with multiplicative decrease on timeout
• TCP Reno (1990)
  - adds fast recovery and delayed acknowledgements
• TCP Vegas
  - tries to use the space in routers' queues fairly, not just divide bandwidth fairly

TCP Vegas
• Tries to use a constant amount of space in the router buffers
• Compares each round-trip time to the minimum round-trip time it has seen, to infer time spent in queuing delays
• Vegas is not a recommended version of TCP
  - the minimum round-trip time may never happen
  - it can't compete with Tahoe or Reno

TCP latency modeling
Q: How long does it take to receive an object from a Web server after sending a request?
• TCP connection establishment
• data transfer delay
• slow start
A: A natural question, but not very easy to answer: it depends on the round-trip time, the bandwidth, and the window size (and dynamic changes to it).

TCP latency modeling
Notation, assumptions:
• O: object size (bits)
• S: MSS (bits)
• one link between client and server, of rate R
• fixed congestion window of W segments
• no retransmissions (no loss, no corruption)
Two cases to consider:
• Slow sender (big window): still sending when the first ACK returns
  - time to send a window, W·S/R > time to get the first ACK, RTT + S/R
• Fast sender (small window): must wait for an ACK to send more data
  - time to send a window, W·S/R < time to get the first ACK, RTT + S/R

TCP latency modeling
Slow sender (big window):
  latency = 2·RTT + O/R
Fast sender (small window):
  latency = 2·RTT + O/R + (K - 1)·[S/R + RTT - W·S/R]
where the number of windows K := O/(W·S), and
  (S/R + RTT) - (W·S/R) = time until the ACK arrives - time to transmit a window
is the stall time per window.
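Both cases above reduce to one function once you notice that the slow-sender case is just the fast-sender case with zero stall time. A minimal sketch, with the parameter values in the example chosen for illustration (O, S in bits; R in bits/s; RTT in seconds; W in segments):

```python
# Sketch of the static-window latency model above.
def static_window_latency(O, S, R, RTT, W):
    K = O / (W * S)                      # number of windows covering the object
    stall = S / R + RTT - W * S / R      # idle time per window
    if stall <= 0:                       # slow sender (big window): never stalls
        return 2 * RTT + O / R
    return 2 * RTT + O / R + (K - 1) * stall   # fast sender (small window)
```

With S/R = 0.1 s, RTT = 1 s, and W = 4, a window takes 0.4 s to send but the first ACK takes 1.1 s, so the sender stalls 0.7 s per window; with W = 20 the window takes 2 s and there are no stalls.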

TCP Latency Modeling: Slow Start
• Now suppose the window grows according to slow start (not slow start + congestion avoidance).
• The latency of one object of size O is then:
  latency = 2·RTT + O/R + P·(RTT + S/R) - (2^P - 1)·S/R
where:
• P = min{K - 1, Q} is the number of times TCP stalls at the server, waiting for an ACK to arrive and open the window
• Q is the number of times the server would stall if the object were of infinite size (maybe 0)
• K is the number of windows that cover the object
• S/R is the time to transmit one segment
• RTT + S/R is the time to get the ACK for one segment

TCP Latency Modeling: Slow Start (cont.)
Example: O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K - 1, Q} = 2
The server stalls P = 2 times (stall 1 after the first window, stall 2 after the second).
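The example can be reproduced numerically. This sketch computes K, Q, and P from first principles (window k of slow start holds 2^(k-1) segments and causes a stall if its transmit time is less than RTT + S/R); the test values S/R = 1 s and RTT = 2 s are illustrative assumptions chosen to match the example's K = 4, Q = 2, P = 2:

```python
import math

# Sketch of the slow-start latency model:
#   latency = 2*RTT + O/R + P*(RTT + S/R) - (2**P - 1)*S/R
def slow_start_latency(O, S, R, RTT):
    K = math.ceil(math.log2(O / S + 1))        # windows of 1,2,4,... covering O
    Q = 0                                      # stalls for an infinite object:
    while (2 ** Q) * S / R < S / R + RTT:      # window k stalls if its transmit
        Q += 1                                 # time 2**(k-1)*S/R < S/R + RTT
    P = min(K - 1, Q)                          # actual number of stalls
    return 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
```

Tracing the example by hand: transmit time is 15 s, the two stalls idle the sender for 2 s and 1 s, and the handshake/request adds 2·RTT = 4 s, for 22 s total, matching the formula.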

TCP Latency Modeling: Slow Start (cont.)
[figure: derivation of the slow-start latency formula]

TCP Performance Limits
• Can't go faster than the speed of the slowest link between sender and receiver
• Can't go faster than AdvertisedWindow / RTT
• Can't complete a transfer in less than 2×RTT (connection establishment plus request/response)
• Can't go faster than memory bandwidth (lots of memory copies in the kernel)
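The window/RTT limit above follows from the fact that at most one window of data can be in flight per round trip, regardless of link speed. A one-line sketch with illustrative numbers:

```python
# Sketch of the window/RTT throughput ceiling described above.
def throughput_ceiling(window_bytes, rtt_seconds):
    """At most one window can be delivered per round trip."""
    return window_bytes / rtt_seconds    # bytes per second
```

For example, a 64 KB advertised window over a 100 ms path caps throughput at 655,360 bytes/s (about 5.2 Mbit/s), however fast the links are.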

Experiment: Compare TCP and UDP performance
• Use ttcp (or pcattcp) to compare the effective bandwidth when transmitting the same amount of data over TCP and over UDP
• UDP is not limited by the overheads of connection setup, flow control, or congestion control
• Use Ethereal to trace both

TCP vs UDP
What would happen if UDP were used more than TCP?

Transport Layer Summary
• principles behind transport layer services:
  - multiplexing/demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
• instantiation and implementation in the Internet:
  - UDP
  - TCP
Next:
• leaving the network "edge" (application and transport layers)
• into the network "core"

Outtakes

In-order Delivery
• Each packet contains a sequence number
• The TCP layer will not deliver a packet to the application unless it has already received and delivered all previous data
• Out-of-order packets are held in the receive buffer
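The receive-buffer mechanism above can be sketched with a map keyed by sequence number (illustrative: real TCP sequence numbers count bytes, not packets):

```python
# Sketch of in-order delivery using a hold-back receive buffer.
class InOrderReceiver:
    def __init__(self):
        self.next_seq = 0
        self.buffer = {}        # out-of-order packets held back

    def receive(self, seq, data):
        """Return the list of payloads now deliverable to the application."""
        self.buffer[seq] = data
        delivered = []
        while self.next_seq in self.buffer:
            delivered.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1
        return delivered
```

If packet 1 arrives before packet 0, it is held; the arrival of packet 0 then releases both in order.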

Sliding Window Protocol
• Reliable delivery: by acknowledgments and retransmission
• In-order delivery: by sequence numbers
• Flow control: by window size
• These properties are guaranteed end-to-end, not per-hop

Segment Transmission
• Maximum segment size reached
  - if MSS worth of data accumulates, send
  - MSS is usually set to the MTU of the directly connected network (minus TCP/IP headers)
• Sender explicitly requests
  - if the sender requests a push, send
• Periodic timer
  - if data has been held too long, send
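The three triggers above amount to one predicate on the sender's state; this is a minimal sketch where the parameter names and the `max_hold` timer value are illustrative assumptions:

```python
# Sketch of the three send triggers above.
def should_send(buffered_bytes, mss, push_requested, held_seconds, max_hold=0.2):
    return (buffered_bytes >= mss                                  # full MSS accumulated
            or push_requested                                      # application push
            or (buffered_bytes > 0 and held_seconds >= max_hold))  # timer fired
```

A full MSS, an explicit push, or stale buffered data each cause a segment to go out; a few freshly buffered bytes alone do not.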

1) To aid in congestion control, when a packet is dropped the Timeout is set to double the last Timeout. Suppose a TCP connection, with window size 1, loses every other packet. Those that do arrive have RTT = 1 second. What happens? What happens to Timeout? Do this for two cases:
  a. After a packet is eventually received, we pick up where we left off, resuming with EstimatedRTT initialized to its pre-timeout value and Timeout double that, as usual.
  b. After a packet is eventually received, we resume with Timeout initialized to the last exponentially backed-off value used for the timeout interval.

Case study: ATM ABR congestion control
ABR (available bit rate):
• "elastic service"
• if the sender's path is "underloaded": the sender should use the available bandwidth
• if the sender's path is congested: the sender is throttled to a minimum guaranteed rate
RM (resource management) cells:
• sent by the sender, interspersed with data cells
• bits in the RM cell are set by switches ("network-assisted")
  - NI bit: no increase in rate (mild congestion)
  - CI bit: congestion indication
• RM cells are returned to the sender by the receiver, with the bits intact

Case study: ATM ABR congestion control
• two-byte ER (explicit rate) field in the RM cell
  - a congested switch may lower the ER value in the cell
  - the sender's send rate thus becomes the minimum supportable rate on the path
• EFCI bit in data cells: set to 1 by a congested switch
  - if the data cell preceding an RM cell has EFCI set, the receiver sets the CI bit in the returned RM cell
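The ER mechanism above can be sketched per switch: each switch only ever lowers the explicit rate, so after the RM cell traverses the path, the sender sees the minimum supportable rate. Field and parameter names here are illustrative:

```python
# Sketch of a switch processing the ER field of an RM cell.
def process_rm_cell(er, supportable_rate, congested):
    new_er = min(er, supportable_rate)   # a switch may lower ER, never raise it
    ci = congested                       # congestion-indication bit
    return new_er, ci
```

A cell entering with ER = 100 through a switch that can only support 40 leaves with ER = 40; a lightly loaded switch leaves the field unchanged.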

End to End Argument
• TCP must guarantee reliability, in-order delivery, and flow control end-to-end, even if they were guaranteed for each step along the way. Why?
  - Packets may take different paths through the network
  - Packets pass through intermediaries that might be misbehaving

End-To-End Argument
• A function should not be provided in the lower levels unless it can be completely and correctly implemented at that level.
• Lower levels may still implement functions as a performance optimization.
  - Example: CRC checks on a hop-by-hop basis, because detecting and retransmitting a single corrupt packet across one hop avoids retransmitting everything end-to-end.

TCP vs sliding window on a physical, point-to-point link
• 1) Unlike a physical link, TCP needs connection establishment/termination to set up or tear down the logical link
• 2) Round-trip times can vary significantly over the lifetime of a connection due to delays in the network, so TCP needs an adaptive retransmission timer
• 3) Packets can be reordered in the Internet (not possible on a point-to-point link)

TCP vs point-to-point (continued)
• 4) TCP must establish a maximum segment lifetime (MSL), based on the IP time-to-live field: a conservative estimate of how the TTL field (hops) translates into the MSL (time)
• 5) On a point-to-point link we can assume the computers on each end have enough buffer space to support the link
  - TCP must learn how much buffering the other end has

TCP vs point-to-point (continued)
• 6) There is no congestion on a point-to-point link; a fast TCP sender could swamp a slow link on the route to the receiver, or multiple senders could swamp a link on the path
  - hence the need for congestion control in TCP