TCP Details: Congestion Control, Causes, Symptoms
- Slides: 49
TCP Details: Roadmap
- Congestion Control: causes, symptoms, approaches to dealing with it
- Slow Start / Congestion Avoidance
- TCP Fairness
- TCP Performance
- Transport Layer Wrap-up
3: Transport Layer
Principles of Congestion Control
Congestion:
- informally: “too many sources sending too much data too fast for the network to handle”
- different from flow control!
- a top-10 problem!
Congestion Signals
- Lost packets: if there are more packets than resources (e.g., buffer space) along some path, there is no choice but to drop some
- Delayed packets: router queues fill up and packets wait longer for service
- Explicit notification: routers can alter packet headers to notify end hosts
Congestion Collapse
- As the number of packets entering the network increases, the number of packets arriving at the destination increases, but only up to a point
- A packet dropped in the network means all the resources it used along the way are wasted: no forward progress
- Happened on the Internet in 1987
Congestion Prevention?
- In a connection-oriented network:
  - Prevent congestion by requiring resources to be reserved in advance
- In a connectionless network:
  - No prevention of congestion, just reaction to it (congestion control)
Causes/costs of congestion: scenario 1
- two senders, two receivers
- one router, infinite buffers
- no retransmission
- large delays when congested
- maximum achievable throughput
Causes/costs of congestion: scenario 2
- one router, finite buffers
- sender retransmits lost packets
Causes/costs of congestion: scenario 2
- ideally: λin = λout (goodput)
- “perfect” retransmission only when loss: λ'in > λout
- retransmission of delayed (not lost) packets makes λ'in larger (than the perfect case) for the same λout
“Costs” of congestion:
- more work (retransmissions) for a given “goodput”
- unneeded retransmissions: the link carries multiple copies of a packet
Causes/costs of congestion: scenario 3
- four senders
- multihop paths
- timeout/retransmit
Q: what happens as λin and λ'in increase?
Causes/costs of congestion: scenario 3
Another “cost” of congestion:
- when a packet is dropped, any upstream transmission capacity used for that packet is wasted!
Approaches towards congestion control
Two broad approaches:
End-end congestion control:
- no explicit feedback from the network
- congestion inferred from end-system observed loss and delay
- approach taken by TCP
Network-assisted congestion control:
- routers provide feedback to end systems
  - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - explicit rate at which the sender should send
Window Size Revisited
- Limit window size by both the receiver's advertised window *and* a “congestion window”
  - MaxWindow ≤ min(AdvertisedWindow, CongestionWindow)
  - EffectiveWindow = MaxWindow - (LastByteSent - LastByteAcked)
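The window arithmetic above can be sketched in a few lines. This is an illustrative sketch, not real TCP stack code; the function and variable names are my own.

```python
def effective_window(advertised_window, congestion_window,
                     last_byte_sent, last_byte_acked):
    """Usable send window: capped by both the receiver's advertised window
    and the congestion window, minus bytes already in flight."""
    max_window = min(advertised_window, congestion_window)
    return max_window - (last_byte_sent - last_byte_acked)

# e.g. advertised 64000 B, cwnd 8000 B, 3000 B in flight -> 5000 B usable
print(effective_window(64000, 8000, 103000, 100000))
```

Note that the congestion window, not the (typically larger) advertised window, is usually the binding limit when the network is congested.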
TCP Congestion Control
- end-end control (no network assistance)
- transmission rate limited by the congestion window size, Congwin, over segments
Original: With Just Flow Control
(figure: segments flowing from Source to Destination)
TCP Congestion Control: Two Phases
- two “phases”:
  - slow start
  - congestion avoidance
- important variables:
  - Congwin: current congestion window
  - Threshold: boundary between the slow start phase and the congestion avoidance phase
TCP congestion control
- “probing” for usable bandwidth:
  - ideally: transmit as fast as possible (Congwin as large as possible) without loss
  - increase Congwin until loss (congestion)
  - on loss: decrease Congwin, then begin probing (increasing) again
- Don't just send the entire receiver's advertised window worth of data right away
- Start with a congestion window of 1 or 2 packets
- Slow start: increase the window by 1 per ACK received (doubling it every RTT) up to a threshold, then increase by just 1 per RTT
- Congestion avoidance: on each timeout, set the threshold to half the current window and start back at 1
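The two phases can be sketched as per-RTT window updates. This is a simplified model in segments, with function names of my own choosing; real TCP counts bytes and updates per ACK.

```python
def on_ack(cwnd, ssthresh):
    """One RTT of ACKs: double in slow start, add 1 in congestion avoidance."""
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh), ssthresh  # slow start, capped at threshold
    return cwnd + 1, ssthresh                     # congestion avoidance

def on_timeout(cwnd, ssthresh):
    """Timeout: halve the threshold, restart slow start at 1 segment."""
    return 1, max(cwnd // 2, 1)
```

Starting from cwnd = 1 with ssthresh = 8, the window goes 1, 2, 4, 8, then 9, 10, ... until a timeout resets it to 1 with ssthresh = 5.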
“Slow” Start: Multiplicative Increase
(figure: Source and Destination, window doubling each round trip)
- multiplicative increase up to the threshold
- “slower” than sending the full receiver's advertised window at once
- faster than additive increase
TCP Congestion Avoidance: Additive Increase
(figure: Source and Destination, window growing by one segment per round trip)
- additive increase past the threshold
TCP Congestion Avoidance: Multiplicative Decrease too
Congestion avoidance:
  /* slow start is over; Congwin > threshold */
  Until (loss event) {
    every Congwin segments ACKed: Congwin++
  }
  threshold = Congwin/2
  Congwin = 1
  perform slow start (1)
(1) TCP Reno skips slow start (fast recovery) after three duplicate ACKs
Fast Retransmit
- Interpret 3 duplicate ACKs as an early warning of loss (other causes? reordering or duplication in the network)
- Like a timeout: retransmit the packet and set the slow-start threshold to half the amount of unacked data
- Unlike a timeout: set the congestion window to the threshold (not back to 1 as in normal slow start)
Fast Recovery
- After a fast retransmit, do congestion avoidance but not slow start
- After the third dup ACK is received:
  - threshold = ½ × (congestion window)
  - congestion window = threshold + 2 × MSS
- If more dup ACKs arrive:
  - congestion window += MSS
- When an ACK arrives for new data, deflate the congestion window:
  - congestion window = threshold
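The fast-recovery transitions above can be written as three small update functions, following the slide's numbers (threshold + 2 × MSS on entry). The byte values and function names are illustrative assumptions.

```python
def enter_fast_recovery(cwnd, mss):
    """On the third duplicate ACK: halve to get the threshold, then inflate."""
    ssthresh = cwnd // 2
    return ssthresh + 2 * mss, ssthresh     # (new cwnd, new threshold)

def on_extra_dup_ack(cwnd, mss):
    """Each further dup ACK means a segment left the network: inflate by MSS."""
    return cwnd + mss

def on_new_ack(ssthresh):
    """ACK for new data: deflate the window back to the threshold."""
    return ssthresh

# e.g. cwnd = 16000 B, MSS = 1000 B: enter with cwnd 10000, threshold 8000
print(enter_fast_recovery(16000, 1000))
```

The inflation step accounts for segments the dup ACKs prove have left the network, so the sender can keep the pipe full while waiting for the retransmission to be acknowledged.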
Connection Timeline
- blue line = value of the congestion window in KB
- short hash marks = segment transmissions
- long hash marks = time when a packet that was eventually retransmitted was first transmitted
- dot at top of graph = timeout
- 0-0.4: slow start; 2.0: timeout, start back at 1; 2.0-4.0: linear increase
AIMD
TCP congestion avoidance:
- AIMD: additive increase, multiplicative decrease
  - increase window by 1 per RTT
  - decrease window by a factor of 2 on a loss event
TCP fairness goal: if N TCP sessions share the same bottleneck link, each should get 1/N of the link capacity
(figure: TCP connections 1 and 2 sharing a bottleneck router of capacity R)
Why is TCP fair?
Two competing sessions:
- additive increase gives a slope of 1 as throughput increases
- multiplicative decrease reduces throughput proportionally
(figure: connection 1 vs. connection 2 throughput, converging to the equal bandwidth share line; on loss, decrease window by a factor of 2; in congestion avoidance, additive increase)
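The convergence argument can be checked with a toy simulation: additive increase preserves the gap between two flows, while each multiplicative decrease halves it, so the gap shrinks toward zero. This is a synchronized-loss idealization, not a claim about real TCP dynamics.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Two AIMD flows sharing one bottleneck. When combined demand exceeds
    capacity both halve (synchronized loss); otherwise both add 1."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2   # multiplicative decrease
        else:
            x1, x2 = x1 + 1, x2 + 1   # additive increase
    return x1, x2

# Starting far apart (5 vs 80), the two rates converge toward equal shares.
x1, x2 = aimd_two_flows(5.0, 80.0, 100.0, 500)
```

Each loss event halves the difference x2 - x1 while leaving it unchanged during increases, which is exactly the slope-1 trajectory drawn on the slide's fairness diagram.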
TCP Congestion Control History
- Before 1988, only flow control!
- TCP Tahoe, 1988
  - congestion control with multiplicative decrease on timeout
- TCP Reno, 1990
  - adds fast recovery and delayed acknowledgements
- TCP Vegas, ?
  - tries to use space in routers' queues fairly, not just divide bandwidth fairly
TCP Vegas
- Tries to use a constant amount of space in router buffers
- Compares each round-trip time to the minimum round-trip time it has seen, to infer time spent in queuing delays
- Vegas is not a recommended version of TCP
  - the minimum time may never happen
  - can't compete with Tahoe or Reno
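The Vegas inference step can be sketched as follows: using the minimum observed RTT as the queue-free baseline, the gap between expected and actual throughput estimates how many segments are sitting in router queues. The formula is the standard Vegas diff calculation; the function name is my own.

```python
def vegas_queue_estimate(cwnd, base_rtt, current_rtt):
    """Estimate segments queued in the network, Vegas-style.
    base_rtt: minimum RTT seen (assumed queue-free); current_rtt: latest RTT."""
    expected = cwnd / base_rtt              # rate if queues were empty
    actual = cwnd / current_rtt             # rate actually achieved
    return (expected - actual) * base_rtt   # estimated segments in queues

# cwnd = 10 segments, baseRTT = 100 ms, current RTT = 125 ms
# -> expected 100 seg/s, actual 80 seg/s -> about 2 segments queued
print(vegas_queue_estimate(10, 0.1, 0.125))
```

Vegas then nudges the window to keep this estimate inside a small band, which is why it occupies roughly constant buffer space but loses out to loss-driven senders like Tahoe and Reno.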
TCP latency modeling
Q: How long does it take to receive an object from a Web server after sending a request?
- TCP connection establishment
- data transfer delay
- slow start
A: A natural question, but not very easy to answer. It depends on the round-trip time, the bandwidth, and the window size (and dynamic changes to it).
TCP latency modeling
Two cases to consider:
- Slow sender (big window): still sending when the ACK returns
  - time to send window (W·S/R) > time to get first ACK (RTT + S/R)
- Fast sender (small window): must wait for the ACK to send more data
  - time to send window (W·S/R) < time to get first ACK (RTT + S/R)
Notation, assumptions:
- O: object size (bits)
- S: MSS (bits)
- one link between client and server, of rate R
- fixed congestion window of W segments
- no retransmissions (no loss, no corruption)
TCP latency modeling
Slow sender (big window):
  latency = 2·RTT + O/R
Fast sender (small window):
  latency = 2·RTT + O/R + (K-1)·[S/R + RTT - W·S/R]
  where K = O/(W·S) is the number of windows, and
  (S/R + RTT) - (W·S/R) = time until the ACK arrives - time to transmit the window
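Both fixed-window cases can be computed directly from the slide's formulas. The numeric example (values chosen by me for illustration) uses S = 1000 bits, R = 1000 bps (so S/R = 1 s) and RTT = 4 s.

```python
import math

def static_window_latency(obj_bits, seg_bits, rate, window, rtt):
    """Latency to fetch an object with a fixed window of `window` segments,
    per the slide's model: one link of rate `rate`, no loss."""
    k = math.ceil(obj_bits / (window * seg_bits))     # windows covering object
    stall = seg_bits / rate + rtt - window * seg_bits / rate
    if stall <= 0:            # slow sender / big window: the pipe never drains
        return 2 * rtt + obj_bits / rate
    return 2 * rtt + obj_bits / rate + (k - 1) * stall

# 15-segment object, W = 3: K = 5, stall = 1 + 4 - 3 = 2 s per window
# latency = 2*4 + 15 + 4*2 = 31 s
print(static_window_latency(15000, 1000, 1000, 3, 4))
# With W = 6 the window outlasts the ACK delay, so latency = 2*4 + 15 = 23 s
print(static_window_latency(15000, 1000, 1000, 6, 4))
```

The example shows the cost of a too-small window: with W = 3 the sender stalls 2 s after each of the first four windows, adding 8 s to the minimum 23 s transfer time.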
TCP Latency Modeling: Slow Start
- Now suppose the window grows according to slow start (not slow start + congestion avoidance).
- We will show that the latency of one object of size O is:
    latency = 2·RTT + O/R + P·(RTT + S/R) - (2^P - 1)·S/R
  where P = min{K-1, Q} is the number of times TCP stalls at the server waiting for an ACK to arrive and open the window:
  - Q is the number of times the server would stall if the object were of infinite size (maybe 0)
  - K is the number of windows that cover the object
  - S/R is the time to transmit one segment
  - RTT + S/R is the time to get the ACK of one segment
TCP Latency Modeling: Slow Start (cont.)
Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1, Q} = 2
The server stalls P = 2 times (once after the first window, once after the second).
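The example's K, Q, and P values can be reproduced by computing the model directly. This sketch assumes the standard slow-start latency derivation (windows of 1, 2, 4, ... segments; a stall after window k whenever 2^(k-1)·S/R < S/R + RTT); the parameter values S/R = 1 s and RTT = 2 s are my own choice to make Q come out to 2 as on the slide.

```python
import math

def slow_start_latency(obj_bits, seg_bits, rate, rtt):
    """Latency with slow start only (window doubles each round), no loss."""
    n = obj_bits / seg_bits                      # object size in segments
    k = math.ceil(math.log2(n + 1))              # windows 1,2,4,... covering it
    q = 0                                        # stalls for an infinite object
    while (2 ** q) * seg_bits / rate < seg_bits / rate + rtt:
        q += 1                                   # this window drains before ACK
    p = min(k - 1, q)                            # actual number of stalls
    stalls = sum(seg_bits / rate + rtt - (2 ** i) * seg_bits / rate
                 for i in range(p))
    return 2 * rtt + obj_bits / rate + stalls

# 15 segments, S/R = 1 s, RTT = 2 s: K = 4, Q = 2, P = 2
# latency = 2*2 + 15 + [(3-1) + (3-2)] = 22 s
print(slow_start_latency(15000, 1000, 1000, 2.0))
```

The two stall terms (2 s then 1 s) shrink as the window doubles, which is why slow start's extra cost is bounded even for large objects.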
TCP Latency Modeling: Slow Start (cont.)
(figure: derivation of the slow-start latency expression)
TCP Performance Limits
- Can't go faster than the speed of the slowest link between sender and receiver
- Can't go faster than AdvertisedWindow/RoundTripTime
- Can't complete a transfer in less than 2·RTT (connection establishment plus request/response)
- Can't go faster than memory bandwidth (lots of memory copies in the kernel)
Experiment: Compare TCP and UDP performance
- Use ttcp (or pcattcp) to compare effective bandwidth when transmitting the same amount of data over TCP and UDP
- UDP is not limited by overheads from connection setup, flow control, or congestion control
- Use Ethereal to trace both
TCP vs UDP
What would happen if more traffic used UDP than TCP?
Transport Layer Summary
- principles behind transport layer services:
  - multiplexing/demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
- instantiation and implementation in the Internet
  - UDP
  - TCP
Next:
- leaving the network “edge” (application and transport layers)
- into the network “core”
Outtakes
In-order Delivery
- Each packet contains a sequence number
- The TCP layer will not deliver a packet to the application unless it has already received and delivered all previous data
- Out-of-order packets are held in the receive buffer
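The hold-and-release behavior can be sketched with a small reassembly buffer keyed by byte sequence number. This is an illustrative model, not TCP's actual data structures; the function name is my own.

```python
def deliver(buffer, next_seq, seq, data):
    """Buffer an arriving segment; release the longest contiguous run
    starting at next_seq. Returns (new next_seq, data to hand up)."""
    buffer[seq] = data
    released = []
    while next_seq in buffer:              # gap filled: drain in order
        chunk = buffer.pop(next_seq)
        released.append(chunk)
        next_seq += len(chunk)
    return next_seq, released

buf = {}
print(deliver(buf, 0, 4, "5678"))   # out of order: held, nothing delivered
print(deliver(buf, 0, 0, "0123"))   # gap filled: both chunks delivered
```

Holding the out-of-order segment (rather than discarding it) is what lets TCP avoid retransmitting data that arrived intact but early.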
Sliding Window Protocol
- Reliable delivery: by acknowledgments and retransmission
- In-order delivery: by sequence numbers
- Flow control: by window size
- These properties are guaranteed end-to-end, not per-hop
Segment Transmission
- Maximum segment size reached
  - if MSS worth of data has accumulated, send
  - MSS is usually set to the MTU of the directly connected network (minus TCP/IP headers)
- Sender explicitly requests
  - if the sender requests a push, send
- Periodic timer
  - if data has been held too long, send
1) To aid in congestion control, when a packet is dropped the Timeout is set to double the last Timeout. Suppose a TCP connection, with window size 1, loses every other packet. Those that do arrive have RTT = 1 second. What happens? What happens to the Timeout? Do this for two cases:
a. After a packet is eventually received, we pick up where we left off, resuming with EstimatedRTT initialized to its pre-timeout value and Timeout double that, as usual.
b. After a packet is eventually received, we resume with Timeout initialized to the last exponentially backed-off value used for the timeout interval.
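One way to explore the exercise is a tiny trace of the Timeout value across successive losses. The initial Timeout of 2 s (twice the 1 s RTT) and the encoding of each case are illustrative assumptions, not the worked answer.

```python
def timeouts(case, losses):
    """Timeout value at each loss event. Case 'a': after a success, Timeout
    resets to 2 * EstimatedRTT = 2 s. Case 'b': the backed-off value persists."""
    timeout = 2.0                        # assumed initial Timeout = 2 * RTT
    history = []
    for _ in range(losses):
        timeout *= 2                     # exponential backoff on each drop
        history.append(timeout)
        if case == "a":
            timeout = 2.0                # success resets to pre-timeout value
    return history

print(timeouts("a", 3))   # stays bounded
print(timeouts("b", 3))   # keeps doubling, even though the RTT is still 1 s
```

The trace suggests the qualitative contrast the exercise is after: in case (a) the Timeout stays bounded, while in case (b) losing every other packet makes the Timeout grow without limit.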
Case study: ATM ABR congestion control
ABR: available bit rate:
- “elastic service”
- if the sender's path is “underloaded”: the sender should use the available bandwidth
- if the sender's path is congested: the sender is throttled to a minimum guaranteed rate
RM (resource management) cells:
- sent by the sender, interspersed with data cells
- bits in the RM cell set by switches (“network-assisted”):
  - NI bit: no increase in rate (mild congestion)
  - CI bit: congestion indication
- RM cells returned to the sender by the receiver, with bits intact
Case study: ATM ABR congestion control
- two-byte ER (explicit rate) field in the RM cell
  - a congested switch may lower the ER value in the cell
  - the sender's send rate is thus the minimum supportable rate on the path
- EFCI bit in data cells: set to 1 by a congested switch
  - if the data cell preceding an RM cell has EFCI set, the receiver sets the CI bit in the returned RM cell
End-to-End Argument
- TCP must guarantee reliability, in-order delivery, and flow control end-to-end even if they are guaranteed at each step along the way. Why?
  - packets may take different paths through the network
  - packets pass through intermediaries that might be misbehaving
End-to-End Argument
- A function should not be provided in the lower levels unless it can be completely and correctly implemented at that level.
- Lower levels may still implement functions as a performance optimization.
  - Example: CRC on a hop-by-hop basis, because detecting and retransmitting a single corrupt packet across one hop avoids retransmitting everything end-to-end
TCP vs. sliding window on a physical point-to-point link
- 1) Unlike a physical link, TCP needs connection establishment/termination to set up or tear down the logical link
- 2) Round-trip times can vary significantly over the lifetime of a connection due to delay in the network, so TCP needs an adaptive retransmission timer
- 3) Packets can be reordered in the Internet (not possible on a point-to-point link)
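The adaptive retransmission timer mentioned in point 2 is commonly built from exponentially weighted moving averages of the RTT and its deviation. The sketch below uses the widely cited Jacobson/Karels-style update; the gains alpha = 0.125 and beta = 0.25 and the 4× deviation factor are the usual textbook values, assumed here since the slide does not give them.

```python
def update_rto(est_rtt, dev_rtt, sample_rtt, alpha=0.125, beta=0.25):
    """Fold one RTT sample into the smoothed estimate and deviation,
    and derive the retransmission timeout (RTO)."""
    est_rtt = (1 - alpha) * est_rtt + alpha * sample_rtt
    dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample_rtt - est_rtt)
    return est_rtt, dev_rtt, est_rtt + 4 * dev_rtt

# Steady samples shrink the deviation, so the RTO tightens toward the RTT.
est, dev, rto = update_rto(0.1, 0.05, 0.1)
print(est, dev, rto)
```

Keeping a deviation term (rather than just doubling the mean RTT) lets the timer stay tight on stable paths while backing off quickly when delays become variable.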
TCP vs. point-to-point (continued)
- 4) Must establish a maximum segment lifetime, based on the IP time-to-live field: a conservative estimate of how the TTL field (hops) translates into MSL (time)
- 5) On a point-to-point link, we can assume the computers on each end have enough buffer space to support the link
  - TCP must learn the buffering available on the other end
TCP vs. point-to-point (continued)
- 6) There is no congestion on a point-to-point link; with TCP, a fast sender could swamp a slow link on the route to the receiver, or multiple senders could swamp a link on the path
  - hence the need for congestion control in TCP