Advanced Congestion Control Hosts 14 740 Fundamentals of

Congestion Control (2) • Apply some control theory • Region 1: Low throughput •

Feedback Mechanism • End-to-end: Messages from receiver • Network Assisted: Signals from routers 14

Slow Start • When connection begins, increase rate exponentially: • double Cong. Win every

Congestion Avoidance • Additive Increase: Increase Cong. Win by 1 MSS every RTT until

TCP CC States Cong. Win ≥ ssthresh Slow start Cong Avoid timeout eo tim

traceroute • Advanced TCP Congestion Control • Lots of algorithmic variants • Long Fat

TCP Variations • TCP New Reno • TCP Vegas • TCP Hybla • BIC

TCP New Reno • Improves fast recovery retransmissions • RFC 2582 & 3782 •

New Reno: Improve Fast Recovery • In Fast Recovery: • Record highest un. ACKed#

TCP Vegas • First delay-based TCP variant • Look for variations in RTT as

TCP Vegas (2) • Goal: keep a certain amount of data in the queues

TCP Hybla • Goal: improve high-latency / high-error rate links (i. e. satellite) •

The LFN Problem • Long Fat Networks: High-speed and highlatency • Many, many segments

The LFN Problem • How many? • Ex: 10 Gbps, 100 ms RTT, MSS

LFN Approaches • Get lots of segments in flight: • Start really quickly •

Binary Increase Congestion • TCP-BIC uses a binary search to probe for additional bandwidth

Binary Search • Target Cong. Win is halfway between max and min, 2 control

BIC: Fairness • An overly aggressive algorithm will rob bandwidth from normal TCP algorithms

What is Congestion? • Loss-based: Look for 3 dup ACK, timeout • Used in

Compound TCP • A hybrid of loss-based and delay-based algorithms • More aggressively seeks

CTCP Mechanisms • Key idea is to use a normal congestion window, combined with

Delay Window • If network bandwidth is underutilized (based on RTT observations) dwind(t+1) =

Fairness • TCP flows compete for additional bandwidth • If one flow is too

CTCP Fairness • When no congestion sensed, full-speed ahead! • When congestion first sensed

Prisoner’s Dilemma • Game-theory problem with interesting implications for networks • “Classic Form” •

Why Discuss? • TCP restrictions are self-imposed • Nothing in the network checks that

Silly Window Syndrome • A serious problem can arise in the sliding window operation

Silly Window Syndrome There are 3 causes of SWS: • When the server announces

Silly Window Syndrome • When there is no synchronization between the sender and receiver

Nagle’s Algorithm • Interacts badly with TCP delayed acknowledgments. • With both algorithms enabled,

Lesson Objectives • Now, you should be able to: • describe features of the

• You should be able to: • describe the challenges of congestion control

Slides: 34

Download presentation

Advanced Congestion Control (Hosts) 14 -740: Fundamentals of Computer Networks Credit: Bill Nace Material from Computer Networking: A Top Down Approach, 5 th edition. J. F. Kurose and K. W. Ross

Congestion Control (2) • Apply some control theory • Region 1: Low throughput • Region 2: High delay • Throughput increases slowly • Delay increases quickly • Region 3: Collapse • Throughput ➙ 0, Delay ➙ ∞ • At what load would we like to operate?

Feedback Mechanism • End-to-end: Messages from receiver • Network Assisted: Signals from routers 14 -740: Spring 2018 3

Slow Start • When connection begins, increase rate exponentially: • double Cong. Win every RTT • done by incrementing Cong. Win for every ACK received • Summary: initial rate is slow but ramps up exponentially fast

Congestion Avoidance • Additive Increase: Increase Cong. Win by 1 MSS every RTT until loss is detected • Multiplicative Decrease: cut Cong. Win in half after a loss event Sawtooth behavior: Probing for bandwidth 14 -740: Spring 2018 5

TCP CC States Cong. Win ≥ ssthresh Slow start Cong Avoid timeout eo tim ut 3 dup. ACK Fast Recovery ACK 3 dup. ACK

traceroute • Advanced TCP Congestion Control • Lots of algorithmic variants • Long Fat Networks problem • TCP-BIC • Compound TCP • Prisoner’s Dilemma 14 -740: Spring 2018 7

TCP Variations • TCP New Reno • TCP Vegas • TCP Hybla • BIC / CUBIC • Compound TCP • many others exist (SACK, Veno, Westwood, XCP, Ye. AH-TCP, . . . ) 14 -740: Spring 2018 8

TCP New Reno • Improves fast recovery retransmissions • RFC 2582 & 3782 • Good at filling multiple holes, but still probing for higher throughput • Substantially outperforms Reno at high error rates 14 -740: Spring 2018 9

New Reno: Improve Fast Recovery • In Fast Recovery: • Record highest un. ACKed# already sent • Return to cong-avoid when this segment is ACKed • For each Dup ACK, new segment sent • keeps transmit window full • For each good ACK -- hole assumed • retransmit segment just past the ACK seqnum 14 -740: Spring 2018 10

TCP Vegas • First delay-based TCP variant • Look for variations in RTT as indication of queue length at routers (i. e. oncoming congestion) • If lower than expected RTT, send more • If higher, send less (by lowering Cong. Win) • Congestion prevention strategy 14 -740: Spring 2018 11

TCP Vegas (2) • Goal: keep a certain amount of data in the queues of the network • Much smoother flow than Reno • Achieves higher average throughput • But, when competing with Reno, only gets ~2/3 of Reno's bandwidth • Backs off before congestion, while Reno backs off only after congestion Source: Brakmo 94 and La 99. Available on course website 14 -740: Spring 2018 12

TCP Hybla • Goal: improve high-latency / high-error rate links (i. e. satellite) • Much longer RTT • Segments dropped due to bit-error look like congestion • Analytically evaluates Cong. Win dynamics • Rather than measuring RTT • Included in Linux from 2. 6. 13 Source: Caini 2004. Available on course website 14 -740: Spring 2018 13

The LFN Problem • Long Fat Networks: High-speed and highlatency • Many, many segments will be in-flight • How many? • Bandwidth-Delay Product 14 -740: Spring 2018 15

The LFN Problem • How many? • Ex: 10 Gbps, 100 ms RTT, MSS 1500 B • Cong. Win = 83, 333 segments • Needs loss event < every 1 hr 40 min • Else never gets out of Slow Start • i. e. 1 event per 5 billion segments • Bit Error Rate of 2 x 10 -14 ← unrealistic 14 -740: Spring 2018 Example from RFC 3649 written in Dec 2003 16

LFN Approaches • Get lots of segments in flight: • Start really quickly • Can we do better than Slow Start? • Be more aggressive in Cong. Avoid • AIMD approach of adding 1 won’t cut it 14 -740: Spring 2018 17

Binary Increase Congestion • TCP-BIC uses a binary search to probe for additional bandwidth • Default for Linux 2. 6. 8 -2. 6. 18 • Replaced with Cubic, a fairer alternative 14 -740: Spring 2018 18

Binary Search • Target Cong. Win is halfway between max and min, 2 control variables • If Cong. Win grows to target, set min to current, recalculate target • If loss happens, set max to current, min to recovery point, recalculate target

BIC: Fairness • An overly aggressive algorithm will rob bandwidth from normal TCP algorithms • BIC incorporates fairness idea • Binary search means less aggressive when near bandwidth maximums • Also includes a plateau period to allow TCP flows to get out of slow start 14 -740: Spring 2018 20

What is Congestion? • Loss-based: Look for 3 dup ACK, timeout • Used in Reno, HSTCP, TCP-BIC • Delay-based: Look for variations in RTT, estimate queue lengths at routers • Used in TCP Vegas, FAST TCP 14 -740: Spring 2018 21

Compound TCP • A hybrid of loss-based and delay-based algorithms • More aggressively seeks for additional bandwidth when no evidence of congestion • Attempts to be especially fair to other protocols • Used since Microsoft Vista 14 -740: Spring 2018 22

CTCP Mechanisms • Key idea is to use a normal congestion window, combined with a delay-based congestion window Total. Cong. Wind = cwind + dwind • cwind updated normally (AIMD) in Cong. Avoid • No loss cwind = cwind + 1 MSS per RTT* • Loss (timeout, 3 dup ACK) cwind = cwind / 2 14 -740: Spring 2018 *Actually, a small adjustment as Total. Cong. Wind should grow by 1 MSS per RTT 23

Delay Window • If network bandwidth is underutilized (based on RTT observations) dwind(t+1) = dwind(t) + α dwind(t)k • If some queueing happening dwind(t+1) = dwind(t)-queue length* • If there is a loss dwind(t+1) = (1 - β) dwind(t) • α, β, k are tuning parameters for scalability, smoothness and responsiveness 14 -740: Spring 2018 *Yes, there’s a complicated way of predicting what the queue lengths might be. Let’s skip it. . . 24

Fairness • TCP flows compete for additional bandwidth • If one flow is too aggressive, it will cause segment loss in other flows (and perhaps itself) • Segment loss will cause other flows to retreat • Which may provide additional bandwidth to the aggressor • Especially problematic with delay-based • Sense congestion earlier than a loss 14 -740: Spring 2018 25

CTCP Fairness • When no congestion sensed, full-speed ahead! • When congestion first sensed (via delay measurements) stop seeking more BW • When loss occurs, back off normally 14 -740: Spring 2018 26

traceroute • Advanced TCP Congestion Control • Lots of algorithmic variants • Long Fat Networks problem • TCP-BIC • Compound TCP • Prisoner’s Dilemma 14 -740: Spring 2018 27

Prisoner’s Dilemma • Game-theory problem with interesting implications for networks • “Classic Form” • 2 conspirators arrested • Separately offered a “deal” • Each must choose to betray or remain silent B stays silent A betrays B betrays 6 months each A: 10 years B: Goes free A: Goes free B: 10 years 5 years each

Why Discuss? • TCP restrictions are self-imposed • Nothing in the network checks that sender is actually following the algorithms • Bad behavior can have short-term advantages • Examples?

Silly Window Syndrome • A serious problem can arise in the sliding window operation when the sending application program creates data slowly, the receiving application program consumes data slowly, or both. • If a server with this problem is unable to process all incoming data, it requests that its clients reduce the amount of data they send at a time (the window setting on a TCP packet). • If the server continues to be unable to process all incoming data, the window becomes smaller and smaller, sometimes to the point that the data transmitted is smaller than the packet header, making data transmission extremely inefficient. • • The name of this problem is due to the window size shrinking to a "silly" value. Since there is a certain amount of overhead associated with processing each packet, the increased number of packets means increased overhead to process a decreasing amount of data. The end result is thrashing. -- https: //en. wikipedia. org/wiki/Silly_window_syndrome 14 -740: Spring 2018 30

Silly Window Syndrome There are 3 causes of SWS: • When the server announces empty space as 0 • When client is able to generate only 1 byte at a time • When server is able to consume only 1 byte at a time • During SWS, efficiency of communication is almost 0, so SWS duration should be short as possible -- https: //en. wikipedia. org/wiki/Silly_window_syndrome 14 -740: Spring 2018

Silly Window Syndrome • When there is no synchronization between the sender and receiver regarding capacity of the flow of data or the size of the packet, the window syndrome problem is created. • When the silly window syndrome is created by the sender, Nagle's algorithm is used. • • Nagle's solution requires that the sender send the first segment even if it is a small one, then that it wait until an ACK is received or a maximum sized segment (MSS) is accumulated. When the silly window syndrome is created by the receiver, David D Clark's solution is used. • Clark's solution closes the window until another segment of maximum segment size (MSS) can be received or the buffer is half empty. -- https: //en. wikipedia. org/wiki/Silly_window_syndrome 14 -740: Spring 2018

Nagle’s Algorithm • Interacts badly with TCP delayed acknowledgments. • With both algorithms enabled, applications that do two successive writes to a TCP connection, followed by a read that will not be fulfilled until after the data from the second write has reached the destination, experience a constant delay of up to 500 milliseconds, the "ACK delay". • For this reason, TCP implementations usually provide applications with an interface to disable the Nagle algorithm. This is typically called the TCP_NODELAY option. • A solution recommended by Nagle is to avoid the algorithm sending premature packets by buffering up application writes and then flushing the buffer. • The user-level solution is to avoid write-read sequences on sockets. write-read is fine. write-write is fine. But write-read is a killer. So, if you can, buffer up your little writes to TCP and send them all at once. Using the standard UNIX I/O package and flushing write before each read usually works. -- https: //en. wikipedia. org/wiki/Nagle%27 s_algorithm 14 -740: Spring 2018

Lesson Objectives • Now, you should be able to: • describe features of the following TCP congestion control variations: New Reno, Vegas, Hybla, BIC and Compound TCP • describe the advantages and disadvantages of delay-based variants 14 -740: Spring 2018 34

• You should be able to: • describe the challenges of congestion control for LFNs • describe the problems and attractions of a non-cooperative TCP implementation • Describe Silly Window Syndrome and how TCP addresses it.