
Part 2 Transport Layer: TCP Flow Control, Congestion Control, Connection Management, etc.

Encapsulation in TCP/IP
[Figure: encapsulation inside an IP datagram]

TCP: Overview
Mechanisms: error detection, retransmission, cumulative ACKs, timers, header fields for sequence and ACK numbers.
  • point-to-point: one sender, one receiver
  • reliable, in-order byte stream: no message boundaries
  • pipelined: TCP congestion and flow control set the window size
  • send and receive buffers
  • full-duplex data: bi-directional application data flow in the same connection
  • MSS: maximum segment size
  • connection-oriented: handshaking (exchange of control messages) initialises sender and receiver state before data exchange
  • flow controlled: the sender will not "flood" the receiver with data
[Figure: the application writes data through the socket door into the TCP send buffer; segments travel to the TCP receive buffer, from which the receiving application reads data.]

Recall: Reliable Data Transfer Mechanisms
  • Checksum: verification of the integrity of a packet
  • Timer: signals that a re-transmission is required
  • Sequence number: keeps track of which packets have been sent and received
  • ACK, NAK: indicate receipt of a packet in good or bad form
  • Window, pipelining: allows the sending of multiple yet-to-be-acknowledged packets
[Figure: application data passes through the socket door into the TCP send buffer and travels as packets to the TCP receive buffer.]

Internet Checksum Example
Note: when adding numbers, a carryout from the most significant bit needs to be added to the result.
Example: add two 16-bit integers.
[Figure: column addition of the two 16-bit data words, showing the wraparound carry, the resulting sum, and the checksum.]
To check: adding the data words and the checksum gives all 1s.
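The wraparound rule can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the two 16-bit data words are assumed values chosen for the example, not the digits from the slide.

```python
def internet_checksum(words):
    """Return the 16-bit Internet checksum of a list of 16-bit integers."""
    total = 0
    for word in words:
        total += word & 0xFFFF
        # Wrap any carry out of the most significant bit back into the sum.
        total = (total & 0xFFFF) + (total >> 16)
    # The checksum is the ones' complement of the wrapped sum.
    return ~total & 0xFFFF


def verify(words, checksum):
    """Receiver-side check: data words plus checksum must sum to all 1s."""
    total = 0
    for word in list(words) + [checksum]:
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF


# Assumed example words (not taken from the slide).
data = [0b1110011001100110, 0b1101010101010101]
checksum = internet_checksum(data)
print(f"{checksum:016b}")
print(verify(data, checksum))
```

Flipping any single bit of the data or the checksum makes `verify` return False, which is how the receiver detects corruption.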

Connection Oriented Transport: TCP
  • Segment Structure
  • SEQ and ACK numbers
  • Calculating the Timeout Interval
  • The Simplified TCP Sender
  • ACK Generation Recommendation (RFC 1122, RFC 2581)
  • Interesting Transmission Scenarios
  • Flow Control
  • TCP Connection Management

TCP segment structure
Header fields (each row of the header diagram is 32 bits wide):
  • source port #, dest. port #
  • sequence number and acknowledgement number: counting by bytes of data (not segments!)
  • header length, unused bits, flag bits U A P R S F
  • receiver window size: # bytes the receiver is willing to accept
  • checksum: Internet checksum (as in UDP)
  • URGent data pointer
  • options (variable length)
  • application data (variable length)
Flag bits:
  • URG: urgent data (generally not used)
  • ACK: ACK # valid
  • PSH: push data now (generally not used)
  • RST, SYN, FIN: connection establishment (setup and teardown commands)
In practice, PSH, URG, and the Urgent Data Pointer are not used.
We can view these small details using Ethereal.
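As a rough illustration of the layout above, the fixed 20-byte part of the header can be decoded with Python's standard `struct` module. The function name and the sample segment are assumptions made for this sketch; options and payload are ignored.

```python
import struct


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are ignored)."""
    (src_port, dst_port, seq, ack, offset_byte, flag_byte,
     window, checksum, urgent_ptr) = struct.unpack("!HHIIBBHHH", segment[:20])
    flag_bits = (("URG", 0x20), ("ACK", 0x10), ("PSH", 0x08),
                 ("RST", 0x04), ("SYN", 0x02), ("FIN", 0x01))
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        # The header length field counts 32-bit words.
        "header_len": (offset_byte >> 4) * 4,
        "flags": [name for name, bit in flag_bits if flag_byte & bit],
        "window": window,
        "checksum": checksum,
        "urgent_ptr": urgent_ptr,
    }


# Hypothetical SYN segment: ports, sequence number and window are made up.
raw = struct.pack("!HHIIBBHHH", 50000, 80, 42, 0, 5 << 4, 0x02, 65535, 0, 0)
header = parse_tcp_header(raw)
print(header["flags"], header["header_len"])
```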

Example
Suppose that a process in Host A wants to send a stream of data to a process in Host B over a TCP connection.
Assume:
  • Data stream: file consisting of 500,000 bytes
  • MSS: 1,000 bytes
  • First byte of data stream: numbered as 0
TCP constructs 500 segments out of the data stream: 500,000 bytes / 1,000 bytes = 500 segments.

TCP sequence #'s and ACKs
Segment 1 carries bytes 0, 1, 2, ..., 999; Segment 2 carries bytes 1000, 1001, ..., 1999.
Sequence numbers (#'s):
  • byte-stream "number" of the first byte in the segment's data
  • do not necessarily start from 0; a random initial number R is used
  • Segment 1: 0 + R
  • Segment 2: 1000 + R, etc.
ACKs (acknowledgments):
  • Seq # of the next byte expected from the other side (last byte + 1)
  • cumulative ACK
  • if segment 1 has been received, the receiver waits for segment 2, e.g. ACK = 1000 + R (received up to the 999th byte)
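The numbering rule can be checked with a short Python sketch. It uses the 500,000-byte file and 1,000-byte MSS from the earlier example; the helper name and the sample value chosen for R are assumptions for illustration.

```python
def segment_sequence_numbers(file_size, mss, isn=0):
    """Yield (seq, length, expected_ack) for each segment of a byte stream."""
    offset = 0
    while offset < file_size:
        length = min(mss, file_size - offset)
        seq = (isn + offset) % 2**32          # sequence numbers are 32 bits wide
        yield seq, length, (seq + length) % 2**32
        offset += length


segments = list(segment_sequence_numbers(500_000, 1_000))
print(len(segments))
print(segments[0], segments[1])

# With a hypothetical initial sequence number R = 5000.
shifted = list(segment_sequence_numbers(500_000, 1_000, isn=5000))
print(shifted[0])
```

The third element of each tuple is the cumulative ACK the receiver would send once that segment and everything before it has arrived.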

TCP sequence #'s and ACKs: simple telnet scenario (with echo on)
Host A is the client and Host B is the server. Assume that the starting sequence numbers for Host A and Host B are 42 and 79 respectively.
  1. User types 'C'. Host A sends Seq=42, ACK=79, data = 'C' ("I'm sending data starting at seq. num=42").
  2. Host B ACKs receipt of 'C' and echoes back 'C': Seq=79, ACK=43, data = 'C' ("Send me the bytes from 43 onward"). The ACK is piggy-backed on server-to-client data.
  3. Host A ACKs receipt of the echoed 'C': Seq=43, ACK=80.
Q: how does the receiver handle out-of-order segments?
A: the TCP spec does not say; it is decided when implementing.

Yet another server echo example
  1. User types 'Hello'. Host A sends Seq=42, ACK=79, data = 'Hello'.
  2. Host B ACKs receipt of 'Hello' and echoes back 'Hello': Seq=79, ACK=47, data = 'Hello'.
  3. Host A ACKs receipt of the echoed 'Hello' and sends something else: Seq=47, ACK=84, data = '200'.
  4. Host B replies: Seq=84, ACK=50, data = '200'.
Summary: Host A sends seq=42 ack=79, then seq=47 ack=84; Host B sends seq=79 ack=47, then seq=84 ack=50.
The ACK tells up to what byte has been received and which starting byte the host expects to receive next.

TCP Round Trip Time and Timeout
Main issue: how long is the sender willing to wait before re-transmitting the packet?
Q: how to set the TCP timeout value?
  • longer than RTT (RTT = round trip time); note: RTT will vary
  • too short: premature timeout, unnecessary retransmissions
  • too long: slow reaction to segment loss
Q: how to estimate RTT?
  • SampleRTT: measured time from segment transmission until ACK receipt
  • ignore retransmissions and cumulatively ACKed segments
  • SampleRTT will vary; we want the estimated RTT to be "smoother"
  • use several recent measurements, not just the current SampleRTT

TCP Round Trip Time and Timeout
EstimatedRTT = (1 - x) * EstimatedRTT + x * SampleRTT
  • exponential weighted moving average
  • influence of a given sample decreases exponentially fast
  • typical value of x: 0.125 (RFC 2988)
Setting the timeout
  • EstimatedRTT plus a "safety margin"
  • large variation in EstimatedRTT -> larger safety margin
Deviation = (1 - x) * Deviation + x * |SampleRTT - EstimatedRTT|
  • recommended value of x here: 0.25 (a separate weight from the x used for EstimatedRTT)
Timeout = EstimatedRTT + (4 * Deviation)
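A minimal Python sketch of the three formulas follows. The class name, the starting deviation of zero, and the choice to update the deviation before the estimate are assumptions of this sketch; real TCP stacks differ in these details.

```python
class RttEstimator:
    """Tracks EstimatedRTT and Deviation and derives the timeout from them."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha            # weight x for EstimatedRTT
        self.beta = beta              # weight x for Deviation
        self.estimated_rtt = first_sample
        self.deviation = 0.0          # assumption: no deviation known yet

    def update(self, sample_rtt):
        # Assumed ordering: the deviation uses the estimate from before this sample.
        self.deviation = ((1 - self.beta) * self.deviation
                          + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout()

    def timeout(self):
        return self.estimated_rtt + 4 * self.deviation


estimator = RttEstimator(first_sample=0.100)   # hypothetical 100 ms first sample
new_timeout = estimator.update(0.200)          # hypothetical 200 ms second sample
print(estimator.estimated_rtt, estimator.deviation, new_timeout)
```

A single slow sample moves the estimate only slightly but widens the safety margin noticeably, which is the intended effect of the 4 * Deviation term.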

Sample Calculations
EstimatedRTT = 0.875 * EstimatedRTT + 0.125 * SampleRTT
EstimatedRTT after the receipt of the ACK of each segment:
  • Segment 1: EstimatedRTT = RTT for Segment 1 = 0.02746 second
  • Segment 2: EstimatedRTT = 0.875 * 0.02746 + 0.125 * 0.035557 = 0.0285
  • Segment 3: EstimatedRTT = 0.875 * 0.0285 + 0.125 * 0.070059 = 0.0337
  • Segment 4: EstimatedRTT = 0.875 * 0.0337 + 0.125 * 0.11443 = 0.0438
  • Segment 5: EstimatedRTT = 0.875 * 0.0438 + 0.125 * 0.13989 = 0.0558
  • Segment 6: EstimatedRTT = 0.875 * 0.0558 + 0.125 * 0.18964 = 0.0725
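The arithmetic above can be reproduced with a short loop. The sample values are the six SampleRTT figures from this slide; rounding to four decimals after every step is an assumption made to mirror how the slide carries each result forward.

```python
samples = [0.02746, 0.035557, 0.070059, 0.11443, 0.13989, 0.18964]


def running_estimates(samples, x=0.125, digits=4):
    """Return EstimatedRTT after each ACK, rounding at every step."""
    estimate = samples[0]           # the first estimate is the first sample itself
    estimates = [estimate]
    for sample in samples[1:]:
        estimate = round((1 - x) * estimate + x * sample, digits)
        estimates.append(estimate)
    return estimates


estimates = running_estimates(samples)
for number, value in enumerate(estimates, start=1):
    print(f"segment {number}: {value}")
```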

RTT Samples and RTT estimates
[Figure: SampleRTT and EstimatedRTT plotted as RTT (msec., 100 to 300) against time.]
The variations in the SampleRTT are smoothed out in the computation of the EstimatedRTT.

An Actual RTT estimation:
[Figure]

FSM of TCP for Reliable Data Transfer
Simplified TCP sender, assuming:
  • one-way data transfer
  • no flow control, no congestion control
The sender waits for an event and then acts on it:
  • event: data received from application above -> create and send segment
  • event: timer timeout for segment with seq. number y -> retransmit segment
  • event: ACK received, with ACK number y -> process ACK

SIMPLIFIED TCP SENDER
Assumptions:
  • sender is not constrained by TCP flow or congestion control
  • data from above is less than MSS in size
  • data transfer is in one direction only
The timer is associated with the oldest unACKed segment.

sendbase = initial_sequence_number
nextseqnum = initial_sequence_number

loop (forever) {
    switch (event) {
    event: data received from application above
        create TCP segment with sequence number nextseqnum
        if (timer is currently not running)
            start timer for segment nextseqnum
        pass segment to IP
        nextseqnum = nextseqnum + length(data)
    event: timer timeout
        retransmit not-yet-ACKed segment with smallest seq. #
        start timer
    event: ACK received, with ACK field value of y
        if (y > sendbase) {   /* cumulative ACK of all data up to y */
            sendbase = y
            if (there are currently any not-yet-ACKed segments)
                start timer
        }
    }
} /* end of loop forever */

TCP with MODIFICATIONS
Why wait for the timeout to expire, when consecutive duplicate ACKs can be used to indicate a lost segment?
SENDER with Fast Retransmit:

sendbase = initial_sequence_number
nextseqnum = initial_sequence_number

loop (forever) {
    switch (event) {
    event: data received from application above
        create TCP segment with sequence number nextseqnum
        start timer for segment nextseqnum
        pass segment to IP
        nextseqnum = nextseqnum + length(data)
    event: timer timeout for segment with sequence number y
        retransmit segment with sequence number y
        compute new timeout interval for segment y
        restart timer for sequence number y
    event: ACK received, with ACK field value of y
        if (y > sendbase) {   /* cumulative ACK of all data up to y */
            cancel all timers for segments with sequence numbers < y
            sendbase = y
        }
        else {   /* a duplicate ACK for already ACKed segment */
            increment number of duplicate ACKs received for y
            if (number of duplicate ACKs received for y is 3) {
                /* perform TCP fast retransmission */
                resend segment with sequence number y
                restart timer for segment y
            }
        }
    }
} /* end of loop forever */
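The ACK-handling branch of the pseudocode can be turned into a small runnable Python sketch. Timers and the network are left out, ACKs are assumed to fall on segment boundaries, and the class and attribute names are hypothetical; the point is only the duplicate-ACK counting.

```python
class FastRetransmitSender:
    """ACK bookkeeping of a TCP-like sender (no timers, no real network)."""

    def __init__(self, initial_seq):
        self.send_base = initial_seq
        self.next_seq = initial_seq
        self.unacked = {}            # seq number -> payload still awaiting an ACK
        self.dup_acks = 0
        self.retransmitted = []      # log of sequence numbers that were resent

    def send(self, data: bytes):
        seq = self.next_seq
        self.unacked[seq] = data
        self.next_seq += len(data)
        return seq

    def on_ack(self, y):
        if y > self.send_base:
            # Cumulative ACK: everything below y has been received.
            for seq in [s for s in self.unacked if s < y]:
                del self.unacked[seq]
            self.send_base = y
            self.dup_acks = 0
        elif y in self.unacked:
            # Duplicate ACK for data that was already acknowledged.
            self.dup_acks += 1
            if self.dup_acks == 3:
                self.retransmitted.append(y)   # fast retransmit of segment y


sender = FastRetransmitSender(initial_seq=0)
for _ in range(5):
    sender.send(b"x" * 100)          # segments at seq 0, 100, 200, 300, 400

sender.on_ack(100)                   # segment 0 arrived
for _ in range(3):
    sender.on_ack(100)               # segment 100 was lost: three duplicate ACKs
print(sender.retransmitted)
sender.on_ack(500)                   # retransmission filled the gap
print(sender.send_base, sender.unacked)
```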

TCP ACK generation [RFC 1122, RFC 2581]
The receiver does not discard out-of-order segments.
Event -> TCP receiver action:
  1. In-order segment arrival, no gaps, everything else already ACKed -> delay sending the ACK. Wait up to 500 ms for the next segment. If the next segment does not arrive in this interval, send the ACK.
  2. In-order segment arrival, no gaps, one delayed ACK pending (due to action 1) -> immediately send a single cumulative ACK.
  3. Out-of-order segment arrival with a higher-than-expected seq. # (a gap is detected) -> send a duplicate ACK, indicating the seq. # of the next expected byte.
  4. Arrival of a segment that partially or completely fills the gap -> immediately send an ACK if the segment starts at the lower end of the gap.

TCP: Interesting Scenarios (simplified TCP version)
Lost ACK scenario (retransmission due to lost ACK):
  1. Host A sends Seq=92, 8 bytes data.
  2. Host B replies with ACK=100, which is lost.
  3. The timer for Seq=92 times out; Host A retransmits Seq=92, 8 bytes data.
  4. Host B replies with ACK=100.
Premature timeout, cumulative ACKs:
  1. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data.
  2. Host B replies with ACK=100 and ACK=120, but the timer for Seq=92 expires before they arrive.
  3. Host A retransmits Seq=92, 8 bytes data; the timer is restarted here for Seq=92.
  4. Host B replies with ACK=120.
  5. The segment with Seq=100 is not retransmitted.

TCP: Retransmission Scenario
  1. Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data.
  2. Host B replies with ACK=100, which is lost, and then ACK=120, which arrives before the Seq=92 timeout.
Cumulative ACK avoids retransmission of the first segment.

TCP Modifications: Doubling the Timeout Interval
Provides a limited form of congestion control.
  • Timer expiration is more likely caused by congestion in the network.
  • Congestion may get worse if sources continue to retransmit packets persistently.
  • After a timeout: TimeoutInterval = 2 * TimeoutIntervalPrevious
  • TCP acts more politely by increasing the TimeoutInterval, causing the sender to retransmit after longer and longer intervals.
  • After an ACK is received, TimeoutInterval is derived from the most recent EstimatedRTT and DevRTT.
Others: check RFC 2018 (selective ACK).

TCP Flow Control
Flow control: the sender won't overrun the receiver's buffer by transmitting too much, too fast.
  • RcvBuffer = size of the TCP receive buffer
  • RcvWindow = amount of spare room in the buffer
  • receiver: explicitly informs the sender of the (dynamically changing) amount of free buffer space, using the RcvWindow field in the TCP segment
  • sender: keeps the amount of transmitted, unACKed data less than the most recently received RcvWindow

FLOW CONTROL: Receiver
EXAMPLE: HOST A sends a large file to HOST B.
RECEIVER: HOST B uses RcvWindow, LastByteRcvd, LastByteRead.
  • Data arrives from IP into the receive buffer (up to LastByteRcvd); the application process reads from the buffer (up to LastByteRead).
  • RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
  • Initially, RcvWindow = RcvBuffer.
  • HOST B tells HOST A how much spare room it has in the connection buffer by placing its current value of RcvWindow in the receive window field of every segment it sends to HOST A.
[Figure: receive buffer of HOST B, with byte positions from 0 to 100 marked.]

FLOW CONTROL: Sender
EXAMPLE: HOST A sends a large file to HOST B.
SENDER: HOST A uses the RcvWindow of HOST B, LastByteSent, LastByteACKed.
To ensure that HOST B's receive buffer does not overflow, HOST A maintains throughout the connection's life that
[LastByteSent - LastByteACKed] <= RcvWindow
[Figure: send buffer of HOST A, with data going out up to LastByteSent and ACKs returning from HOST B.]
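Both sides of the flow-control bookkeeping fit in a few lines of Python. The function names and all byte counts below are hypothetical values picked for the illustration.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver side: spare room left in the receive buffer."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sendable_bytes(last_byte_sent, last_byte_acked, window):
    """Sender side: how many more bytes may be sent without overflowing the receiver."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, window - in_flight)


# Hypothetical receiver state: 100-byte buffer, 60 bytes received, 40 read by the application.
window = rcv_window(rcv_buffer=100, last_byte_rcvd=60, last_byte_read=40)
print(window)

# Hypothetical sender state: 50 bytes sent so far, 40 of them already ACKed.
print(sendable_bytes(last_byte_sent=50, last_byte_acked=40, window=window))
```

When the application at the receiver stops reading, `rcv_window` shrinks towards 0 and `sendable_bytes` follows it, which is the throttling the slide describes.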

FLOW CONTROL
Some issues to consider:
  • RcvWindow is used by the connection to provide the flow control service.
  • What happens when the receive buffer of HOST B is full (that is, when RcvWindow = 0)?
  • TCP sends a segment only when there is data or an ACK to send. If HOST B has nothing to send, HOST A would never learn that buffer space has opened up. Therefore, the sender must keep the connection "alive".
  • TCP requires that HOST A continue to send segments with one data byte when HOST B's receive window is 0. Such segments will be ACKed by HOST B.
  • Eventually, the buffer will have some space and the ACKs will contain RcvWindow > 0.

TCP Connection Management
Recall: TCP sender and receiver establish a "connection" before exchanging data segments.
Initialize TCP variables:
  • sequence numbers
  • buffers, flow control info (e.g. RcvWindow)
Client is the connection initiator (connect):
    if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) != 0) {
        printf("connect failed\n");
        WSACleanup();
        exit(1);
    }
  In Java: Socket clientSocket = new Socket("hostname", portNumber);
Server is contacted by the client (accept):
    ns = accept(s, (struct sockaddr *)(&remoteaddr), &addrlen);
  In Java: Socket accept();

TCP Connection Management
Establishing a connection: three-way handshake. This is what happens when we create a socket for connection to a server.
  Client -> Server: Connect (SYN=1, seq=client_isn)
  Server -> Client: Accept (SYN=1, seq=server_isn, ack=client_isn+1)
  Client -> Server: ACK (SYN=0, seq=client_isn+1, ack=server_isn+1)
Step 1: client end system sends TCP SYN control segment to server (executed by TCP itself)
  • specifies initial seq. number (isn)
Step 2: server end system receives SYN, replies with SYNACK control segment
  • ACKs received SYN
  • allocates buffers
  • specifies server's initial seq. number
Step 3: client ACKs the connection with ACK=server_isn+1
  • allocates buffers
  • sends SYN=0
Connection established! After establishing the connection, the client can receive segments with app-generated data (SYN=0).
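The three segments of the handshake can be written out as plain data to see how the numbers relate. This Python sketch only models the header fields named above; the function name and the two initial sequence numbers are assumptions.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as (sender, fields) pairs."""
    syn = {"SYN": 1, "seq": client_isn}
    synack = {"SYN": 1, "seq": server_isn, "ack": client_isn + 1}
    ack = {"SYN": 0, "seq": client_isn + 1, "ack": server_isn + 1}
    return [("client", syn), ("server", synack), ("client", ack)]


# Hypothetical initial sequence numbers.
exchange = three_way_handshake(client_isn=42, server_isn=79)
for sender, fields in exchange:
    print(sender, fields)
```

Each side acknowledges the other's initial sequence number plus one, so the SYN itself consumes one sequence number even though it carries no application data.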

TCP Connection Management (cont.)
How a TCP connection is established and torn down.
Closing a connection: client closes socket:
    closesocket(s);
    Java: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server.
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
[Figure: timeline of client and server exchanging FIN, ACK, FIN, ACK; the client goes through a timed wait before it is closed.]

TCP Connection Management (cont.)
Step 3: client receives FIN, replies with ACK.
  • enters "timed wait": will respond with ACK to received FINs
Step 4: server receives ACK. Connection closed.
Note: with small modification, can handle simultaneous FINs.
[Figure: timeline of client and server exchanging FIN, ACK, FIN, ACK; the client goes through a timed wait before it is closed.]

TCP Connection Management (cont.)
[Figure: TCP client lifecycle and TCP server lifecycle.]
  • Timed wait: used in case the ACK gets lost. It is implementation-dependent (e.g. 30 seconds, 1 minute, 2 minutes).
  • Connection formally closes: all resources (e.g. port numbers) are released.

End of Flow Control and Error Control

Flow Control vs. Congestion Control
Similar actions are taken, but for very different reasons.
Flow Control
  • point-to-point traffic between sender and receiver
  • speed-matching service, matching the rate at which the sender is sending against the rate at which the receiving application is reading
  • prevents the receiver buffer from overflowing
Congestion: happens when there are too many sources attempting to send data at too high a rate for the routers along the path.
Congestion Control
  • service that makes sure that the routers between end systems are able to carry the offered traffic
  • prevents routers from overflowing
Same course of action: throttling of the sender.

Principles of Congestion Control
Congestion:
  • informally: "too many sources sending too much data too fast for the network to handle"
  • different from flow control!
  • manifestations: lost packets (buffer overflow at routers), long delays (queuing in router buffers)
  • a top-10 problem!

Approaches towards congestion control
Two broad approaches towards congestion control:
  1. End-to-end congestion control:
     • no explicit feedback from the network
     • congestion inferred by end systems from observed packet loss and delay
     • approach taken by TCP
  2. Network-assisted congestion control: routers provide feedback to end systems in the form of:
     • a single bit indicating link congestion (SNA, DECbit, TCP/IP ECN, ATM ABR)
     • an explicit transmission rate the sender should send at

TCP Congestion Control
How does the TCP sender limit the rate at which it sends traffic into its connection?
SENDER: new variable, the Congestion Window (CongWin).
  (amount of unACKed data at sender) = LastByteSent - LastByteACKed <= min(CongWin, RcvWindow)
  • This indirectly limits the sender's send rate.
  • By adjusting CongWin, the sender can therefore adjust the rate at which it sends data into its connection.
Assumptions:
  • TCP receive buffer is very large: no RcvWindow constraint, so the amount of unACKed data at the sender is solely limited by CongWin
  • packet loss and packet transmission delay are negligible
Sending rate: (approx.) CongWin / RTT

TCP Congestion Control
TCP uses ACKs to trigger ("clock") its increase in congestion window size: "self-clocking".
Arrival of ACKs is an indication to the sender that all is well.
  1. ACKs arrive at a slow rate: the congestion window will be increased at a relatively slow rate
  2. ACKs arrive at a high rate: the congestion window will be increased more quickly

TCP Congestion Control
How does TCP perceive that there is congestion on the path?
"Loss event": when there is excessive congestion, router buffers along the path overflow, causing datagrams to be dropped, which in turn results in a "loss event" at the sender.
  1. Timeout: no ACK is received after segment loss
  2. Receipt of three duplicate ACKs: segment loss is followed by three duplicate ACKs received at the sender

TCP Congestion Control: details
  • sender limits transmission: LastByteSent - LastByteAcked <= cwnd
  • roughly, rate = cwnd / RTT bytes/sec
  • cwnd is dynamic, a function of perceived network congestion
How does the sender perceive congestion?
  • loss event = timeout or 3 duplicate ACKs
  • TCP sender reduces rate (cwnd) after a loss event
Three mechanisms:
  1. AIMD
  2. slow start
  3. conservative after timeout events

TCP congestion avoidance: additive increase, multiplicative decrease
Approach: increase transmission rate (window size), probing for usable bandwidth, until loss occurs.
  • additive increase: increase cwnd by 1 MSS every RTT until loss is detected
  • multiplicative decrease: cut cwnd in half after loss
[Figure: cwnd (congestion window size) against time, showing saw-tooth behavior: probing for bandwidth.]

TCP Slow Start
  • when the connection begins, initially cwnd = 1 MSS
  • increase rate exponentially until the first loss event: double cwnd every RTT, done by incrementing cwnd by 1 MSS for every ACK received
[Figure: Host A sends one segment, then two segments, then four segments to Host B in successive RTTs.]
Summary: initial rate is slow but ramps up exponentially fast (doubling of the sending rate every RTT).

Refinement: inferring loss
  • after 3 dup ACKs: cwnd is cut in half; window then grows linearly
  • but after a timeout event: cwnd is set to 1 MSS; window then grows exponentially up to a threshold, then grows linearly
Philosophy:
  • 3 dup ACKs indicate the network is capable of delivering some segments
  • a timeout indicates a "more alarming" congestion scenario

Refinement
Q: when should the exponential increase switch to linear?
A: when cwnd gets to 1/2 of its value before timeout.
Implementation:
  • variable ssthresh (slow-start threshold)
  • on a loss event, ssthresh is set to 1/2 of cwnd just before the loss event

TCP Sender Congestion Control
Each entry lists: State | Event | TCP sender congestion-control action | Commentary
  • Slow Start (SS) | ACK receipt for previously unACKed data | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT
  • Congestion Avoidance (CA) | ACK receipt for previously unACKed data | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in an increase of CongWin by 1 MSS every RTT
  • SS or CA | Loss event detected by triple duplicate ACK | Threshold = CongWin / 2; CongWin = Threshold; set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.
  • SS or CA | Timeout | Threshold = CongWin / 2; CongWin = 1 MSS; set state to "Slow Start" | Enter Slow Start
  • SS or CA | Duplicate ACK | Increment duplicate ACK count for segment being ACKed | CongWin and Threshold not changed
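The table translates fairly directly into a small state machine. The Python sketch below follows the table's simplified rules (it does not model a separate fast-recovery state); the class name, the byte values, and the reset of the duplicate-ACK counter after a triple duplicate are assumptions of this sketch.

```python
class CongestionControl:
    """Simplified TCP sender congestion control following the table's rules."""

    def __init__(self, mss, threshold):
        self.mss = mss
        self.cong_win = mss          # start with a window of 1 MSS
        self.threshold = threshold
        self.state = "SS"            # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: about 1 MSS per RTT.
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_duplicate_ack(self):
        self.dup_acks += 1
        if self.dup_acks == 3:
            # Multiplicative decrease, never below 1 MSS.
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, self.mss)
            self.state = "CA"
            self.dup_acks = 0        # assumption: start counting afresh

    def on_timeout(self):
        self.dup_acks = 0
        self.threshold = self.cong_win / 2
        self.cong_win = self.mss
        self.state = "SS"


cc = CongestionControl(mss=1000, threshold=4000)   # hypothetical sizes in bytes
for _ in range(4):
    cc.on_new_ack()              # slow start: 2000, 3000, 4000, 5000
print(cc.state, cc.cong_win)
cc.on_new_ack()                  # congestion avoidance: small additive step
print(cc.cong_win)
for _ in range(3):
    cc.on_duplicate_ack()        # triple duplicate ACK halves the window
print(cc.state, cc.cong_win, cc.threshold)
cc.on_timeout()                  # timeout drops back to 1 MSS and slow start
print(cc.state, cc.cong_win, cc.threshold)
```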

Summary: TCP Congestion Control
Initialisation: cwnd = 1 MSS, ssthresh = 64 KB, dupACKcount = 0; enter slow start.
Slow start:
  • duplicate ACK: dupACKcount++
  • new ACK: cwnd = cwnd + MSS, dupACKcount = 0, transmit new segment(s), as allowed
  • cwnd > ssthresh: go to congestion avoidance
  • timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment; stay in slow start
  • dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3 MSS, retransmit missing segment; go to fast recovery
Congestion avoidance:
  • new ACK: cwnd = cwnd + MSS * (MSS/cwnd), dupACKcount = 0, transmit new segment(s), as allowed
  • duplicate ACK: dupACKcount++
  • timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment; go to slow start
  • dupACKcount == 3: ssthresh = cwnd/2, cwnd = ssthresh + 3 MSS, retransmit missing segment; go to fast recovery
Fast recovery:
  • duplicate ACK: cwnd = cwnd + MSS, transmit new segment(s), as allowed
  • new ACK: cwnd = ssthresh, dupACKcount = 0; go to congestion avoidance
  • timeout: ssthresh = cwnd/2, cwnd = 1 MSS, dupACKcount = 0, retransmit missing segment; go to slow start

TCP's Congestion Control Service
Problem: gridlock sets in when there is packet loss due to router congestion.
  • The sending system's packet is lost due to congestion, and the sender is alerted when it stops receiving ACKs for the packets it sent.
  • Congestion control forces the end systems to decrease the rate at which packets are sent during periods of congestion.
[Figure: CLIENT and SERVER connected through a congested router.]

Macroscopic Description of TCP throughput
(Based on an idealised model for the steady-state dynamics of TCP)
What is the average throughput of TCP as a function of window size and RTT?
  • ignore slow start (typically very short phases)
  • let W be the window size when loss occurs
  • when the window is W, throughput is W/RTT
  • just after loss, the window drops to W/2 and throughput to W/(2 RTT)
  • throughput increases linearly (by MSS/RTT every RTT), so it sweeps evenly between W/(2 RTT) and W/RTT
Average throughput: 0.75 W/RTT

TCP Futures: TCP over "long, fat pipes"
Example: GRID computing application
  • 1500-byte segments, 100 ms RTT, desired throughput of 10 Gbps
  • requires window size W = 83,333 in-flight segments
  • throughput in terms of loss rate L: throughput = 1.22 * MSS / (RTT * sqrt(L))
  • ➜ L = 2·10^-10, a very small loss rate! (1 loss event every 5 billion segments)
  • new versions of TCP are needed for high-speed environments
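The two figures on this slide can be re-derived numerically. The Python sketch below assumes the commonly cited approximation throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss-rate relation; the function names are hypothetical.

```python
def window_segments(throughput_bps, rtt_s, mss_bits):
    """In-flight segments needed to sustain the given throughput."""
    return throughput_bps * rtt_s / mss_bits


def required_loss_rate(throughput_bps, rtt_s, mss_bits):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L."""
    return (1.22 * mss_bits / (rtt_s * throughput_bps)) ** 2


mss_bits = 1500 * 8        # 1500-byte segments
rtt_s = 0.100              # 100 ms RTT
target_bps = 10e9          # 10 Gbps

w = window_segments(target_bps, rtt_s, mss_bits)
loss = required_loss_rate(target_bps, rtt_s, mss_bits)
print(f"W = {w:,.0f} segments")
print(f"L = {loss:.2e}, about one loss every {1 / loss:,.0f} segments")
```

Under these assumptions the sketch lands on roughly 83,333 segments and a loss rate of about 2·10^-10, matching the slide's order of magnitude.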

TCP Fairness
Goal: if N TCP sessions share the same bottleneck link, each should get an average transmission rate of R/N, an equal share of the link's bandwidth.
[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R.]

Analysis of 2 connections sharing a link
Assumptions:
  • link with transmission rate of R
  • each connection has the same MSS and RTT
  • no other TCP connections or UDP datagrams traverse the shared link
  • ignore the slow start phase of TCP
  • operating in congestion-avoidance mode (linear increase phase)
Goal: adjust the sending rate of the two connections to allow for equal bandwidth sharing.

Why is TCP fair?
Two competing sessions:
  • additive increase gives a slope of 1, as throughput increases
  • multiplicative decrease decreases throughput proportionally
[Figure: Connection 2 throughput against Connection 1 throughput, both axes from 0 to R, with the equal bandwidth share line and the full bandwidth utilisation line. A point on the graph depicts the amount of link bandwidth jointly consumed by the connections. The trajectory alternates between congestion avoidance (additive increase) and loss (decrease window by a factor of 2).]
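The convergence argument can be tried out with a toy simulation. The Python sketch below is a deliberately idealised model, assuming both flows see every loss at the same time and increase by the same step each round; the function name, rates, and capacity are hypothetical.

```python
def simulate_aimd(rate1, rate2, capacity, rounds, step=1.0):
    """Two synchronized AIMD flows sharing one bottleneck link."""
    history = [(rate1, rate2)]
    for _ in range(rounds):
        if rate1 + rate2 > capacity:
            # Loss: both flows cut their rate in half.
            rate1 /= 2
            rate2 /= 2
        else:
            # No loss: both flows add the same fixed increment.
            rate1 += step
            rate2 += step
        history.append((rate1, rate2))
    return history


# Hypothetical, deliberately unequal starting rates on a link of capacity 100.
history = simulate_aimd(rate1=80.0, rate2=10.0, capacity=100.0, rounds=400)
first_gap = abs(history[0][0] - history[0][1])
final_gap = abs(history[-1][0] - history[-1][1])
print(first_gap, final_gap)
```

Additive increase leaves the gap between the two rates unchanged while every halving cuts it in two, so the flows drift towards the equal bandwidth share line.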

The End
The slides that follow are just for additional reading.

TCP Latency Modeling
Multiple end systems sharing a link of transmission rate R bps.
[Figure: one application using 1 TCP connection and another using 3 TCP connections (multithreading implementation) over the shared link.]
Loopholes in TCP:
  • In practice, client/server applications with a smaller RTT get the available bandwidth more quickly as it becomes free. Therefore, they have higher throughputs.
  • Multiple parallel TCP connections allow one application to get a bigger share of the bandwidth.

TCP latency modeling
Q: how long does it take to receive an object from a Web server?
Latency: the time from when the client initiates a TCP connection until the client receives the requested object in its entirety. It is made up of:
  • TCP connection establishment time
  • data transfer delay
  • actual data transmission time
Two cases to consider:
  • WS/R > RTT + S/R: no data transfer delay. An ACK for the first segment in the window returns to the sender before a window's worth of data is sent.
  • WS/R < RTT + S/R: there is data transfer delay. The sender has to wait for an ACK after a window's worth of data is sent.

TCP Latency Modeling
[Figure: SERVER sends a FILE to CLIENT over a link with transmission rate R bps.]
Assumptions:
  • O: size of the object in bits
  • S: number of bits of MSS (max. segment size)
  • network is uncongested, with one link between end systems of rate R
  • CongWin (fixed) determines the amount of data that can be sent
  • no packet loss, no packet corruption, no retransmissions required
  • header overheads are negligible
  • file to send = integer number of segments of size MSS
  • request messages, ACKs, and TCP connection-establishment segments have negligible transmission times
  • initial threshold of the TCP congestion mechanism is very big

TCP Latency Modeling
Case Analysis: STATIC CONGESTION WINDOW
Case 1: WS/R > RTT + S/R: an ACK for the first segment in the window returns to the sender before a window's worth of data is sent.
  • K = number of windows of data that cover the object: K = O/(WS), rounded up to the nearest integer
  • e.g. O = 256 bits, S = 32 bits, W = 4 segments
Case 1: latency = 2 RTT + O/R

TCP Latency Modeling
Case Analysis: STATIC CONGESTION WINDOW
Case 2: WS/R < RTT + S/R: the sender has to wait for an ACK after a window's worth of data is sent.
  • number of windows of data that cover the object: K = O/(WS), rounded up to the nearest integer
  • if there are K windows, the sender will be stalled (K-1) times
  • each stalled period lasts S/R + RTT - WS/R
Case 2: latency = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]
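Both static-window cases can be evaluated with one small function. The Python sketch below reuses the object size, segment size, and window from the earlier example (O = 256 bits, S = 32 bits, W = 4); the link rate and the two RTT values are hypothetical numbers chosen so that each case is exercised.

```python
import math


def static_window_latency(O, S, W, R, rtt):
    """Latency for a fixed congestion window of W segments."""
    base = 2 * rtt + O / R
    if W * S / R >= rtt + S / R:
        # Case 1: the first ACK returns before the window is used up, so no stalls.
        return base
    # Case 2: the sender stalls after every window except the last one.
    K = math.ceil(O / (W * S))
    stall = S / R + rtt - W * S / R
    return base + (K - 1) * stall


# Hypothetical link rate of 32 bits/s, so that S/R = 1 s and WS/R = 4 s.
print(static_window_latency(O=256, S=32, W=4, R=32, rtt=1.0))   # case 1
print(static_window_latency(O=256, S=32, W=4, R=32, rtt=5.0))   # case 2
```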

Case Analysis: DYNAMIC CONGESTION WINDOW
[Figure: timeline with stalled periods for an object of O/S = 15 segments, sent in 4 windows.]

Case Analysis: DYNAMIC CONGESTION WINDOW
  • Let K be the number of windows that cover the object.
  • We can express K in terms of the number of segments in the object, O/S, as follows:
    K = min{k : 2^0 + 2^1 + ... + 2^(k-1) >= O/S} = min{k : 2^k - 1 >= O/S} = ceil(log2(O/S + 1))
  • For example, O/S = 15 segments gives K = 4 windows (of 1, 2, 4 and 8 segments).

Case Analysis: DYNAMIC CONGESTION WINDOW
  • Time from when the server begins to transmit the kth window until the server receives an ACK for the first segment in the window = S/R + RTT
  • Transmission time of the kth window = 2^(k-1) * S/R
  • Stall time after the kth window = max(0, S/R + RTT - 2^(k-1) * S/R)
  • Latency = 2 RTT + O/R + sum over k = 1..K-1 of the stall time after the kth window

Case Analysis: DYNAMIC CONGESTION WINDOW
  • Let Q be the number of times the server would stall if the object contained an infinite number of segments.
  • The actual number of times that the server stalls is P = min{Q, K-1}.
  • Closed-form expression for the latency: Latency = 2 RTT + O/R + P * (RTT + S/R) - (2^P - 1) * S/R
  • Slow start will not significantly increase latency if RTT << O/R.

http://www1.cse.wustl.edu/~jain/cis788-97/ftp/tcp_over_atm/index.htm#atm-features