TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

CS 118, 4/26/05

• point-to-point: one sender, one receiver
• connection-oriented:
  - exchange control msgs first to initialize sender & receiver state
• full-duplex data delivery:
  - bi-directional data flow over the same connection
• reliable, in-order byte stream delivery
  - no "message boundaries"
  - sender & receiver must buffer data
• flow controlled
  - prevent sender from flooding receiver
• congestion controlled
  - reduce potential jam in the network

[Figure: socket interface. The application writes data into the TCP send buffer and reads data from the TCP receive buffer; each end keeps TCP control parameters (state).]

What defines a TCP connection

• TCP uses 4 values to define a connection (a communication association):
  local-host-addr, local-port#, remote-host-addr, remote-port#
• each of the two ends keeps state for the ongoing communication
  - sequence #s for data sent, received, and ACKed; retransmission timer; flow & congestion windows

[Figure: protocol stack with TCP and UDP over IP over Ethernet.]
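The 4-tuple can be pictured as the key of a lookup table that maps each arriving segment to its connection state. The following is only a sketch: `ConnKey`, `connections`, and `lookup` are hypothetical names, and the addresses are documentation examples, not anything from the slides.

```python
from typing import NamedTuple, Optional

class ConnKey(NamedTuple):
    """The four values that identify one TCP connection (hypothetical type)."""
    local_addr: str
    local_port: int
    remote_addr: str
    remote_port: int

# Per-connection state, keyed by the 4-tuple (contents are illustrative).
connections = {
    ConnKey("192.0.2.1", 80, "198.51.100.7", 40001): {"next_seq": 1000},
    ConnKey("192.0.2.1", 80, "198.51.100.7", 40002): {"next_seq": 5000},
}

def lookup(local_addr: str, local_port: int,
           remote_addr: str, remote_port: int) -> Optional[dict]:
    """Return the state of the matching connection, or None if there is none."""
    return connections.get(ConnKey(local_addr, local_port, remote_addr, remote_port))
```

Two connections to the same server port stay distinct as long as any one of the four values differs, here the remote port.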

Issues To Consider

• packets may be lost, duplicated, re-ordered
• packets can be delayed arbitrarily long inside the network
  - the delay between two communicating ends is unknown beforehand and may vary over time
• port numbers can be reused later
  - a later connection must not mistake packets from an earlier connection as its own

TCP segment format

Header layout, following the IP header (each row is 32 bits wide):

  source port #                        | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F    | rcvr window size
  checksum                             | ptr to urgent data
  Options (variable length)
  application data (variable length)

Field notes:
• sequence number, acknowledgement number: counting by bytes of data
• URG: urgent data (generally not used)
• ACK: ACK # field valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab. (setup, teardown commands)
• rcvr window size: # bytes rcvr willing to accept
• checksum (as in UDP)
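To make the layout concrete, here is a minimal sketch that unpacks the fixed 20-byte part of a TCP header with Python's standard `struct` module. The function name `parse_tcp_header` and the dictionary keys are choices made for this example; options and checksum verification are deliberately left out.

```python
import struct

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header (options are not parsed)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    flags = offset_flags & 0x3F               # low 6 bits: U A P R S F
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": (offset_flags >> 12) * 4,  # head len is in 32-bit words
        "URG": bool(flags & 0x20),
        "ACK": bool(flags & 0x10),
        "PSH": bool(flags & 0x08),
        "RST": bool(flags & 0x04),
        "SYN": bool(flags & 0x02),
        "FIN": bool(flags & 0x01),
        "window": window,
        "checksum": checksum,
        "urgent_ptr": urg_ptr,
    }
```

The `!` in the format string selects network (big-endian) byte order, which is how the fields travel on the wire.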

TCP Connection Establishment

• initialize TCP control variables:
  - initial seq. # used in each direction
  - buffer size (rcv. window)

Three-way handshake:
1: client host sends TCP SYN segment to server
  - specifies initial seq #
  - does not carry data
2: server receives SYN, replies with SYN_ACK (acknowledging the client's SYN) together with its own SYN control segment
3: client end sends SYN_ACK (acknowledging the server's SYN)
  - may carry data

[Figure: time-sequence diagram. The server calls listen( ), the client calls connect( ). Client sends SYN (#); server replies SYN (#)/SYN-ACK; client replies SYN-ACK; "connection established" is marked at both ends.]
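The exchange of initial sequence numbers can be sketched as follows. This is an illustration only: the function name and the dictionary layout are made up for the example, and it relies on the fact that a SYN consumes one sequence number, so each side acknowledges the other's initial seq # plus one.

```python
MOD = 2 ** 32  # sequence numbers are 32 bits wide

def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three control segments of the handshake as dictionaries."""
    syn = {"flags": "SYN", "seq": client_isn}
    syn_ack = {
        "flags": "SYN+ACK",
        "seq": server_isn,
        "ack": (client_isn + 1) % MOD,   # acknowledges the client's SYN
    }
    ack = {
        "flags": "ACK",
        "seq": (client_isn + 1) % MOD,
        "ack": (server_isn + 1) % MOD,   # acknowledges the server's SYN
    }
    return [syn, syn_ack, ack]
```

For example, with a client initial seq # of 100 and a server initial seq # of 300, the server acknowledges 101 and the client acknowledges 301.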

TCP Connection Close

• Either end can initiate the close of its end of the connection at any time
1: one end (A) sends TCP FIN control segment to the other
2: the other end (B) receives FIN, replies with FIN_ACK; when it's ready to close too, sends FIN
3: A receives FIN, replies with FIN_ACK
4: B receives FIN_ACK, closes connection

What problem does A have?

[Figure: time-sequence diagram between A (client) and B (server). A calls close( ) and sends FIN; B replies FIN-ACK, later calls close( ) and sends FIN; A replies FIN-ACK, marked "?"; connection closed at B.]

The well-known "two-army problem"

[Figure: Blue army, Red army.]

Q: how can the 2 red armies agree on an attack time?
• Fact: the last one who sends a message does not know whether the msg is delivered
• Basic rule: one cannot send an ACK to acknowledge an ACK

TCP Connection Close

1: one end (A) sends TCP FIN control segment to the other
2: the other end (B) receives FIN, replies with FIN_ACK; when it's ready to close too, sends FIN
3: A receives FIN, replies with ACK
  - A enters "timed wait": waits for 2 min before deleting the connection state
4: B receives FIN_ACK, closes connection

• Abort a connection: send "reset" to the other end, enter closed state immediately
  - all data assumed lost

[Figure: time-sequence diagram between A (client) and B (server). A calls close( ) and sends FIN; B replies FIN-ACK, later calls close( ) and sends FIN; A replies FIN-ACK and stays in timed wait before "connection closed"; B shows "connection closed" on receiving the FIN-ACK.]

TCP Connection Management (cont)

[Figures: TCP client lifecycle and TCP server lifecycle state diagrams; one step is labeled "wait 2 min".]

TCP state-transition diagram

[Figure: TCP state-transition diagram; arcs are labeled event/action.]
States: CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK
Arc labels: Active open/SYN; Passive open; Close; Send/SYN; SYN/SYN + ACK; SYN + ACK/ACK; ACK; Done; Close/FIN; FIN/ACK; ACK + FIN/ACK; Timeout after two segment lifetimes

[Inset: A sends I-finished(M), B replies ACK (M+1); B sends I-finished(N), A replies ack(N+1); wait for 2 MSL before deleting the conn state.]
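Since the figure itself does not survive in text form, the sketch below encodes the standard TCP state machine as a lookup table, using the state names and event/action labels listed on this slide. The wiring between states is an assumption based on the standard diagram (RFC 793 and common textbook figures), not something recoverable from the slide; the event names such as `recv_SYN` and the function `step` are hypothetical.

```python
# (state, event) -> (next state, segment sent in response or None)
TRANSITIONS = {
    ("CLOSED", "passive_open"): ("LISTEN", None),
    ("CLOSED", "active_open"): ("SYN_SENT", "SYN"),
    ("LISTEN", "recv_SYN"): ("SYN_RCVD", "SYN+ACK"),
    ("LISTEN", "send"): ("SYN_SENT", "SYN"),
    ("LISTEN", "close"): ("CLOSED", None),
    ("SYN_SENT", "recv_SYN"): ("SYN_RCVD", "SYN+ACK"),
    ("SYN_SENT", "recv_SYN+ACK"): ("ESTABLISHED", "ACK"),
    ("SYN_SENT", "close"): ("CLOSED", None),
    ("SYN_RCVD", "recv_ACK"): ("ESTABLISHED", None),
    ("SYN_RCVD", "close"): ("FIN_WAIT_1", "FIN"),
    ("ESTABLISHED", "close"): ("FIN_WAIT_1", "FIN"),
    ("ESTABLISHED", "recv_FIN"): ("CLOSE_WAIT", "ACK"),
    ("FIN_WAIT_1", "recv_ACK"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_1", "recv_FIN"): ("CLOSING", "ACK"),
    ("FIN_WAIT_1", "recv_ACK+FIN"): ("TIME_WAIT", "ACK"),
    ("FIN_WAIT_2", "recv_FIN"): ("TIME_WAIT", "ACK"),
    ("CLOSING", "recv_ACK"): ("TIME_WAIT", None),
    ("TIME_WAIT", "timeout_2MSL"): ("CLOSED", None),
    ("CLOSE_WAIT", "close"): ("LAST_ACK", "FIN"),
    ("LAST_ACK", "recv_ACK"): ("CLOSED", None),
}

def step(state: str, event: str) -> tuple:
    """Apply one event; raise ValueError if the table has no such arc."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state} on {event}") from None
```

Walking the table with the events of an active open followed by an active close passes through SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, and TIME_WAIT before returning to CLOSED.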

How to Set TCP Retransmission Timer

• TCP sets rxt timer based on measured RTT
  - SRTT: estimated RTT
    SRTT = (1 - α) x SRTT + α x SampleRTT
• Setting retransmission timer:
  - SRTT plus "safety margin"
    Timer = SRTT + 4 x rttvar

[Figure: two data/ACK exchanges. In one, SampleRTT is the time from sending data to receiving its ACK; in the other, the timer expires ("Timeout!") and the data is retransmitted.]

After obtaining a new RTT sample:

• difference = SampleRTT - SRTT
• SRTT = (1 - α) x SRTT + α x SampleRTT
       = SRTT + α x difference
• rttvar = (1 - β) x rttvar + β x |difference|
         = rttvar + β x (|difference| - rttvar)
• Retransmission Timer (RTO) = SRTT + 4 x rttvar

Typically: α = 1/8, β = 1/4

An Example

Assuming SRTT = 500 msec, rttvar = 120 msec

RTT(3) = 600 ms:
  |difference| = |RTT - SRTT| = 100 ms
  SRTT = 500 + 0.125 * 100 = 512.5
  rttvar = 120 + 0.25 * (100 - 120) = 115
  RTO = SRTT + 4 * rttvar = 512.5 + 460 = 972.5 ms

RTT(4) = 650 ms:
  |difference| = |RTT - SRTT| = 137.5 ms
  SRTT = 512.5 + 0.125 * 137.5 ≈ 529.7
  rttvar = 115 + 0.25 * (137.5 - 115) ≈ 120.6

[Figure: sender/receiver timeline showing segment 3 with RTT 600 and segment 4 with RTT 650.]
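The update rules can be checked with a few lines of code. This is a minimal sketch using the typical gains 1/8 and 1/4; the class name `RttEstimator` and its attribute names are choices made for the example, and all times are in milliseconds.

```python
class RttEstimator:
    """Smoothed RTT and variation estimator (illustrative sketch)."""

    def __init__(self, srtt: float, rttvar: float,
                 alpha: float = 1 / 8, beta: float = 1 / 4):
        self.srtt = srtt
        self.rttvar = rttvar
        self.alpha = alpha
        self.beta = beta

    @property
    def rto(self) -> float:
        """Retransmission timeout: SRTT plus a safety margin of 4 x rttvar."""
        return self.srtt + 4 * self.rttvar

    def update(self, sample_rtt: float) -> float:
        """Fold one new RTT sample into SRTT and rttvar; return the new RTO."""
        difference = sample_rtt - self.srtt      # measured against the old SRTT
        self.srtt += self.alpha * difference
        self.rttvar += self.beta * (abs(difference) - self.rttvar)
        return self.rto
```

Starting from SRTT = 500 and rttvar = 120, a 600 ms sample gives SRTT = 512.5, rttvar = 115, and RTO = 972.5 ms. Both SRTT and rttvar are updated from the same difference, computed before SRTT changes.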

Example RTT estimation:

[Figure: example RTT estimation plot.]

How to measure RTT in cases of retransmissions?

Options:
• take the delay between first transmission and final ACK?
• take the delay between last retransmission of segment(n) and ACK(n)?
• don't measure?

[Figure: timeline between S and D with a timeout and a retransmission; the interval to measure is labeled "RTT?".]

Karn's algorithm

In case of retransmission:
• do not take the RTT sample (do not update SRTT or rttvar)
• double the retransmission timer value (RTO) after each timeout
• take an RTT measurement again upon the next transmission (without retrans.)
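The three rules can be sketched on top of the SRTT/rttvar update. The class `RetransmitTimer` and its method names are hypothetical; the gains 1/8 and 1/4 are the typical values, and a real implementation would also cap the backed-off RTO, which is omitted here.

```python
class RetransmitTimer:
    """RTO bookkeeping with Karn's rules (illustrative sketch)."""

    ALPHA = 1 / 8
    BETA = 1 / 4

    def __init__(self, srtt: float, rttvar: float):
        self.srtt = srtt
        self.rttvar = rttvar
        self.rto = srtt + 4 * rttvar

    def on_timeout(self) -> None:
        """Rule 2: double the RTO after each timeout."""
        self.rto *= 2

    def on_ack(self, sample_rtt: float, was_retransmitted: bool) -> None:
        """Rules 1 and 3: skip ambiguous samples, resume on a clean one."""
        if was_retransmitted:
            return  # cannot tell which transmission the ACK answers
        difference = sample_rtt - self.srtt
        self.srtt += self.ALPHA * difference
        self.rttvar += self.BETA * (abs(difference) - self.rttvar)
        self.rto = self.srtt + 4 * self.rttvar
```

Note that the backed-off RTO stays in force until an ACK arrives for a segment that was sent only once; only then is the RTO recomputed from SRTT and rttvar.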

One more question

What initial SRTT, rttvar values to start with?
• currently by some engineered guessing
• what if the guessed value is too small?
  - unnecessary retransmissions
• what if the guessed value is too large?
  - in case of the first or first few packets being lost, wait longer than necessary before retransmission
• current practice
  - initial SRTT value: 3 sec, rttvar: 3 sec
  - when the first RTT sample arrives: SRTT = RTT, rttvar = SRTT/2

TCP's seq. #s and ACK #s

Seq. #:
  - the number of the first byte in the segment's data
ACK #:
  - seq # of next byte expected from other side
  - cumulative ACK

A simple example (time runs downward):

  Host A                                     Host B
  sends 10 byte data
      ---- Seq=42, ACK=79, data ---->
                                             host B ACKs receipt of 10 B data
                                             from A, and sends 5 byte data
      <--- Seq=79, ACK=52, data -----
  host ACKs receipt of 5 B
      ---- Seq=52, ACK=84 ---------->
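The arithmetic behind the example is just "sequence number of the first byte plus the number of data bytes". A small sketch, with `next_ack` as a hypothetical helper name:

```python
MOD = 2 ** 32  # the sequence-number space is 32 bits

def next_ack(seq: int, payload_len: int) -> int:
    """ACK # the receiver sends after getting payload_len bytes starting at seq."""
    return (seq + payload_len) % MOD

# Host A sends 10 bytes starting at Seq=42; Host B answers with ACK=52.
ack_from_b = next_ack(42, 10)
# Host B sends 5 bytes starting at Seq=79; Host A answers with ACK=84.
ack_from_a = next_ack(79, 5)
```

The modulo only matters near the top of the sequence space, where the numbers wrap around to zero.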

How to guarantee seq. # uniqueness

• sequence #s will eventually wrap around
• TCP assumes Maximum Segment Lifetime (MSL) of 120 sec.
• make sure that for the same [src-addr, src-port, dest-addr, dest-port] tuple, the same sequence number does not get reused within 2 x MSL
  - assures that no two different data segments can bear the same sequence number, as long as the data's lifetime < 120 sec.

TCP: reliable data transfer

Simplified sender, assuming
• one-way data transfer
• no flow/congestion control

[Figure: sender event loop. From "wait for event": data received from application (create, send segment); timeout for segment with seq # y (retransmit segment); ACK received, with ACK # y (ACK processing).]

    SendBase = Initial_SeqNumber
    NextSeqNum = Initial_SeqNumber

    loop (forever) {
      switch (event)

      event: data received from application above
        create TCP segment with seq. number NextSeqNum
        start timer for segment NextSeqNum
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

      event: timer timeout for segment with seq. number y
        retransmit segment with sequence number y
        compute new timeout interval for segment y
        restart timer

      event: ACK received, with ACK field value of y
        if (y > SendBase) {   /* cumulative ACK of all data up to y */
          SendBase = y
          if (any outstanding not-yet-ACKed segments)
            start timer
        }
        else {   /* a duplicate ACK for already ACKed segment */
          increment count of duplicate ACKs received for y
          if (count of dup. ACKs received for y == 3) {
            resend segment with sequence number y
            reset dup. count
          }
        }
    }   /* end of loop forever */

Fast Retransmit

• Time-out period often relatively long:
  - long delay before resending lost packet
• Detect lost segments via duplicate ACKs
  - sender often sends many segments back-to-back
  - if a segment is lost, there will likely be many duplicate ACKs
• If sender receives 3 ACKs for the same data, it supposes that the segment after the ACKed data was lost:
  - fast retransmit: resend segment before timer expires
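The ACK-processing part of the simplified sender, including the duplicate-ACK counter, can be sketched as below. This is an illustration under stated simplifications: timers are left out, sequence-number wraparound is ignored, and the class `SimpleSender` with its `retransmitted` log is a construct of this example rather than anything from the slides.

```python
class SimpleSender:
    """Cumulative-ACK sender with fast retransmit (timers omitted)."""

    def __init__(self, initial_seq: int):
        self.send_base = initial_seq   # oldest unACKed byte
        self.next_seq = initial_seq    # seq # of the next new byte
        self.unacked = {}              # seq -> data still awaiting an ACK
        self.dup_acks = 0
        self.retransmitted = []        # log of seq #s that were resent

    def send(self, data: bytes) -> int:
        """Record a new segment and return its sequence number."""
        seq = self.next_seq
        self.unacked[seq] = data
        self.next_seq += len(data)
        return seq

    def on_ack(self, y: int) -> None:
        if y > self.send_base:
            # New cumulative ACK: everything below y has been received.
            self.send_base = y
            self.dup_acks = 0
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
        else:
            # Duplicate ACK: the receiver is still waiting for byte y.
            self.dup_acks += 1
            if self.dup_acks == 3 and y in self.unacked:
                self.retransmitted.append(y)   # resend before the timer expires
                self.dup_acks = 0
```

With five 500-byte segments starting at Seq=92, one ACK for 592 followed by three duplicate ACKs for 592 triggers a resend of the segment that starts at 592.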

TCP: retransmission scenarios

Lost ACK scenario (Host A sends, Host B receives):
  1. A sends Seq=92, 8 bytes data
  2. B replies ACK=100; the ACK is lost (X)
  3. the Seq=92 timeout expires at A; A resends Seq=92, 8 bytes data
  4. B replies ACK=100; A sets SendBase = 100

Premature timeout scenario:
  1. A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  2. B replies ACK=100 and ACK=120
  3. the Seq=92 timeout expires at A before the ACKs arrive; A resends Seq=92, 8 bytes data
  4. the ACKs arrive: SendBase = 100, then SendBase = 120
  5. B answers the resent segment with ACK=120; SendBase stays at 120

TCP retransmission scenarios (more)

Cumulative ACK scenario (Host A sends, Host B receives):
  1. A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
  2. B replies ACK=100, which is lost (X), and ACK=120, which arrives before the timeout
  3. A sets SendBase = 120; nothing is retransmitted

Fast RXT scenario:
  1. A sends Seq=92, 500 bytes data
  2. A sends Seq=592, 500 B data; this segment is lost (X)
  3. A sends Seq=1092, Seq=1592, and Seq=2092, 500 B data each
  4. B answers each arriving segment with ACK 592
  5. A resends Seq=592, 500 B data, before the timeout

TCP Receiver: when to send ACK?

Event | TCP Receiver action
in-order segment arrival, no gaps, everything earlier already ACKed | delayed ACK: wait up to 500 ms; if nothing arrived, send ACK
in-order segment arrival, no gaps, one delayed ACK pending | immediately send one cumulative ACK
out-of-order arrival: higher-than-expected seq. #, gap detected | send duplicate ACK, indicating seq. # of next expected byte
arrival of segment that partially or completely fills a gap | immediate ACK if segment starts at the lower end of the gap
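The four rows of the table can be read as a decision function. The sketch below is one way to write it down; the function name, parameter names, and returned strings are all made up for the example, and the final branch (a segment below the expected seq. #) is not covered by the table and is an assumption.

```python
def receiver_action(seg_seq: int, expected_seq: int,
                    delayed_ack_pending: bool, gap_exists: bool) -> str:
    """Pick the receiver's ACK behavior for one arriving segment.

    gap_exists means out-of-order data is already buffered above expected_seq.
    """
    if seg_seq > expected_seq:
        # Row 3: higher-than-expected seq. #, gap detected.
        return "send duplicate ACK"
    if seg_seq == expected_seq:
        if gap_exists:
            # Row 4: segment starts at the lower end of the gap.
            return "send immediate ACK"
        if delayed_ack_pending:
            # Row 2: one delayed ACK is already pending.
            return "send cumulative ACK now"
        # Row 1: everything earlier already ACKed.
        return "delay ACK up to 500 ms"
    # Not in the table: already-received data; assumed to be re-ACKed.
    return "send duplicate ACK"
```

Acknowledging every second in-order segment immediately (row 2) bounds how long the sender waits, while row 1 saves an ACK when more data follows shortly.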

TCP Flow Control

flow control: prevent sender from overrunning receiver's buffer by transmitting too much too fast

• receiver: informs sender of (dynamically changing) amount of free buffer space
  - RcvWindow field in TCP header
• sender: keeps the amount of transmitted, unACKed data no more than most recently received RcvWindow

  throughput = window-size / RTT  (bytes/sec)

Special case: when RcvWindow = 0
• sender can send a 1-byte segment
• receiver can respond with current size
• receiver buffer eventually freed, window size increased
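The sender-side rule and the throughput bound translate directly into arithmetic. A minimal sketch, where `usable_window` and `max_throughput` are hypothetical helper names and byte positions are plain integers without wraparound:

```python
def usable_window(last_byte_sent: int, last_byte_acked: int, rcv_window: int) -> int:
    """Bytes the sender may still transmit without exceeding RcvWindow."""
    in_flight = last_byte_sent - last_byte_acked   # transmitted, unACKed data
    return max(0, rcv_window - in_flight)

def max_throughput(window_bytes: int, rtt_sec: float) -> float:
    """Upper bound on throughput in bytes/sec: one window per round trip."""
    return window_bytes / rtt_sec
```

For instance, with 500 bytes in flight and an advertised window of 2000 bytes, the sender may send 1500 more bytes. When the window is 0 the function returns 0, which is the special case where the sender falls back to 1-byte segments.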

Design Choice: Counting bytes or counting packets?

Pros of counting bytes: flexibility
• need a byte counter somewhere anyway
• can repackage data for retransmission
  - e.g. first sent segment-1 with 200 bytes
  - 300 more bytes are passed down from application
  - segment-1 times out, send new segment with 500 bytes of data

[Figure: a 200-byte block followed by a 300-byte block.]

Counting Bytes: cons

• sequence number runs out faster
  - needs a larger sequence # field
• easy to fall into the trap of transmitting small packets
  - network overhead goes up with the number of packets transmitted
  - silly window syndrome: receiver ACKs a single byte, causing sender to send single-byte segments forever

Design Choices: Understand the consequence of the design

• TCP sequence number: 32 bits (4 Gbytes)
  - wrap-around time:
    • 50 Kbps: ~190 hours
    • Ethernet (10 Mbps): about an hour
    • FDDI (100 Mbps): 6 minutes
    • at 1 Gbps: about 30 seconds
• TCP window size: 16 bits (64 Kbytes max); assume RTT = 100 msec
  - can keep a channel of 5 Mbps fully utilized
  - OC3 (155 Mbps) x 100 msec = 1.9 MB, needs a window size of at least 21 bits
  - 1 Gbps x 100 msec =
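Both sets of numbers follow from simple arithmetic, which the sketch below reproduces. The helper names are hypothetical; link rates are in bits per second, and the window requirement is taken to be the number of bits needed to express the bandwidth-delay product in bytes.

```python
import math

def wraparound_seconds(bits_per_sec: float, seq_bits: int = 32) -> float:
    """Time to send 2**seq_bits bytes at the given rate."""
    return (2 ** seq_bits) * 8 / bits_per_sec

def window_bits_needed(bits_per_sec: float, rtt_sec: float) -> int:
    """Bits needed for a window field that covers bandwidth x RTT (in bytes)."""
    bdp_bytes = bits_per_sec * rtt_sec / 8
    return int(math.ceil(bdp_bytes)).bit_length()

def max_rate_bps(window_bytes: int, rtt_sec: float) -> float:
    """Highest rate a fixed window can sustain: one window per RTT."""
    return window_bytes * 8 / rtt_sec
```

At 10 Mbps the 32-bit space wraps in about 57 minutes and at 1 Gbps in about 34 seconds; at 50 Kbps the same formula gives roughly 191 hours. A 64 Kbyte window over a 100 msec RTT sustains a little over 5 Mbps, and 155 Mbps over the same RTT needs 21 bits of window.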

Always Keep the Big Picture in Mind

[Figure: protocol stack (application, transport, network, link, physical) showing encapsulation: message M gains header Ht, then Hn, then Hl on the way down.]

[Figure: a Web browser and a Web server speak HTTP over the socket interface. The application process writes bytes into the TCP send buffer and reads bytes from the receive buffer; TCP exchanges segments over unreliable network data packet delivery.]