Transmission Control Protocol
CS 640

Outline
• TCP objectives revisited
• TCP basics
• New algorithms for RTO calculation

TCP Overview
• TCP is the most widely used Internet transport protocol
  – Web, peer-to-peer, FTP, telnet, …
• A two-way, reliable, byte-stream-oriented end-to-end protocol
  – Includes flow and congestion control
• Closely tied to the Internet Protocol (IP)
• A focus of intense study for many years
  – Our goal is to understand the RENO version of TCP
• RENO is the most widely used TCP today
• RFC 2001 (since obsoleted by RFC 2581)
• RENO mainly specifies mechanisms for dealing with congestion

TCP Features
• Connection-oriented
• Byte-stream
  – app writes bytes
  – TCP sends segments
  – app reads bytes
• Reliable data transfer
• Full duplex
• Flow control: keep sender from overrunning receiver
• Congestion control: keep sender from overrunning network
[Figure: an application process writes bytes into the TCP send buffer; TCP transmits segments; the peer application process reads bytes from the TCP receive buffer]

Segment Format
[Figure: TCP segment header format]

Segment Format (cont)
• Each connection identified with a 4-tuple:
  – (SrcPort, SrcIPAddr, DstPort, DstIPAddr)
• Sliding window + flow control
  – Acknowledgment, SequenceNum, AdvertisedWindow
  [Figure: the sender sends Data(SequenceNum) to the receiver; the receiver returns Acknowledgment + AdvertisedWindow]
• Flags
  – SYN, FIN, RESET, PUSH, URG, ACK
• Checksum is the same as UDP
  – pseudo header + TCP header + data

Sequence Numbers
• 32-bit sequence numbers
  – Wrap-around supported
• TCP breaks the byte stream from the application into packets (limited by the Maximum Segment Size, MSS)
• Each byte in the data stream is numbered
• Each packet has a sequence number
  – Initial number selected at connection time
  – Subsequent numbers indicate the number of the first data byte in the packet
• ACKs indicate the next byte expected

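As a rough illustration of the numbering rules above, the sketch below splits a byte stream into segments no larger than the MSS, labels each with the sequence number of its first byte, and computes the ACK a receiver would return for it. The function names, the initial sequence number, and the MSS value are hypothetical, and the sketch assumes the first data byte is numbered one past the initial sequence number.

```python
SEQ_MOD = 2**32  # sequence numbers are 32 bits, so arithmetic wraps modulo 2^32


def segment_stream(data: bytes, isn: int, mss: int):
    """Yield (sequence_number, payload) pairs for a byte stream.

    Assumption: the first data byte is numbered isn + 1.
    """
    seq = (isn + 1) % SEQ_MOD
    for offset in range(0, len(data), mss):
        payload = data[offset:offset + mss]
        yield seq, payload
        # The next segment starts at the byte after this payload.
        seq = (seq + len(payload)) % SEQ_MOD


def ack_for(seq: int, payload: bytes) -> int:
    """An ACK names the next byte expected: first byte number + length."""
    return (seq + len(payload)) % SEQ_MOD


# 2500 bytes with an MSS of 1000, starting close to the wrap-around point.
segments = list(segment_stream(b"x" * 2500, isn=SEQ_MOD - 1001, mss=1000))
for seq, payload in segments:
    print(f"seq={seq} len={len(payload)} ack={ack_for(seq, payload)}")
```

The initial sequence number is chosen near 2^32 only to show that the numbering keeps working across the wrap.
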
Sequence Number Wrap Around

Bandwidth             Time Until Wrap Around
T1 (1.5 Mbps)         6.4 hours
Ethernet (10 Mbps)    57 minutes
T3 (45 Mbps)          13 minutes
FDDI (100 Mbps)       6 minutes
STS-3 (155 Mbps)      4 minutes
STS-12 (622 Mbps)     55 seconds
STS-24 (1.2 Gbps)     28 seconds

• Protect against this by adding a 32-bit timestamp to the TCP header

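The table entries can be approximated with a short calculation: the time to wrap is the size of the 32-bit sequence space (one number per byte) divided by the link rate in bytes per second. The sketch below assumes the link is kept fully busy; the function name is hypothetical.

```python
SEQ_SPACE_BYTES = 2**32  # one sequence number per byte of data


def wrap_time_seconds(bandwidth_bps: float) -> float:
    """Seconds needed to send 2^32 bytes at the given rate (bits per second)."""
    return SEQ_SPACE_BYTES / (bandwidth_bps / 8)


links = {
    "T1 (1.5 Mbps)": 1.5e6,
    "Ethernet (10 Mbps)": 10e6,
    "T3 (45 Mbps)": 45e6,
    "FDDI (100 Mbps)": 100e6,
    "STS-3 (155 Mbps)": 155e6,
    "STS-12 (622 Mbps)": 622e6,
    "STS-24 (1.2 Gbps)": 1.2e9,
}

for name, bps in links.items():
    print(f"{name:20s} {wrap_time_seconds(bps):10.1f} s")
```
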
Connection Establishment
[Figure: three-way handshake between the active participant (client) and the passive participant (server)]
1. Client -> server: SYN, SequenceNum = x
2. Server -> client: SYN + ACK, SequenceNum = y, Acknowledgment = x + 1
3. Client -> server: ACK, Acknowledgment = y + 1

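The relationship between the sequence and acknowledgment numbers in the handshake can be sketched as follows. This is only a model of the three messages, not a network implementation; the `Segment` type and `three_way_handshake` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

SEQ_MOD = 2**32  # 32-bit sequence space


@dataclass(frozen=True)
class Segment:
    flags: FrozenSet[str]
    seq: Optional[int] = None
    ack: Optional[int] = None


def three_way_handshake(x: int, y: int) -> List[Segment]:
    """Return the three handshake segments for initial numbers x and y."""
    syn = Segment(frozenset({"SYN"}), seq=x)
    # The server acknowledges x by asking for byte x + 1 and offers its own y.
    syn_ack = Segment(frozenset({"SYN", "ACK"}), seq=y, ack=(syn.seq + 1) % SEQ_MOD)
    # The client acknowledges y by asking for byte y + 1.
    ack = Segment(frozenset({"ACK"}), ack=(syn_ack.seq + 1) % SEQ_MOD)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(x=100, y=300):
    print(sorted(segment.flags), segment.seq, segment.ack)
```
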
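Connection Termination
[Figure: message exchange between the active participant (server) and the passive participant (client)]
1. Active -> passive: FIN, SequenceNum = x
2. Passive -> active: Acknowledgment = x + 1
3. Passive -> active: FIN, SequenceNum = y
4. Active -> passive: Acknowledgment = y + 1
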
State Transition Diagram
[Figure: TCP state transition diagram with states CLOSED, LISTEN, SYN_SENT, SYN_RCVD, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, and LAST_ACK. Transitions are labeled event/action, including Active open/SYN, Passive open, Send/SYN, SYN/SYN + ACK, SYN + ACK/ACK, Close, Close/FIN, FIN/ACK, ACK + FIN/ACK, and ACK. TIME_WAIT returns to CLOSED after a timeout of two segment lifetimes.]

Reliability in TCP
• Checksum used to detect bit-level errors
• Sequence numbers used to detect sequencing errors
  – Duplicates are ignored
  – Out-of-order packets are reordered (or dropped)
  – Lost packets are retransmitted
• Timeouts used to detect lost packets
  – Requires RTO calculation
  – Requires sender to maintain data until it is ACKed

Sliding Window Revisited
[Figure: the sending application writes into the TCP send buffer, marked by LastByteAcked, LastByteSent, and LastByteWritten; the receiving application reads from the TCP receive buffer, marked by LastByteRead, NextByteExpected, and LastByteRcvd]
• Sending side
  – LastByteAcked <= LastByteSent
  – LastByteSent <= LastByteWritten
  – buffer bytes between LastByteAcked and LastByteWritten
• Receiving side
  – LastByteRead < NextByteExpected
  – NextByteExpected <= LastByteRcvd + 1
  – buffer bytes between LastByteRead and LastByteRcvd

Flow Control in TCP
• Send buffer size: MaxSendBuffer
• Receive buffer size: MaxRcvBuffer
• Receiving side
  – LastByteRcvd - LastByteRead <= MaxRcvBuffer
  – AdvertisedWindow = MaxRcvBuffer - (LastByteRcvd - LastByteRead)
• Sending side
  – LastByteSent - LastByteAcked <= AdvertisedWindow
  – EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
  – LastByteWritten - LastByteAcked <= MaxSendBuffer
  – block sender if (LastByteWritten - LastByteAcked) + y > MaxSendBuffer, where y is the number of bytes the application is trying to write
• Always send an ACK in response to an arriving data segment
• Persist in sending one-byte segments when AdvertisedWindow = 0

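The window arithmetic above can be sketched directly from the buffer pointers. The function names and the sample byte counts are hypothetical and chosen only to make the numbers easy to follow.

```python
def advertised_window(max_rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Free space left in the receive buffer, as reported to the sender."""
    return max_rcv_buffer - (last_byte_rcvd - last_byte_read)


def effective_window(advertised: int, last_byte_sent: int, last_byte_acked: int) -> int:
    """How many more bytes the sender may transmit right now."""
    return advertised - (last_byte_sent - last_byte_acked)


def must_block_writer(last_byte_written: int, last_byte_acked: int,
                      y: int, max_send_buffer: int) -> bool:
    """True if writing y more bytes would overflow the send buffer."""
    return (last_byte_written - last_byte_acked) + y > max_send_buffer


# Receiver: 4096-byte buffer, 2000 bytes received, 1000 already read by the app.
adv = advertised_window(4096, last_byte_rcvd=2000, last_byte_read=1000)
# Sender: 2500 bytes sent, 1500 acknowledged, so 1000 bytes are in flight.
eff = effective_window(adv, last_byte_sent=2500, last_byte_acked=1500)
print(adv, eff)
```
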
Keeping the Pipe Full
• 16-bit AdvertisedWindow controls the amount of pipelining
• Add a scaling factor extension to the header to enable larger windows
• Delay x bandwidth product, assuming an RTT of 100 ms:

Bandwidth             Delay x Bandwidth Product
T1 (1.5 Mbps)         18 KB
Ethernet (10 Mbps)    122 KB
T3 (45 Mbps)          549 KB
FDDI (100 Mbps)       1.2 MB
OC-3 (155 Mbps)       1.8 MB
OC-12 (622 Mbps)      7.4 MB
OC-24 (1.2 Gbps)      14.8 MB

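To see why a 16-bit window is too small for most of these links, the sketch below computes the delay x bandwidth product and the smallest power-of-two scale factor that would let a 16-bit window cover it. Treating the scaling factor as a left shift of the 16-bit value is an assumption of this sketch, and the function names are hypothetical.

```python
MAX_UNSCALED_WINDOW = 2**16 - 1  # largest value a 16-bit AdvertisedWindow can hold


def delay_bandwidth_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bytes needed in flight to keep the pipe full for one RTT."""
    return bandwidth_bps * rtt_ms // 8000  # bits/s * ms -> bytes


def window_scale_shift(required_bytes: int) -> int:
    """Smallest shift s such that (65535 << s) covers required_bytes."""
    shift = 0
    while (MAX_UNSCALED_WINDOW << shift) < required_bytes:
        shift += 1
    return shift


for name, bps in [("T1", 1_500_000), ("Ethernet", 10_000_000), ("OC-24", 1_200_000_000)]:
    need = delay_bandwidth_bytes(bps, rtt_ms=100)
    print(f"{name:10s} needs {need:>10d} bytes, scale shift {window_scale_shift(need)}")
```

Only the T1 case fits in an unscaled window; every faster link in the table needs some scaling at a 100 ms RTT.
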
Making TCP More Efficient
• Delayed acknowledgements
  – Delay for about 200 ms
  – Try to piggyback ACKs on data
• Acknowledge every other packet
  – There are many instances in a transmission sequence that require an ACK
• Don't forget Nagle's algorithm
  – Can be switched off

Karn/Partridge Algorithm for RTO
[Figure: two sender/receiver timelines, each showing an original transmission, a retransmission, and an ACK; in one SampleRTT is measured from the original transmission, in the other from the retransmission]
• Two degenerate cases with timeouts and RTT measurements
  – Solution: do not sample RTT when retransmitting
• After each retransmission, set the next RTO to double the value of the last
  – Exponential backoff is a well-known control theory method
  – Loss is most likely caused by congestion, so be careful

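The two rules above can be sketched as a small timer object. How a valid sample is turned into an RTO is not given on this slide, so the sketch assumes a simple weighted average with RTO = 2 x EstRTT; the class name, the `alpha` weight, and that baseline rule are all assumptions made for illustration.

```python
from typing import Optional


class KarnPartridgeTimer:
    """Minimal sketch: skip ambiguous samples, back off on retransmission."""

    def __init__(self, initial_rto: float, alpha: float = 0.875):
        self.rto = initial_rto
        self.est_rtt: Optional[float] = None  # no valid sample yet
        self.alpha = alpha                    # assumed smoothing weight

    def on_ack(self, sample_rtt: float, was_retransmitted: bool) -> None:
        if was_retransmitted:
            # The ACK could belong to either transmission, so ignore the sample.
            return
        if self.est_rtt is None:
            self.est_rtt = sample_rtt
        else:
            self.est_rtt = self.alpha * self.est_rtt + (1 - self.alpha) * sample_rtt
        self.rto = 2 * self.est_rtt           # assumed baseline rule

    def on_timeout(self) -> None:
        # Exponential backoff: each retransmission doubles the next RTO.
        self.rto *= 2


timer = KarnPartridgeTimer(initial_rto=1.0)
timer.on_timeout()                                  # RTO doubles
timer.on_ack(0.05, was_retransmitted=True)          # sample ignored
print(timer.rto, timer.est_rtt)
```
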
Jacobson/Karels Algorithm
• In the late '80s, the Internet was suffering from congestion collapse
• New calculations for average RTT – Jacobson '88
• The earlier calculation does not consider variance when setting the timeout value
  – If variance is small, we could set RTO = EstRTT
  – If variance is large, we may need to set RTO > 2 x EstRTT
• New algorithm calculates both variance and mean for RTT
  – Diff = SampleRTT - EstRTT
  – EstRTT = EstRTT + (d x Diff)
  – Dev = Dev + d x (|Diff| - Dev)
  – Initial settings for EstRTT and Dev will be given to you
  – where d is a factor between 0 and 1
  – typical value is 0.125

Jacobson/Karels contd.
• TimeOut = m x EstRTT + f x Dev
  – where m = 1 and f = 4
• When variance is small, TimeOut is close to EstRTT
• When variance is large, Dev dominates the calculation
• Another benefit of this mechanism is that it is very efficient to implement in code (does not require floating point)
• Notes
  – algorithm is only as good as the granularity of the clock (500 ms on Unix)
  – accurate timeout mechanism is important to congestion control (later)
• These issues have been studied and dealt with in newer RFCs for RTO calculation
• TCP RENO uses Jacobson/Karels

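Putting the update rules and the TimeOut formula together, a minimal sketch of the estimator might look like the following. It uses floating point for readability even though the slides note that an integer implementation is possible; the class name and the initial values in the example are hypothetical.

```python
class JacobsonKarels:
    """Sketch of the mean/deviation RTT estimator described above."""

    def __init__(self, est_rtt: float, dev: float,
                 d: float = 0.125, m: float = 1, f: float = 4):
        self.est_rtt = est_rtt
        self.dev = dev
        self.d = d  # gain, between 0 and 1
        self.m = m  # weight on the mean
        self.f = f  # weight on the deviation

    def timeout(self) -> float:
        # TimeOut = m x EstRTT + f x Dev
        return self.m * self.est_rtt + self.f * self.dev

    def update(self, sample_rtt: float) -> float:
        """Fold one RTT sample into the estimates and return the new TimeOut."""
        diff = sample_rtt - self.est_rtt
        self.est_rtt += self.d * diff
        self.dev += self.d * (abs(diff) - self.dev)
        return self.timeout()


# Times in milliseconds; the starting values are arbitrary.
estimator = JacobsonKarels(est_rtt=100.0, dev=10.0)
print(estimator.update(180.0))
```

With these starting values, one sample of 180 ms moves EstRTT to 110 ms and Dev to 18.75 ms, so most of the increase in TimeOut comes from the deviation term.
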