Week 10: Transport Protocols, UDP, TCP

Orientation
• We move one layer up and look at the transport layer across the Internet.

Orientation
• TCP and UDP are end-to-end protocols
• They are only implemented at the hosts

Transport Protocols in the Internet
• The Internet supports 2 transport protocols:
• UDP - User Datagram Protocol
  • datagram oriented
  • unreliable, connectionless
  • simple
  • unicast and multicast
  • useful for multimedia applications
  • used for control protocols: network management (SNMP), routing (RIP), naming (DNS), etc.
• TCP - Transmission Control Protocol
  • stream oriented
  • reliable, connection-oriented
  • complex
  • only unicast
  • used for data applications: web (http), email (smtp), file transfer (ftp), SecureCRT, etc.

UDP - User Datagram Protocol
• UDP extends the host-to-host delivery service of IP to an application process-to-application process delivery service
• It does this by multiplexing and demultiplexing packets from multiple application-to-application communication sessions

UDP packet format
• Port numbers identify sending and receiving applications (processes). Maximum port number is 2^16 - 1 = 65,535
• Message Length is between 8 bytes (i.e., the data field can be empty) and 65,535 bytes (length of UDP header and data in bytes)
• Checksum is for UDP header and UDP data
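The header layout described above can be sketched in Python with the standard `struct` module. This is a minimal illustration, not a full UDP implementation: the function names are hypothetical, and the checksum is left at 0 as a placeholder rather than computed.

```python
import struct

# Four 16-bit fields in network byte order:
# source port, destination port, message length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_header(src_port: int, dst_port: int, payload: bytes,
                     checksum: int = 0) -> bytes:
    """Pack a UDP header; the length field covers header plus data."""
    length = UDP_HEADER.size + len(payload)
    if length > 0xFFFF:
        raise ValueError("UDP message length must fit in 16 bits")
    return UDP_HEADER.pack(src_port, dst_port, length, checksum)


def parse_udp_header(datagram: bytes) -> dict:
    """Unpack the first 8 bytes of a UDP datagram into named fields."""
    src, dst, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {"src_port": src, "dst_port": dst,
            "length": length, "checksum": checksum}
```

An empty payload gives the minimum message length of 8 bytes, matching the slide.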

Port Numbers
• UDP (and TCP) use port numbers to identify applications
• There are 65,535 UDP ports per host.

TCP
• Service offered by TCP
• TCP Header
• TCP Connection Establishment and Termination
• Flow control
• Error control
• Congestion control

TCP = Transmission Control Protocol
• Provides a reliable unicast end-to-end byte stream over an unreliable internetwork.

TCP is reliable
• Byte stream is broken up into chunks which are called segments
• Detecting errors:
  • TCP has checksums for header and data. Segments with invalid checksums are discarded
  • Each segment that is transmitted has a sequence number.
  • Receiver sends acknowledgments (ACKs) for segments
  • Sender maintains a timer. An ACK is expected before the timer times out
• Correcting errors:
  • Lost or errored segments are retransmitted.
  • Selective repeat ARQ scheme
  • Cumulative ACKs

Byte Stream Service
• To the lower layers, TCP handles data in "segments"
• To the higher layers, TCP handles data as a sequence of bytes and does not mark any boundaries within that sequence
• So: higher layers do not know about the beginning and end of segments!

TCP
• Service offered by TCP
→ TCP Header
• TCP Connection Establishment and Termination
• Flow control
• Error control
• Congestion control

TCP Format
• TCP segments have a header of 20 bytes plus options, followed by zero or more data bytes

TCP header fields - Port Numbers
• Port Number:
  • A port number identifies the endpoint of a connection.
  • A pair <IP address, port number> identifies one endpoint of a connection.
  • Two pairs <client IP address, client port number> and <server IP address, server port number> identify a TCP connection.
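To make the "two pairs identify a connection" rule concrete, the sketch below keys a connection table on all four values. The `ConnectionId` type, the table, and the example addresses (taken from documentation address ranges) are illustrative assumptions, not part of any real TCP stack.

```python
from typing import NamedTuple


class ConnectionId(NamedTuple):
    """Hypothetical key: the two <IP address, port number> pairs."""
    client_ip: str
    client_port: int
    server_ip: str
    server_port: int


# Two connections from the same client host to the same server port
# differ only in the client port, yet they are distinct connections.
connections = {
    ConnectionId("192.0.2.10", 1121, "198.51.100.5", 23): "session A",
    ConnectionId("192.0.2.10", 1122, "198.51.100.5", 23): "session B",
}


def demultiplex(client_ip: str, client_port: int,
                server_ip: str, server_port: int):
    """Return the session for an incoming segment, or None if unknown."""
    key = ConnectionId(client_ip, client_port, server_ip, server_port)
    return connections.get(key)
```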

TCP header fields - Sequence Number
• Sequence Number (SeqNo):
  • Sequence number is 32 bits long.
  • So the range of SeqNo is 0 <= SeqNo <= 2^32 - 1 (about 4.3 Gbyte)
  • Each sequence number identifies the byte in the stream of data from the sending TCP to the receiving TCP that the first byte of data in this segment represents.
  • Initial Sequence Number (ISN) of a connection is set during connection establishment
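Because the sequence number is a 32-bit value, arithmetic on it is done modulo 2^32. The sketch below shows one common way to add to and compare sequence numbers across the wrap-around point; the helper names are hypothetical, and the "within half the number space" comparison rule is an assumption made for illustration rather than something stated on the slide.

```python
SEQ_MOD = 2 ** 32  # size of the 32-bit sequence number space


def seq_add(seq: int, n: int) -> int:
    """Advance a sequence number by n bytes, wrapping at 2^32."""
    return (seq + n) % SEQ_MOD


def seq_lt(a: int, b: int) -> bool:
    """True if a comes before b, assuming they are less than 2^31 apart."""
    return a != b and ((b - a) % SEQ_MOD) < SEQ_MOD // 2
```

With this rule a sequence number just below 2^32 still counts as "before" a small number that follows the wrap.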

TCP header fields - Ack. No.
• Acknowledgment Number (AckNo):
  • Acknowledgments are piggybacked, i.e., a segment from A → B contains an acknowledgement for a segment sent in the B → A direction
  • The AckNo in the B → A segment header contains the SeqNo for the next segment expected at B for the A → B flow
  • Example: The acknowledgment for a 1500-byte segment with the sequence number 0 is AckNo=1500
  • A host uses the AckNo field to send acknowledgements. If a host sends an AckNo in a segment it sets the "ACK flag"

TCP header fields - Ack. No. Contd.
• Example: Sender sends two segments with bytes "1..1500" and "1501..3000", but receiver only gets the second segment.
  • What is the sequence number of the first segment?
  • What is the sequence number of the second segment?
  • What is the ACK number sent in response by the receiver when it receives the second segment?
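One way to reason about the questions above is to model a receiver that uses cumulative ACKs: the AckNo is always the first byte it has not yet received in order. The function below is a hypothetical sketch of that rule, assuming segments are described as (sequence number, length) pairs.

```python
def next_expected(first_byte: int, received: list) -> int:
    """Return the cumulative AckNo given the (seq, length) segments received.

    first_byte is the sequence number of the first byte of the stream.
    """
    expected = first_byte
    for seq, length in sorted(received):
        if seq > expected:
            break  # gap: later segments cannot advance the cumulative ACK
        expected = max(expected, seq + length)
    return expected
```

Under this model, a receiver that only has bytes 1501..3000 still acknowledges byte 1, because the segment starting at sequence number 1 is missing.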

TCP header fields - Header Length
• Header Length (4 bits):
  • Length of header in 32-bit words
  • Note that the TCP header has variable length (minimum of 20 bytes)

TCP header fields - Flags
• Flag bits:
  • URG: Urgent pointer is valid
    • If the bit is set, the following bytes contain an urgent message in the range: SeqNo <= urgent message <= SeqNo + urgent pointer
  • ACK: Acknowledgement Number is valid
  • PSH: PUSH Flag
    • Notification from sender to the receiver that the receiver should pass all data that it has to the application as soon as possible.
    • Normally set by sender when the sender's buffer is empty (so TCP does not wait expecting more data)

TCP header fields - Flags Contd.
• Flag bits:
  • RST: Reset the connection
    • The flag causes the receiver to reset the connection
    • The receiver of a RST terminates the connection and notifies the higher-layer application of the reset
  • SYN: Synchronize sequence numbers
    • Sent in the first packet when opening a connection
  • FIN: Sender is finished with sending
    • Used for closing a connection
    • Both sides of a connection must send a FIN
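The header length and flag bits share one 16-bit word of the TCP header (bytes 12 and 13). The sketch below decodes both from raw header bytes; the function name is hypothetical, while the bit positions follow the standard TCP header layout.

```python
import struct

# Flag bit masks in the low-order bits of header bytes 12-13.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_offset_and_flags(header: bytes):
    """Return (header length in bytes, set of flag names)."""
    (word,) = struct.unpack_from("!H", header, 12)
    header_len = (word >> 12) * 4  # top 4 bits count 32-bit words
    flags = {name for name, bit in FLAG_BITS.items() if word & bit}
    return header_len, flags


# Usage: a minimal 20-byte SYN header (ports, seq, ack, offset/flags,
# window, checksum, urgent pointer) with made-up field values.
syn = struct.pack("!HHIIHHHH", 1121, 23, 1031880193, 0,
                  (5 << 12) | FLAG_BITS["SYN"], 16384, 0, 0)
```

A header length field of 5 corresponds to the 20-byte minimum; the 4-bit field allows at most 15 words, i.e. 60 bytes.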

TCP header fields
• Window Size:
  • Each side of the connection advertises its receiving window size
  • Window size is the maximum number of bytes that a receiver can accept.
  • Maximum window size is 2^16 - 1 = 65,535 bytes
• TCP Checksum:
  • TCP checksum covers both TCP header and TCP data
• Urgent Pointer:
  • Only valid if URG flag is set
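As a rough illustration of how such a checksum works, the sketch below implements the 16-bit one's-complement sum used by the Internet protocols over an arbitrary byte string. It is a simplification: a real TCP checksum is computed over more than the bytes shown here (it also includes a pseudo-header), which this example omits.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum of data (padded to even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

A receiver can verify a segment by running the same sum over the data including the transmitted checksum: an undamaged segment yields 0.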

TCP header fields - Options
• Options - a few examples:
[Figure: table of example TCP options]

TCP header fields
• Options:
  • NOP is used to pad the TCP header to a multiple of 4 bytes
  • Maximum Segment Size:
    • Sets the maximum length of the segments
    • This option can only appear in a SYN segment

TCP
• Service offered by TCP
• TCP Header
→ TCP Connection Establishment and Termination
• Flow control
• Error control
• Congestion control

Connection Management in TCP
• Opening a TCP Connection
• Closing a TCP Connection
• Special Scenarios
• State Diagram

TCP Connection Establishment
• TCP uses a three-way handshake to open a connection:
  (1) ACTIVE OPEN: Client sends a segment with
    • SYN bit set
    • port number of client, port number of server
    • initial sequence number (ISN) of client
  (2) PASSIVE OPEN: Server responds with a segment with
    • SYN bit set
    • initial sequence number of server
    • ACK for ISN of client
  (3) Client acknowledges by sending a segment with:
    • ACK for ISN of server

Three-Way Handshake
[Figure: three-way handshake message sequence]

A Closer Look with tcpdump
1 aida.poly.edu.1121 > mng.poly.edu.telnet: S 1031880193:1031880193(0) win 16384 <mss 1460,nop,wscale 0,nop,timestamp>
2 mng.poly.edu.telnet > aida.poly.edu.1121: S 172488586:172488586(0) ack 1031880194 win 8760 <mss 1460>
3 aida.poly.edu.1121 > mng.poly.edu.telnet: . ack 172488587 win 17520
4 aida.poly.edu.1121 > mng.poly.edu.telnet: P 1031880194:1031880218(24) ack 172488587 win 17520
5 mng.poly.edu.telnet > aida.poly.edu.1121: P 172488587:172488590(3) ack 1031880218 win 8736
6 aida.poly.edu.1121 > mng.poly.edu.telnet: P 1031880218:1031880221(3) ack 172488590 win 17520
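The numbers in this trace follow directly from the handshake rules: each SYN consumes one sequence number, so each side acknowledges the peer's ISN plus one. The sketch below generates the three handshake segments from the two ISNs seen in the trace; the function and dictionary layout are hypothetical and only model the sequence and acknowledgment numbers.

```python
SEQ_MOD = 2 ** 32


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the SeqNo/AckNo values of the three handshake segments."""
    return [
        {"from": "client", "flags": {"SYN"},
         "seq": client_isn, "ack": None},
        {"from": "server", "flags": {"SYN", "ACK"},
         "seq": server_isn, "ack": (client_isn + 1) % SEQ_MOD},
        {"from": "client", "flags": {"ACK"},
         "seq": (client_isn + 1) % SEQ_MOD,
         "ack": (server_isn + 1) % SEQ_MOD},
    ]


# ISNs taken from lines 1 and 2 of the trace.
segments = three_way_handshake(1031880193, 172488586)
```

The resulting ack values, 1031880194 and 172488587, match lines 2 and 3 of the trace, and the client's first data segment on line 4 starts at 1031880194.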

Three-Way Handshake
[Figure: three-way handshake message sequence]

First data segment sequence number
• Note that the data segment following the three-way handshake will start with the sequence number following that of the SYN segment

Why to start with a new ISN
• The problem with starting off each connection with a sequence number of 1 is that it introduces the possibility of segments from different connections getting mixed up.
• Traditionally, each device chose the ISN by making use of a timed counter, like a clock of sorts, that was incremented every 4 microseconds. This counter was initialized when TCP started up and then its value increased by 1 every 4 microseconds until it reached the largest 32-bit value possible (4,294,967,295), at which point it "wrapped around" to 0 and resumed incrementing.
• Period: about 4.8 hours (2^32 × 4 microseconds ≈ 17,180 seconds)
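The wrap-around period of such a counter follows from simple arithmetic: 2^32 ticks of 4 microseconds each. The sketch below computes it and models the traditional clock-driven ISN; `isn_from_clock` is a hypothetical helper for illustration, not how any particular stack generates ISNs.

```python
TICK_US = 4  # counter increments once every 4 microseconds


def isn_from_clock(seconds_since_start: float) -> int:
    """ISN read from a 32-bit counter that ticks every 4 microseconds."""
    ticks = int(seconds_since_start * 1_000_000) // TICK_US
    return ticks % 2 ** 32


# Time for the counter to cycle through all 2^32 values.
wrap_period_s = 2 ** 32 * TICK_US / 1_000_000
wrap_period_h = wrap_period_s / 3600
```

This gives roughly 17,180 seconds, or a little under 4.8 hours, before the counter returns to the same value.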

TCP Connection Termination
• Each end of the data flow must be shut down independently ("half-close")
• If one end is done it sends a FIN segment. This means that no more data will be sent
• Four steps involved:
  (1) X sends a FIN to Y (active close)
  (2) Y ACKs the FIN (at this time, Y can still send data to X)
  (3) and Y sends a FIN to X (passive close)
  (4) X ACKs the FIN.

Connection termination with tcpdump
1 mng.poly.edu.telnet > aida.poly.edu.1121: F 172488734:172488734(0) ack 1031880221 win 8733
2 aida.poly.edu.1121 > mng.poly.edu.telnet: . ack 172488735 win 17484
3 aida.poly.edu.1121 > mng.poly.edu.telnet: F 1031880221:1031880221(0) ack 172488735 win 17520
4 mng.poly.edu.telnet > aida.poly.edu.1121: . ack 1031880222 win 8733

TCP Connection Termination
[Figure: connection termination message sequence]

TCP Half-close
[Figure: message sequence: FIN, ACK of FIN, DATA, ACK of DATA, FIN, ACK of FIN]

MSS
[Figure: hosts A, B and C; one network has MTU = 1500 and the other MTU = 296. The SYN segments carry <mss 1460> and <mss 256>.]
• Default is generally 536 bytes
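The MSS values in this example are consistent with subtracting 40 bytes of headers from each MTU. The sketch below shows that arithmetic under the assumption of a 20-byte IP header and a 20-byte TCP header without options; the function names are hypothetical.

```python
IP_HEADER_BYTES = 20   # assumed: IPv4 header without options
TCP_HEADER_BYTES = 20  # assumed: TCP header without options


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IP_HEADER_BYTES - TCP_HEADER_BYTES


def send_mss(peer_announced_mss: int, local_mtu: int) -> int:
    """Segment size a sender would use: no larger than the peer's
    announced MSS and no larger than its own link allows."""
    return min(peer_announced_mss, mss_for_mtu(local_mtu))
```

With these assumptions an MTU of 1500 yields an MSS of 1460 and an MTU of 296 yields 256, and both directions end up limited to 256-byte segments.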

Difference between TCP connections and connections in a connection-oriented network
• TCP "connections" are not the same as connections in a connection-oriented network
• In a connection-oriented network, a signaling procedure is used to reserve bandwidth for the connection on every link of the end-to-end path (e.g., circuit-switched networks)
• A TCP connection involves the maintenance of state information at the end hosts
  • Purpose is to provide error correction for TCP segments
  • Initial sequence number exchanged to avoid accidentally sending data to an old connection

TCP
• Service offered by TCP
• TCP Header
• TCP Connection Establishment and Termination
→ Flow control
• Error control
• Congestion control

TCP flow control
• Flow Control: How to prevent the sender from overrunning the receiver buffer?
• Flow Control in TCP
  • TCP implements sliding window flow control
  • Window size is usually sent within acknowledgements.

Window Management in TCP
• The receiver returns two parameters to the sender in an ACK: the acknowledgment number (AckNo) and the window size (Win)
• The interpretation is:
  • I am ready to receive new data with SeqNo = AckNo, AckNo+1, …, AckNo+Win-1
• Receiver can acknowledge data without opening the window
• Receiver can change the window size without acknowledging data

TCP Flow Control
• The receive side of a TCP connection has a receive buffer
• The app process may be slow at reading from the buffer
• Flow control: the sender won't overflow the receiver's buffer by transmitting too much, too fast
• Speed-matching service: matching the send rate to the receiving app's drain rate

TCP Flow control: how it works
• (Suppose the TCP receiver discards out-of-order segments)
• Spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including the value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
  • guarantees the receive buffer doesn't overflow
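The spare-room formula can be written as a one-line function. The sketch below uses the slide's variable names in snake case; the buffer size and byte counters in the usage lines are made-up values for illustration.

```python
def rcv_window(rcv_buffer: int, last_byte_rcvd: int,
               last_byte_read: int) -> int:
    """Spare room in the receive buffer, advertised as RcvWindow."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


# A 4096-byte buffer that has received 1024 bytes, none read yet,
# can advertise 3072 bytes of spare room.
window_before_read = rcv_window(4096, 1024, 0)
# Once the application reads those 1024 bytes the full window reopens.
window_after_read = rcv_window(4096, 1024, 1024)
```

The window shrinks as data arrives faster than the application reads it, and reaches 0 when the buffer is full.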

Sliding windows
[Figure: byte positions 1 to 11, … in the stream, divided into regions: sent and acknowledged; sent, not ACKed; usable window (can send ASAP); can't send until window moves. The offered window advertised by the receiver is marked above the stream.]

Sliding Window: Example
[Figure: sliding window example]

Sliding Window: In-class example
[Figure: sender and receiver timelines, each labelled 4 K bytes]
• Receiver advertises win 4096
• Sender sends 1:1025(1024). How many more segments can it send now? 3 segments
• Notation 1:1025(1024): Is 1025 carried in the TCP header? Is 1024 carried in the TCP header? What is 1024?
• Sender sends 1025:2049(1024), 2049:3073(1024), 3073:4097(1024)
• Receiver (1 K) returns ack 1025 win 3072. How many segments can it send now?

Sliding Window: In-class example answers
[Figure: sender and receiver timelines, each labelled 4 K bytes]
• Receiver advertises win 4096
• Sender sends 1:1025(1024). How many more segments can it send now? 3 segments
• Sender sends 1025:2049(1024), 2049:3073(1024), 3073:4097(1024)
• Receiver (1 K) returns ack 1025 win 3072. How many segments can it send now? 0
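Both answers can be checked from the sender's point of view: the window covers sequence numbers AckNo through AckNo+Win-1, and anything already sent inside that range uses up part of it. The function below is a hypothetical sketch of that calculation, assuming fixed-size segments.

```python
def segments_sendable(ack_no: int, win: int, next_seq: int,
                      segment_size: int) -> int:
    """Number of full segments the sender may still transmit.

    ack_no, win: latest values from the receiver.
    next_seq: sequence number of the next new byte the sender would send.
    """
    usable = ack_no + win - next_seq  # bytes of window not yet used
    return max(0, usable) // segment_size
```

After the first segment, the window starts at 1 with size 4096 and the next byte to send is 1025, which leaves room for 3 segments. After "ack 1025 win 3072" the window ends at byte 4096, all of which has already been sent, so the result is 0.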

Silly Window Syndrome
• Let's say that the server is only able to remove 1 byte of data from the buffer for every 3 it receives.
• Let's say it also removes 40 additional bytes from the buffer during the time it takes for the client's next segment to arrive.
• In the worst case, the client then sends a segment with exactly one byte, refilling the buffer until the application draws off the next byte.

TCP
• Service offered by TCP
• TCP Header
• TCP Connection Establishment and Termination
• Flow control
→ Error control
• Congestion control

TCP error control
• ARQ scheme with positive cumulative ACKs
• Delayed ACKs:
  • TCP delays transmission of ACKs for up to 200 ms
  • The hope is to have data ready in that time frame. Then, the ACK can be piggybacked with the data segment.

Delayed ACK timer
• This timer ticks every 200 ms.
• The first timeout occurs based on when the timer was initialized, which is when the system was rebooted.
• The figure below explains why the ACK delay is UP TO 200 ms (and not equal to 200 ms).
[Figure: delayed ACK timer ticks]

TCP Retransmission Timer
• Retransmission Timer:
  • The setting of the retransmission timer is crucial for efficiency
  • Timeout value too small -> results in unnecessary retransmissions
  • Timeout value too large -> long waiting time before a retransmission can be issued
  • A problem is that the delays in the network are not fixed
  • Therefore, the retransmission timers must be adaptive

Measuring TCP Retransmission Timers
• Transfer file from aida to rigoletto
• Unplug Ethernet cable in the middle of file transfer

tcpdump Trace
10:42:01.704681 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
10:42:01.705603 aida.40001 > rigoletto.ftp-data: . 162649:164109(1460) ack 1 win 17520
10:42:01.706753 aida.40001 > rigoletto.ftp-data: . 164109:165569(1460) ack 1 win 17520
10:42:02.741764 aida.40001 > rigoletto.ftp-data: . 161189:162649(1460) ack 1 win 17520
Further timestamps in the trace: 10:42:05.741788, 10:42:11.741828, 10:42:23.741951, 10:42:47.742176, 10:43:35.742587, 10:44:39.743140, 10:45:43.743702, 10:46:47.744271, 10:47:51.752138, 10:48:55.745547, 10:49:59.746123
10:51:03.745839 aida.40001 > rigoletto.ftp-data: R 165569:165569(0) ack 1 win 17520

Interpreting the Measurements
• The interval between retransmission attempts in seconds is: 1.03, 3, 6, 12, 24, 48, 64, 64, 64, 64.
• Time between retransmissions is doubled each time (Exponential Backoff Algorithm)
• Timer is not increased beyond 64 seconds
• TCP gives up after the 13th attempt and 9 minutes (the total timeout, tcp_ip_abort_interval, is 2 minutes in Solaris and can be set by the administrator; 9 minutes is the commonly used older timeout value)
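The intervals can be recomputed from the timestamps in the trace. The sketch below assumes that the first timestamp is the original transmission of the retransmitted segment and that every timestamp from 10:42:02.741764 onward is a later attempt (the last one being the reset); that pairing is an interpretation of the trace, not something the trace states explicitly.

```python
from datetime import datetime

# First transmission of the segment, then each later attempt.
STAMPS = [
    "10:42:01.704681", "10:42:02.741764", "10:42:05.741788",
    "10:42:11.741828", "10:42:23.741951", "10:42:47.742176",
    "10:43:35.742587", "10:44:39.743140", "10:45:43.743702",
    "10:46:47.744271", "10:47:51.752138", "10:48:55.745547",
    "10:49:59.746123", "10:51:03.745839",
]


def intervals(stamps: list) -> list:
    """Seconds between consecutive timestamps."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


gaps = intervals(STAMPS)
rounded = [round(g) for g in gaps]
total_minutes = sum(gaps) / 60
```

The first gap is about 1.04 seconds; after that each gap is double the previous one until it is capped at 64 seconds, and the whole sequence spans roughly 9 minutes.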

TCP timers
• The first timeout occurs based on when the timer was initialized.
• This explains why the first timeout occurs at 1.03 sec and not 1.5.
• If the base timer clock is 500 ms, the first timeout occurs after 3 timer ticks. This happens to occur at 1.03 sec after the first segment was sent. Subsequent retransmissions occur at 3 sec, 6 sec, 12 sec, etc.

Adaptive mechanism
• The retransmission mechanism of TCP is adaptive
• The retransmission timers are set based on round-trip time (RTT) measurements that TCP performs
• The RTT is based on the time difference between segment transmission and ACK
• But:
  • TCP does not ACK each segment
  • Can't start a second RTT measurement if timing on one segment is in progress
  • Each connection has only one timer

Computation of RTO in adaptive scheme
• Retransmission timer is set to a Retransmission Timeout (RTO) value.
• RTO is calculated based on the RTT measurements.
• The RTT measurements (M) are smoothed by the following estimators A (mean RTT value) and D (smoothed mean deviation of RTT):
  $Err = M - A$
  $A \leftarrow A + g \cdot Err = (1 - g)A + gM$
  $D \leftarrow D + h(|Err| - D) = (1 - h)D + h|Err|$
  $RTO = A + 4D$
• The gains are set to h = 1/4 and g = 1/8
  • In the formula for computing the new smoothed mean RTT A, 0.125 times the newly measured value (M) is added to 0.875 times the old smoothed value of A
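The update rules translate almost line for line into code. The sketch below keeps A and D as fields of a small class and returns the new RTO after each measurement; the class name and interface are hypothetical, while the gains are the ones given on the slide.

```python
from dataclasses import dataclass


@dataclass
class RtoEstimator:
    a: float          # smoothed mean RTT (A)
    d: float          # smoothed mean deviation (D)
    g: float = 1 / 8  # gain for A
    h: float = 1 / 4  # gain for D

    @property
    def rto(self) -> float:
        return self.a + 4 * self.d

    def update(self, m: float) -> float:
        """Fold in a new RTT measurement M and return the new RTO."""
        err = m - self.a
        self.a += self.g * err
        self.d += self.h * (abs(err) - self.d)
        return self.rto


# Usage: initial values A = 1 and D = 1, then a measured RTT of 2.
estimator = RtoEstimator(a=1.0, d=1.0)
new_rto = estimator.update(2.0)
```

With A = 1, D = 1 and M = 2, the error is 1, A becomes 1.125, D stays at 1, and the RTO is 5.125.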

In-class example
• Assume A=1, D=1 (initial values)
[Figure: segment exchange with three points marked RTO = ?]

Example of RTO computation (adaptive)
• Assume A=1, D=1 (initial values)
  • Err = 2 - 1 = 1 (since M, the measured RTT, is 2)
  • A = 1 + 0.125 × 1 = 1.125; D = 1 + 0.25 (1 - 1) = 1
  • RTO = A + 4D = 1.125 + 4 = 5.125
  • This is why in the figure below when segment 2 is lost, it is retransmitted after 5.125 sec.

In-class example
• Assume A=1, D=1 (initial values)
• RTO = A + 4D = 5.125 (adaptive: new A = 1.125; D = 1)
• RTO = 10.25 (doubling)
• RTO = 10.25 (Karn's algorithm)
• The retransmission happens after 5.125 sec since that is the retransmission timer value

Karn's Algorithm
• If an ACK for a retransmitted segment is received, the sender cannot tell if the ACK belongs to the original or the retransmission.
• The RTT measurement started for the original transmission should be terminated.
• There will be no RTT measurement for the original or retransmitted segment
• Therefore A and D cannot be updated when the ACK is received, and hence no new RTO computation at this point.
• Don't confuse this with the RTO being doubled when the segment is retransmitted following the exponential doubling rule.
  • RTT measurement is suspended
  • RTO is doubled
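The two rules (doubling on retransmission, no estimator update from an ambiguous ACK) can be combined in a small timer model. The sketch below is a simplified illustration with hypothetical names; the starting values A = 2 and D = 1 and the 64-second cap are borrowed from other slides in this deck.

```python
class KarnTimer:
    """Toy retransmission timer combining backoff and Karn's rule."""

    def __init__(self, a: float, d: float, max_rto: float = 64.0):
        self.a, self.d, self.max_rto = a, d, max_rto
        self.rto = a + 4 * d

    def on_timeout(self) -> None:
        """Segment is retransmitted: double the RTO, up to the cap."""
        self.rto = min(2 * self.rto, self.max_rto)

    def on_ack(self, measured_rtt: float, was_retransmitted: bool) -> None:
        """Update A, D and RTO only if the sample is unambiguous."""
        if was_retransmitted:
            return  # Karn: no RTT sample, so the RTO is left as it is
        err = measured_rtt - self.a
        self.a += err / 8
        self.d += (abs(err) - self.d) / 4
        self.rto = self.a + 4 * self.d


timer = KarnTimer(a=2.0, d=1.0)   # RTO starts at 6 seconds
timer.on_timeout()                 # retransmission: RTO doubles to 12
timer.on_ack(3.0, was_retransmitted=True)  # ambiguous ACK: RTO stays 12
```

Only an ACK for a segment that was sent exactly once feeds a new measurement into A and D.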

In-class example
• At t1: RTO = 6 sec; A = 2; D = 1
• At t2: RTO = ?
• At t3: RTO = ?
[Figure: segment exchange timeline with an interval of 3 sec marked]

In-class example
• At t1: RTO = 6 sec; A = 2; D = 1
• At t2: RTO = 12 sec (doubling)
• At t3: RTO = 12 sec (Karn's algorithm)
[Figure: segment exchange timeline with an interval of 3 sec marked]

Thus there are two schemes for determining RTO and two schemes for controlling RTT measurement
• RTO
  • Exponential backoff if a segment is retransmitted
  • Adaptive RTO as a function of RTT (A + 4D)
• RTT measurement
  • Karn's algorithm
    • no RTT measurement on a retransmitted segment
  • Can't start a second RTT measurement if timing on one segment is in progress
    • if an RTT measurement is in progress and a new segment is sent, then no RTT measurement is taken for the new segment