Transport Layer
• Transport Layer Services
  – connection-oriented vs. connectionless
  – multiplexing and demultiplexing
• UDP: Connectionless Unreliable Service
• TCP: Connection-Oriented Reliable Service
  – connection management: set-up and tear down
  – reliable data transfer protocols
  – flow and congestion control
Readings: Chapter 5 (5.1, 5.2)
Csci 183/183W/232 – Computer Networks: Transport Layer & TCP

Transport Protocols
• Lowest-level end-to-end protocol.
  – Header generated by sender is interpreted only by the destination
  – Routers view transport header as part of the payload
[Figure: protocol stacks of two end hosts (layers 1 to 7, with Transport above IP) connected through a router whose stack contains only Physical, Datalink, and IP]

Transport Services and Protocols
• provide logical communication between app processes running on different hosts
• transport protocols run in end systems
  – send side: breaks app messages into segments, passes to network layer
  – rcv side: reassembles segments into messages, passes to app layer
• more than one transport protocol available to apps
  – Internet: TCP and UDP
[Figure: logical end-to-end transport between two end systems, each with an application/transport/network/data link/physical stack; intermediate nodes have only network/data link/physical]

Transport Layer Services
• Underlying best-effort network
  – drops messages
  – re-orders messages
  – delivers duplicate copies of a given message
  – delivers messages after an arbitrarily long delay
• Common end-to-end services
  – guarantee message delivery
  – deliver messages in the same order they are sent
  – deliver at most one copy of each message
  – allow the receiver to flow control the sender
  – support multiple application processes on each host

Transport vs. Application and Network Layer
• application layer: application processes and message exchange
• network layer: logical communication between hosts
• transport layer: logical communication support for app processes
  – relies on, enhances, network layer services
Household analogy: 12 kids sending letters to 12 kids
• processes = kids
• app messages = letters in envelopes
• hosts = houses
• transport protocol = Ann and Bill
• network-layer protocol = postal service

End to End Issues
• Transport services built on top of (potentially) unreliable network service
  – packets can be corrupted or lost
  – packets can be delayed or arrive “out of order”
• Do we detect and/or recover errors for apps?
  – Error Control & Reliable Data Transfer
• Do we provide “in-order” delivery of packets?
  – Connection Management & Reliable Data Transfer
• Potentially different capacity at destination, and potentially different network capacity
  – Flow and Congestion Control

Internet Transport Protocols
TCP service:
• connection-oriented: setup required between client, server
• reliable transport between sender and receiver
• flow control: sender won’t overwhelm receiver
• congestion control: throttle sender when network overloaded
UDP service:
• unreliable data transfer between sender and receiver
• does not provide: connection setup, reliability, flow control, congestion control
Both provide logical communication between app processes running on different hosts!

Multiplexing/Demultiplexing
• Multiplexing at send host: gathering data from multiple app processes, enveloping data with header (later used for demultiplexing)
• Demultiplexing at rcv host: delivering received segments to correct application process
[Figure: hosts 1, 2, and 3, each with an application/transport/network/link/physical stack; processes P1 to P4 attach to the transport layer through sockets (the API)]

How Demultiplexing Works
• host receives IP datagrams
  – each datagram has source IP address, destination IP address
  – each datagram carries 1 transport-layer segment
  – each segment has source, destination port number (recall: well-known port numbers for specific applications)
• host uses IP addresses & port numbers to direct segment to appropriate app process (identified by “socket”)
TCP/UDP segment format (32 bits wide):
• source port #, dest port #
• other header fields
• application data (message)
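The lookup described above can be sketched as a pair of tables. This is a minimal illustration, not how any particular operating system stores its sockets: the table names, the socket labels, and the choice to key UDP on (destination IP, destination port) and TCP on the full 4-tuple are assumptions made for the example.

```python
from typing import Dict, Tuple

# Hypothetical socket tables; names and entries are illustrative only.
udp_sockets: Dict[Tuple[str, int], str] = {
    ("128.101.35.150", 53): "dns-server",
}
tcp_sockets: Dict[Tuple[str, int, str, int], str] = {
    ("70.13.155.114", 1414, "128.101.35.150", 22): "ssh-session-1",
}

def demux(proto: str, src_ip: str, src_port: int,
          dst_ip: str, dst_port: int) -> str:
    """Return the label of the socket that should receive the segment."""
    if proto == "UDP":
        # Assumed rule: UDP looks only at the destination address and port.
        return udp_sockets.get((dst_ip, dst_port), "no-socket")
    # Assumed rule: TCP looks at source and destination address and port.
    return tcp_sockets.get((src_ip, src_port, dst_ip, dst_port), "no-socket")
```

Two TCP segments aimed at the same destination port but coming from different source ports land on different sockets here, while any UDP sender reaches the same UDP socket.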

UDP: User Datagram Protocol [RFC 768]
• “no frills,” “bare bones” Internet transport protocol
• “best effort” service, UDP segments may be:
  – lost
  – delivered out of order to app
• connectionless:
  – no handshaking between UDP sender, receiver
  – each UDP segment handled independently of others
Why is there a UDP?
• no connection establishment (which can add delay)
• simple: no connection state at sender, receiver
• small segment header
• no congestion control: UDP can blast away as fast as desired

UDP (cont’d)
• often used for streaming multimedia apps
  – loss tolerant
  – rate sensitive
• other UDP users
  – DNS
  – SNMP
• reliable transfer over UDP: add reliability at application layer
  – application-specific error recovery!
UDP segment format (32 bits wide):
• source port #, dest port #
• length (in bytes of UDP segment, including header), checksum
• application data (message)
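To make the segment format concrete, here is a small sketch that packs and unpacks the four 16-bit header fields with Python's standard `struct` module. The function names are made up for this example, and the checksum is left as a caller-supplied value rather than computed.

```python
import struct

def build_udp_segment(src_port: int, dst_port: int, payload: bytes,
                      checksum: int = 0) -> bytes:
    """Prepend the 8-byte UDP header (network byte order) to the payload."""
    length = 8 + len(payload)  # length covers header plus data, in bytes
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + payload

def parse_udp_header(segment: bytes):
    """Return (src_port, dst_port, length, checksum) from the first 8 bytes."""
    return struct.unpack("!HHHH", segment[:8])
```

For instance, `build_udp_segment(5000, 53, b"hello")` yields a 13-byte segment whose length field is 13.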

UDP Checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
Sender:
• treat segment contents as sequence of 16-bit integers
• checksum: addition (1’s complement sum) of segment contents
• sender puts checksum value (1’s complement of 1’s complement sum of 16-bit words) into UDP checksum field
Receiver:
• compute checksum of received segment
• check if computed checksum equals checksum field value:
  – NO - error detected
  – YES - no error detected. But maybe errors nonetheless? More later ….

Checksum: Example
• arrange data segment in sequences of 16-bit words
  + 01100110 1101010101 00001111
  sum: 0100101011001011
  checksum (1’s complement): 1011010100110100
• verify by adding sum and checksum: 1111111111111111
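The checksum procedure can be sketched in a few lines. The three 16-bit operands below are assumed values, chosen only because they reproduce the sum and checksum printed in the example; the slide's own operands are not fully legible here.

```python
def ones_complement_sum(words):
    """Add 16-bit words, wrapping any carry out of bit 15 back into bit 0."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total

def internet_checksum(words):
    """1's complement of the 1's complement sum, as the sender computes it."""
    return ~ones_complement_sum(words) & 0xFFFF

# Assumed operands (not taken from the slide).
words = [0x6666, 0x5555, 0x8F0F]
s = ones_complement_sum(words)
c = internet_checksum(words)
print(format(s, "016b"))  # 0100101011001011
print(format(c, "016b"))  # 1011010100110100
```

A receiver can run the same sum over the words plus the checksum field; a result of sixteen 1 bits means no error was detected.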

TCP Overview
• Connection-oriented
• Byte-stream
  – app writes bytes
  – TCP sends segments
  – app reads bytes
• Full duplex
• Flow control: keep sender from overrunning receiver
• Congestion control: keep sender from overrunning network
[Figure: application process writes bytes into the TCP send buffer; TCP transmits segments; the receiving TCP places them in its receive buffer, from which the application process reads bytes]

Functionality Split
• Network provides best-effort delivery
• End-systems implement many functions
  – Reliability
  – In-order delivery
  – Demultiplexing
  – Message boundaries
  – Connection abstraction
  – Flow control
  – Congestion control
  – …

High-Level TCP Characteristics
• Protocol implemented entirely at the ends
  – Fate sharing
• Protocol has evolved over time and will continue to do so
  – Nearly impossible to change the header
  – Use options to add information to the header
  – Change processing at endpoints
  – Backward compatibility is what makes it TCP

Evolution of TCP
• 1974: TCP described by Vint Cerf and Bob Kahn in IEEE Trans Comm
• 1975: Three-way handshake, Raymond Tomlinson, in SIGCOMM 75
• 1982: TCP & IP, RFC 793 & 791
• 1983: BSD Unix 4.2 supports TCP/IP
• 1984: Nagle’s algorithm to reduce overhead of small packets; predicts congestion collapse
• 1986: Congestion collapse observed
• 1987: Karn’s algorithm to better estimate round-trip time
• 1988: Van Jacobson’s algorithms: congestion avoidance and congestion control (most implemented in 4.3 BSD Tahoe)
• 1990: 4.3 BSD Reno: fast retransmit, delayed ACKs

TCP Through the 1990s
• 1993: TCP Vegas (Brakmo et al), real congestion avoidance
• 1994: T/TCP (Braden), Transaction TCP
• 1994: ECN (Floyd), Explicit Congestion Notification
• 1996: SACK TCP (Floyd et al), Selective Acknowledgement
• 1996: Hoe, improving TCP startup
• 1996: FACK TCP (Mathis et al), extension to SACK

TCP Segment Header Structure
Header fields, in order (rows are 32 bits wide):
• source port #, dest port #
• sequence number
• acknowledgement number
  – both count by bytes of data (not segments!)
• head len, not used, flag bits U A P R S F, rcvr window size
  – URG: urgent data (generally not used)
  – ACK: ACK # valid
  – PSH: push data now (generally not used)
  – RST, SYN, FIN: connection estab (setup, teardown commands)
  – rcvr window size: # bytes rcvr willing to accept
• checksum (Internet checksum, as in UDP), ptr urgent data
• Options (variable length)
• application data (variable length)
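As a rough illustration of this layout, the sketch below unpacks the fixed 20-byte part of a TCP header with the standard `struct` module. The function name, the dictionary keys, and the sample values are assumptions for the example; options and data are ignored.

```python
import struct

# Flag bit positions within the 16-bit "head len / not used / flags" field.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def parse_tcp_header(data: bytes) -> dict:
    """Decode the first 20 bytes of a TCP segment (no options handling)."""
    (src, dst, seq, ack, off_flags,
     window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", data[:20])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": (off_flags >> 12) * 4,  # head len is in 32-bit words
        "flags": sorted(n for n, bit in FLAG_BITS.items() if off_flags & bit),
        "window": window, "checksum": checksum, "urg_ptr": urg_ptr,
    }

# Assumed sample: a SYN from port 1414 to port 22 with made-up window/checksum.
sample = struct.pack("!HHIIHHHH", 1414, 22, 758244755, 0,
                     (5 << 12) | FLAG_BITS["SYN"], 16384, 0, 0)
print(parse_tcp_header(sample)["flags"])  # ['SYN']
```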

TCP Segment Format (cont)
• Each connection identified with 4-tuple:
  – (SrcPort, SrcIPAddr, DstPort, DstIPAddr)
• Sliding window + flow control
  – acknowledgment, SequenceNum, AdvertisedWindow
  – [Figure: Sender sends Data(SequenceNum) to Receiver; Receiver returns Acknowledgment + AdvertisedWindow]
• Flags
  – SYN, FIN, ACK, RESET, PUSH, URG
• Checksum
  – pseudo header (src & dst IP addresses) + TCP header + data

TCP Connection Set Up
TCP sender, receiver establish “connection” before exchanging data segments
• initialize TCP variables:
  – seq. #s
  – buffers, flow control info
• client: end host that initiates connection
• server: end host contacted by client
Three way handshake:
Step 1: client sends TCP SYN control segment to server
  – specifies initial seq #
Step 2: server receives SYN, replies with SYN+ACK control segment
  – ACKs received SYN
  – specifies server receiver initial seq. #
Step 3: client receives SYN+ACK, replies with ACK segment (which may contain 1st data segment)

TCP 3-Way Hand-Shake
• client (initiate connection) → server: SYN, seq=x
• server (SYN received) → client: SYN+ACK, seq=y, ack=x+1
• client (connection established) → server: ACK, seq=x+1, ack=y+1 (1st data segment)
• server: connection established
Question:
a. What kind of “state” do client and server need to maintain?
b. What initial sequence # should client (and server) use?
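The sequence and acknowledgement arithmetic of the exchange can be sketched as follows. The function name and the dictionary representation of a segment are invented for the example; real segments carry many more fields.

```python
def three_way_handshake(x: int, y: int):
    """Return the SYN, SYN+ACK, and ACK segments for initial seq #s x and y."""
    mod = 2 ** 32  # sequence numbers are 32-bit and wrap around
    syn = {"flags": "SYN", "seq": x}
    syn_ack = {"flags": "SYN+ACK", "seq": y, "ack": (x + 1) % mod}
    ack = {"flags": "ACK", "seq": (x + 1) % mod, "ack": (y + 1) % mod}
    return [syn, syn_ack, ack]

# Initial sequence numbers taken from the connection setup trace that follows.
for seg in three_way_handshake(758244755, 3778406755):
    print(seg)
```

With those two initial sequence numbers, the computed Ack values are 758244756 and 3778406756, matching packets 2 and 3 of the trace.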

TCP Connection Setup Example
No.  Time       Source          Destination     Proto  SrcPort > DstPort  [Flags]
1    13.734375  70.13.155.114   128.101.35.150  TCP    1414 > 22          [SYN]       Seq=758244755 Len=0 MSS=1260
2    13.968750  128.101.35.150  70.13.155.114   TCP    22 > 1414          [SYN, ACK]  Seq=3778406755 Ack=758244756 Win=25200 Len=0 MSS=1460
3    13.968750  70.13.155.114   128.101.35.150  TCP    1414 > 22          [ACK]       Seq=758244756 Ack=3778406756 Win=16384 Len=0

TCP Connection Setup Example
No.  Time        Source          Destination     Proto  SrcPort > DstPort  [Flags]
1    13.6611233  70.13.155.114   128.101.35.204  TCP    1567 > 80          [SYN]       Seq=3724852786 Len=0 MSS=1260
2    13.890625   128.101.35.204  70.13.155.114   TCP    80 > 1567          [SYN, ACK]  Seq=484733971 Ack=3724852787 Win=25200 Len=0 MSS=1460
3    13.890625   70.13.155.114   128.101.35.204  TCP    1567 > 80          [ACK]       Seq=3724852787 Ack=484733972 Win=17640 Len=0
4    13.890625   70.13.155.114   128.101.35.204  TCP    1567 > 80          [PSH, ACK]  Seq=3724852787 Ack=484733972 Win=17640 Len=564
5    14.630860   128.101.35.204  70.13.155.114   TCP    80 > 1567          [ACK]       Seq=484733972 Ack=3724853351 Win=25200 Len=0 MSS=1460

Connection Setup Error Scenarios
• Lost (control) packets
  – What happens if SYN is lost? client vs. server actions
  – What happens if SYN+ACK is lost? client vs. server actions
  – What happens if ACK is lost? client vs. server actions
• Duplicate (control) packets
  – What does server do if duplicate SYN received?
  – What does client do if duplicate SYN+ACK received?
  – What does server do if duplicate ACK received?

Connection Setup Error Scenarios (cont’d)
• Importance of (unique) initial seq. no.?
  – When receiving SYN, how does server know it’s a new connection request?
  – When receiving SYN+ACK, how does client know it’s legitimate, i.e., a response to its SYN request?
• Dealing with old duplicate packets from old connections (or from malicious users)
  – If not careful: “TCP Hijacking”
• How to choose unique initial seq. no.?
  – randomly choose a number (and add to last syn# used)
• Other security concern:
  – “SYN Flood”: denial-of-service attack

Detecting Half-Open Connections
    TCP A                                        TCP B
1.  (CRASH)                                      (send 300, receive 100)
2.  CLOSED                                       ESTABLISHED
3.  SYN-SENT --> <SEQ=400><CTL=SYN>          --> (??)
4.  (!!)     <-- <SEQ=300><ACK=100><CTL=ACK> <-- ESTABLISHED
5.  SYN-SENT --> <SEQ=100><CTL=RST>          --> (Abort!!)
6.  SYN-SENT                                     CLOSED
7.  SYN-SENT --> <SEQ=400><CTL=SYN>          -->

TCP State Diagram: Connection Setup
Transitions (event / action):
• CLOSED → LISTEN (server): passive OPEN / create TCP
• CLOSED → SYN SENT (client): active OPEN / create TCP, snd SYN
• LISTEN → SYN RCVD: rcv SYN / snd SYN, ACK
• LISTEN → SYN SENT: SEND / snd SYN
• SYN SENT → SYN RCVD: rcv SYN / snd ACK
• SYN SENT → ESTAB: rcv SYN, ACK / snd ACK
• SYN RCVD → ESTAB: rcv ACK of SYN
• LISTEN → CLOSED and SYN SENT → CLOSED: CLOSE / delete TCP
• SYN RCVD: CLOSE / send FIN

TCP: Closing Connection
Remember TCP duplex connection!
Client wants to close connection:
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. half closed
Step 3: client receives ACK. half closed, wait for server to close
Server finishes sending data, also ready to close:
Step 4: server sends FIN.
[Figure: client closing sends FIN; server replies with ACK (half closed on both sides); server closing sends FIN]

TCP: Closing Connection (cont’d)
Step 5: client receives FIN, replies with ACK. connection fully closed
Step 6: server receives ACK. connection fully closed
[Figure: client closing sends FIN, server ACKs (half closed); server closing sends FIN, client ACKs (full closed on both sides)]
Well Done! Problem Solved?

TCP: Closing Connection (revised)
Two Army Problem!
Step 5: client receives FIN, replies with ACK.
  – Enters “timed wait”: will respond with ACK to received FINs
Step 6: server receives ACK. connection fully closed
Step 7: client, timer expires, connection fully closed
[Figure: client closing sends FIN, server ACKs (half closed); server closing sends FIN; the client's ACK is lost (X), the server's FIN times out and is resent, and the client, still in timed wait, ACKs again before both sides reach full closed]

TCP Connection Tear-Down Example
No.  Time       Source          Destination     Proto  SrcPort > DstPort  [Flags]
80   35.156250  70.13.155.114   128.101.35.150  TCP    1414 > 22          [PSH, ACK]  Seq=758246388 Ack=3778411633 Win=15920 Len=32
81   35.156250  70.13.155.114   128.101.35.150  TCP    1414 > 22          [FIN, ACK]  Seq=758246420 Ack=3778411633 Win=15920 Len=0
82   35.437500  128.101.35.150  70.13.155.114   TCP    22 > 1414          [ACK]       Seq=3778411633 Ack=758246420 Win=25200 Len=0
83   35.453125  128.101.35.150  70.13.155.114   TCP    22 > 1414          [ACK]       Seq=3778411633 Ack=758246421 Win=25200 Len=0
84   35.453125  128.101.35.150  70.13.155.114   TCP    22 > 1414          [FIN, ACK]  Seq=3778411633 Ack=758246421 Win=25200 Len=0
85   35.453125  70.13.155.114   128.101.35.150  TCP    1414 > 22          [ACK]       Seq=758246421 Ack=3778411634 Win=15920 Len=0

State Diagram: Connection Tear-down
Transitions (event / action):
• Active Close
  – ESTAB → FIN WAIT-1: CLOSE / send FIN
  – FIN WAIT-1 → FIN WAIT-2: rcv ACK of FIN
  – FIN WAIT-1 → CLOSING: rcv FIN / snd ACK
  – FIN WAIT-1 → TIME WAIT: rcv FIN+ACK / snd ACK
  – FIN WAIT-2 → TIME WAIT: rcv FIN / snd ACK
  – CLOSING → TIME WAIT: rcv ACK of FIN
  – TIME WAIT → CLOSED: Timeout=2 min / delete TCP
• Passive Close
  – ESTAB → CLOSE WAIT: rcv FIN / send ACK
  – CLOSE WAIT → LAST-ACK: CLOSE / snd FIN
  – LAST-ACK → CLOSED: rcv ACK of FIN

TCP Connection Management FSM
[Figure: TCP client lifecycle]

TCP Connection Management FSM
[Figure: TCP server lifecycle]

Reliability and Error Recovery
• ARQ vs. FEC
  – automatic retransmission request
  – forward error correction
• General ARQ Algorithms
  – Stop & Wait
    • Performance issue: low utilization when delay-bw product large
  – Sliding Window Protocols
    • Go-Back-N
    • Selective Repeat
    • Key design issues: window size vs. size of seq. no. space

Error Recovery: Stop and Wait
• ARQ
  – Receiver sends acknowledgement (ACK) when it receives packet
  – Sender waits for ACK and times out if it does not arrive within some time period
• Simplest ARQ protocol
• Send a packet, stop and wait until ACK arrives
[Figure: sender/receiver timeline: Packet sent, ACK returned within the Timeout interval]

Recovering from Error
[Figure: three stop-and-wait timelines labelled ACK lost, Packet lost, and Early timeout; in each, the sender's timeout fires and the packet is sent again]
DUPLICATE PACKETS!!!

Problems with Stop and Wait
• How to recognize a duplicate
• Performance
  – Can only send one packet per round trip

How to Recognize Resends?
• Use sequence numbers
  – both packets and acks
• Sequence # in packet is finite. How big should it be?
  – For stop and wait?
    • One bit: won’t send seq #1 until received ACK for seq #0
[Figure: timeline with Pkt 0, ACK 0, Pkt 0 (resent), ACK 0, Pkt 1, ACK 1]

Problem with Stop & Wait Protocol
[Figure: sender/receiver timeline: first packet bit transmitted at t = 0 (data of L bytes); first packet bit arrives; one RTT later the ACK arrives and the next packet is sent at t = RTT + L / R]
• Can’t keep the pipe full
  – Utilization is low when bandwidth-delay product (R x RTT) is large!

Stop & Wait: Performance Analysis
Example: 1 Gbps connection, 15 ms end-end prop. delay, data segment size: 1 KB = 8 Kb
– U sender: utilization, i.e., fraction of time sender busy sending
– 1 KB data segment every 30 msec (round trip time) --> 0.027% x 1 Gbps = 33 kB/sec throughput over 1 Gbps link
Moral of story: network protocol limits use of physical resources!
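The numbers in this example can be reproduced with a short calculation. This sketch assumes the usual stop-and-wait model, in which one segment of L bits is sent every RTT + L/R seconds, and it treats 1 KB as 8000 bits; the function and variable names are illustrative.

```python
def stop_and_wait(l_bits: float, r_bps: float, rtt_s: float):
    """Return (sender utilization, throughput in bits/sec) for stop-and-wait."""
    t_tx = l_bits / r_bps                # time to push one segment onto the link
    cycle = rtt_s + t_tx                 # one segment per cycle
    return t_tx / cycle, l_bits / cycle

u, thr = stop_and_wait(8000, 1e9, 0.030)  # 8 Kb segment, 1 Gbps, 30 ms RTT
print(f"utilization = {u * 100:.3f}%")
print(f"throughput  = {thr / 8 / 1000:.1f} kB/sec")
```

Under these assumptions the utilization comes out near 0.027% and the throughput near 33 kB/sec, in line with the figures quoted above.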

How to Keep the Pipe Full?
• Send multiple packets without waiting for first to be acked
  – Number of pkts in flight = window
• Reliable, unordered delivery
  – Several parallel stop & waits
  – Send new packet after each ack
  – Sender keeps list of unack’ed packets; resends after timeout
  – Receiver same as stop & wait
• How large a window is needed?
  – Suppose 10 Mbps link, 4 ms delay, 500 byte pkts
    • 1? 10? 20?
  – Round trip delay * bandwidth = capacity of pipe
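The window question can be worked through numerically. The sketch below assumes the 4 ms figure is a one-way delay, so the round trip is 8 ms; if 4 ms were already the round-trip time, the answer would halve. Integer arithmetic is used to avoid floating-point rounding, and the function name is made up.

```python
def window_packets(bandwidth_bps: int, one_way_delay_us: int,
                   pkt_bytes: int) -> int:
    """Packets needed in flight to fill the pipe (round trip delay * bandwidth)."""
    rtt_us = 2 * one_way_delay_us                     # assumption: symmetric path
    pipe_bits = bandwidth_bps * rtt_us // 1_000_000   # capacity of the pipe
    pkt_bits = pkt_bytes * 8
    return -(-pipe_bits // pkt_bits)                  # ceiling division

print(window_packets(10_000_000, 4000, 500))  # 20
```

Here the pipe holds 80,000 bits, which is 20 packets of 500 bytes.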

Pipelined (Sliding Window) Protocols
Pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged data segments
  – range of sequence numbers must be increased
  – buffering at sender and/or receiver
• Two generic forms of pipelined protocols: Go-Back-N and Selective Repeat

Pipelining: Increased Utilization
[Figure: sender/receiver timeline with three pipelined packets: first packet bit transmitted at t = 0; last bit transmitted at t = L / R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet at t = RTT + L / R]
Increase utilization by a factor of 3!

Sliding Window
• Reliable, ordered delivery
• Receiver has to hold onto a packet until all prior packets have arrived
  – Why might this be difficult for just parallel stop & wait?
  – Sender must prevent buffer overflow at receiver
• Circular buffer at sender and receiver
  – Packets in transit ≤ buffer size
  – Advance when sender and receiver agree packets at beginning have been received

Sender/Receiver State
[Figure: sender sequence space marked with Max ACK received and Next seqnum, divided into Sent & Acked, Sent Not Acked, OK to Send, and Not Usable (sender window); receiver sequence space marked with Next expected and Max acceptable, divided into Received & Acked, Acceptable Packet, and Not Usable (receiver window)]

Window Sliding – Common Case
• On reception of new ACK (i.e., ACK for something that was not acked earlier)
  – Increase sequence of max ACK received
  – Send next packet
• On reception of new in-order data packet (next expected)
  – Hand packet to application
  – Send cumulative ACK: acknowledges reception of all packets up to sequence number
  – Increase sequence of max acceptable packet

Loss Recovery
• Go-Back-N recovery
  – Set timer upon transmission of each packet
  – Cumulative ACK
  – Retransmit all unacknowledged packets
  – No receiver buffering, out-of-order packets are discarded
• Selective Repeat
  – Sender keeps a timer for each packet
  – Selective ACK
  – Receiver must buffer all out-of-order packets
  – When timeout, retransmit only one packet
• Performance during loss recovery
  – No longer have an entire window in transit
  – Can have much more clever loss recovery

Go-Back-N in Action
[Figure: Go-Back-N sender/receiver timeline]

Selective Repeat
• Receiver individually acknowledges all correctly received pkts
  – Buffers packets, as needed, for eventual in-order delivery to upper layer
• Sender only resends packets for which ACK not received
  – Sender timer for each unACKed packet
• Sender window
  – N consecutive seq #’s
  – Again limits seq #s of sent, unACKed packets

Selective Repeat: Sender, Receiver Windows
[Figure: sender and receiver views of the sequence number space under Selective Repeat]

Sequence Numbers
• How large does the sequence number space need to be?
  – Must be able to detect wrap-around
  – Depends on sender/receiver window size
• E.g.
  – size of seq. no. space = 8, send win = recv win = 7
  – If pkts 0..6 are sent successfully and all acks lost
    • Receiver expects 7, 0..5, sender retransmits old 0..6!!!
• size of sequence no. space must be ≥ send window + recv window
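The failure case in this example can be checked mechanically. The sketch below models only the one scenario described: the sender transmits a full window starting at sequence number 0, every packet arrives, and every ACK is lost. The function names are invented for illustration.

```python
def receiver_window(next_expected: int, win: int, space: int):
    """Sequence numbers the receiver will currently accept."""
    return [(next_expected + i) % space for i in range(win)]

def ambiguous(space: int, send_win: int, recv_win: int) -> bool:
    """True if retransmitted old packets could be mistaken for new ones."""
    old = set(range(send_win))  # packets the sender will resend after timeout
    new = set(receiver_window(send_win % space, recv_win, space))
    return bool(old & new)

print(receiver_window(7, 7, 8))  # [7, 0, 1, 2, 3, 4, 5]
print(ambiguous(8, 7, 7))        # True: old 0..5 look like new packets
print(ambiguous(8, 4, 4))        # False: 8 >= 4 + 4
```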

Sequence Numbers in TCP
• TCP regards data as a “byte-stream”
  – each byte in byte stream is numbered.
    • 32 bit value, wraps around
    • initial values selected at start up time
• TCP breaks up byte stream in packets.
  – Packet size is limited to the Maximum Segment Size (MSS)
• Each packet has a sequence number
  – seq. no of 1st byte indicates where it fits in the byte stream
• TCP connection is duplex
  – data in each direction has its own sequence numbers
[Figure: byte-stream positions 13450, 14950, 16050, and 17550 marking the boundaries of packet 8, packet 9, and packet 10]

TCP Seq. #’s and ACKs
Seq. #’s: byte stream “number” of first byte in segment’s data
ACKs: seq # of next byte expected from other side
Simple telnet scenario:
• Host A → Host B: Seq=42, ACK=79, data = ‘C’ (user types ‘C’)
• Host B → Host A: Seq=79, ACK=43, data = ‘C’ (host ACKs receipt of ‘C’, echoes back ‘C’)
• Host A → Host B: Seq=43, ACK=80 (host ACKs receipt of echoed ‘C’)

TCP Reliable Data Transfer
• TCP creates reliable data transfer service on top of IP’s unreliable service
• Pipelined segments
• Cumulative ACKs
• TCP uses single retransmission timer
• Retransmissions are triggered by:
  – timeout events
  – duplicate acks
• Initially consider simplified TCP sender:
  – ignore duplicate acks
  – ignore flow control, congestion control

TCP = Go-Back-N Variant
• Sliding window with cumulative acks
  – Receiver can only return a single “ack” sequence number to the sender.
  – Acknowledges all bytes with a lower sequence number
  – Starting point for retransmission
  – Duplicate acks sent when out-of-order packet received
• But: sender only retransmits a single packet.
  – Reason???
    • Only one that it knows is lost
    • Network is congested, shouldn’t overload it
• Error control is based on byte sequences, not packets.
  – Retransmitted packet can be different from the original lost packet
  – Why?

TCP Sender Events:
data rcvd from app:
• Create segment with seq #
• seq # is byte-stream number of first data byte in segment
• start timer if not already running (think of timer as for oldest unacked segment)
• expiration interval: TimeOutInterval
timeout:
• retransmit segment that caused timeout
• restart timer
ACK received:
• If acknowledges previously unACKed segments
  – update what is known to be ACKed
  – start timer if there are outstanding segments
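The three events above can be sketched as a small class. This is a simplified model under stated assumptions, not TCP itself: the class and attribute names are invented, the timer is a boolean flag rather than a real clock, sequence-number wrap-around is ignored, and duplicate ACKs, flow control, and congestion control are left out.

```python
class SimpleTcpSender:
    """Toy sender: one timer, cumulative ACKs, retransmit oldest on timeout."""

    def __init__(self, initial_seq: int = 0):
        self.send_base = initial_seq   # oldest unACKed byte
        self.next_seq = initial_seq    # seq # for the next new byte
        self.unacked = {}              # seq # -> segment data
        self.timer_running = False
        self.sent = []                 # log of (seq, data) transmissions

    def on_app_data(self, data: bytes):
        seq = self.next_seq
        self.unacked[seq] = data
        self.sent.append((seq, data))
        self.next_seq += len(data)
        if not self.timer_running:     # timer tracks the oldest unACKed segment
            self.timer_running = True

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)    # segment that caused the timeout
            self.sent.append((seq, self.unacked[seq]))
            self.timer_running = True  # restart timer

    def on_ack(self, ack: int):
        if ack > self.send_base:       # ACKs previously unACKed data
            self.send_base = ack
            for seq in [s for s in self.unacked
                        if s + len(self.unacked[s]) <= ack]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)
```

A short usage run: after two 4-byte writes starting at seq 100, a timeout resends the segment at 100, and an ACK of 104 leaves only the segment at 104 outstanding.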

TCP ACK generation [RFC 1122, RFC 2581]
• Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed
  – TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK
• Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending
  – TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments
• Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected
  – TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte
• Event at receiver: arrival of segment that partially or completely fills gap
  – TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap

TCP Flow Control
• receive side of TCP connection has a receive buffer
• app process may be slow at reading from buffer
• flow control: sender won’t overflow receiver’s buffer by transmitting too much, too fast
• speed-matching service: matching the send rate to the receiving app’s drain rate

TCP Flow Control: How It Works
(Suppose TCP receiver discards out-of-order segments)
• spare room in buffer
  = RcvWindow (dynamically changes)
  = RcvBuffer - [LastByteRcvd - LastByteRead]
• Rcvr advertises spare room by including value of RcvWindow in segments
• Sender limits unACKed data to RcvWindow
  – guarantees receive buffer doesn’t overflow
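The bookkeeping on both sides can be sketched in two small functions. The receiver-side formula is the one given above; the sender-side check, including the `last_byte_sent` and `last_byte_acked` names, is an assumed way of expressing "limit unACKed data to RcvWindow" and is not taken from the slide.

```python
def rcv_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Spare room the receiver advertises: RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def sender_may_send(last_byte_sent: int, last_byte_acked: int, window: int) -> int:
    """Bytes the sender may still put in flight without exceeding the window."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, window - in_flight)

w = rcv_window(4096, 3000, 1000)        # 2000 bytes buffered, 2096 spare
print(w, sender_may_send(5000, 4000, w))
```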

TCP Segment Structure
Header fields, laid out in rows of 32 bits:
• source port #, dest port #
• sequence number
• acknowledgement number
  – sequence and acknowledgement numbers count bytes of data (not segments!)
• head len, not used, flag bits U A P R S F, rcvr window size
  – URG: urgent data (generally not used)
  – ACK: ACK # valid
  – PSH: push data now (generally not used)
  – RST, SYN, FIN: connection establishment (setup, teardown commands)
  – rcvr window size: # bytes rcvr willing to accept
• checksum, ptr to urgent data
  – Internet checksum (as in UDP)
• Options (variable length)
• application data (variable length)
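
The fixed 20-byte part of this layout can be unpacked with Python's standard `struct` module. This is a sketch for illustration: the function name and the dictionary keys are made up here, and options and payload are ignored.

```python
import struct


def parse_tcp_header(data: bytes) -> dict:
    """Decode the fixed 20-byte TCP header (options are not parsed)."""
    if len(data) < 20:
        raise ValueError("need at least 20 bytes for a TCP header")
    # "!" = network byte order; H = 16-bit field, I = 32-bit field.
    (src, dst, seq, ack, off_flags,
     window, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", data[:20])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": (off_flags >> 12) * 4,  # head len is in 32-bit words
        "URG": bool(off_flags & 0x20),
        "ACK": bool(off_flags & 0x10),
        "PSH": bool(off_flags & 0x08),
        "RST": bool(off_flags & 0x04),
        "SYN": bool(off_flags & 0x02),
        "FIN": bool(off_flags & 0x01),
        "window": window,
        "checksum": checksum,
        "urgent_ptr": urg_ptr,
    }


# Build a SYN header by hand (head len = 5 words, SYN bit set) and decode it.
syn = struct.pack("!HHIIHHHH", 12345, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
header = parse_tcp_header(syn)
```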

Triggering Transmission
• How does TCP decide to transmit a segment?
  – MSS (maximum segment size): send once MSS bytes have been collected
    • MSS is set to the size of the largest segment TCP can send without local IP fragmentation (derived from the MTU of the directly connected network)
  – Sending process explicitly asks TCP to send (push operation flushes the buffer)
  – A timer fires
• Silly Window Syndrome
  – Flow control still has to be maintained
  – Sender can transmit a full segment (MSS) only when the receiver’s ACK opens the window that far

Silly Window Syndrome (cont’d)
• Window currently closed by receiver
• ACK opens MSS/2 bytes
• Should sender transmit MSS/2?
  – Original TCP specification was silent on this point
  – Early implementations of TCP decided to go ahead and transmit
  – Sender cannot know when the window will open for a full MSS
• If sender is aggressive and sends whatever window is available
  – the result is silly window syndrome
  – the small segment size persists indefinitely
• Hence a problem arises whenever the sender transmits a small segment or the receiver opens the window by a small amount

Triggering Transmission (cont’d)
• Receiver may delay ACKs, but for how long?
• Ultimate solution lies with the sender:
  – When does the TCP sender decide to transmit a segment?
• Nagle’s Algorithm:
  – Waiting too long hurts interactive applications (Telnet)
  – Without waiting, risk of sending a bunch of tiny packets (silly window syndrome)
  – Wait until a timer fires, using arriving ACKs as the timer (self-clocking):
    • As long as TCP has any data in flight, the sender will eventually receive an ACK, which can be used to trigger transmission
    • If no data is in flight, send the segment immediately
  – Applications that cannot tolerate the delay can turn the algorithm off by setting the TCP_NODELAY option
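
The sender-side rule can be sketched as a function that returns how many bytes to send right now. The function name and parameters are hypothetical, and capping a small send at the advertised window is an assumption of this sketch rather than something stated on the slide.

```python
def nagle_bytes_to_send(available, window, mss, data_in_flight, nodelay=False):
    """Decide how many buffered bytes the sender may transmit now.

    available       -- bytes the application has written but TCP has not sent
    window          -- bytes the receiver's advertised window currently allows
    mss             -- maximum segment size
    data_in_flight  -- True if some sent data is still unACKed
    nodelay         -- True models an application that disabled the algorithm
    """
    if available >= mss and window >= mss:
        return mss                      # a full segment can always go out
    if nodelay or not data_in_flight:
        return min(available, window)   # nothing to wait for: send what we have
    return 0                            # hold small data until an ACK arrives
```

With data in flight, a 100-byte write is held back until the next ACK; with nothing in flight it is sent at once, which keeps interactive traffic responsive.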

TCP Round Trip Time and Timeout
Q: how to set TCP timeout value?
• longer than RTT
  – but RTT varies
• too short: premature timeout
  – unnecessary retransmissions
• too long: slow reaction to segment loss
Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt
  – ignore retransmissions, why?
• SampleRTT will vary, want estimated RTT “smoother”
  – average several recent measurements, not just current SampleRTT

Round-trip Time Estimation
• Importance of accurate RTT estimators:
  – Low RTT estimate
    • unneeded retransmissions
  – High RTT estimate
    • poor throughput
• RTT estimator must adapt to change in RTT
  – But not too fast, or too slow!
• Spurious timeouts
  – “Conservation of packets” principle: never more than a window’s worth of packets in flight

Adaptive Retransmission (Original Algorithm)
• Measure SampleRTT for each segment/ACK pair
• Compute weighted running average of RTT
  – EstRTT = α × EstRTT + (1 - α) × SampleRTT
  – α between 0.8 and 0.9 (to smooth EstRTT)
  – A small α tracks temporary fluctuations; a large α is more stable but may not be quick to adapt to real changes
• Set timeout based on EstRTT
  – TimeOut = 2 × EstRTT
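
One update step of the original algorithm, as a sketch. The function name is hypothetical and the default α = 0.875 is simply one value picked from the 0.8 to 0.9 range given above. Note that here α weights the old estimate.

```python
def original_update(est_rtt, sample_rtt, alpha=0.875):
    """Return (new EstRTT, TimeOut) after one SampleRTT measurement."""
    est_rtt = alpha * est_rtt + (1 - alpha) * sample_rtt
    timeout = 2 * est_rtt
    return est_rtt, timeout


# EstRTT of 100 ms, then one slow sample of 200 ms.
print(original_update(100.0, 200.0))  # (112.5, 225.0)
```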

Retransmission Ambiguity
[Figure: two sender/receiver timelines, each showing an original transmission, a retransmission, an ACK, and the resulting SampleRTT]
• ACK is assumed to be for the original transmission but was actually for the retransmission => SampleRTT is too large
• ACK is assumed to be for the retransmission but was actually for the original => SampleRTT is too small

Karn/Partridge Algorithm
• Solution:
  – Do not sample RTT when retransmitting
    • only measure SampleRTT for segments sent once
  – Double timeout for each retransmission
    • next timeout is twice the last timeout, rather than being based on the last EstimatedRTT
• Karn and Partridge proposal is exponential backoff
  – Congestion is the most likely cause of lost segments
  – TCP sources should not react too aggressively to a timeout
  – The more timeouts occur, the more cautious the source should become (congestion problem)
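
Both rules fit in a few lines. The class below is an illustrative sketch with hypothetical names; it reuses the original running-average estimator (α = 0.875 is an assumed value) and, as a further assumption, keeps the backed-off timeout until a sample from a segment sent only once arrives.

```python
class KarnPartridgeTimer:
    """Toy retransmission timer: skip ambiguous samples, back off exponentially."""

    def __init__(self, est_rtt, alpha=0.875):
        self.est_rtt = est_rtt
        self.alpha = alpha              # weight on the old estimate (assumed value)
        self.timeout = 2 * est_rtt

    def on_timeout(self):
        # Each retransmission doubles the previous timeout.
        self.timeout *= 2
        return self.timeout

    def on_ack(self, sample_rtt, was_retransmitted):
        if was_retransmitted:
            # Ambiguous sample: do not touch the estimate or the timeout.
            return self.timeout
        self.est_rtt = self.alpha * self.est_rtt + (1 - self.alpha) * sample_rtt
        self.timeout = 2 * self.est_rtt
        return self.timeout


timer = KarnPartridgeTimer(est_rtt=1.0)
timer.on_timeout()                         # timeout grows from 2.0 to 4.0
timer.on_ack(3.0, was_retransmitted=True)  # sample ignored
```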

Jacobson/Karels Algorithm
• Original computation for RTT did not take the variance of SampleRTTs into account
  – If variation among samples is small, EstRTT can be trusted and there is no need to multiply it by 2
  – A large variance in the samples means timeout values should not be too tightly coupled to EstRTT
• New calculations for average RTT
  – Diff = SampleRTT - EstRTT
  – EstRTT = EstRTT + (δ × Diff)
  – Dev = Dev + δ × (|Diff| - Dev)
    • where δ is a fraction between 0 and 1
• Consider variance when setting timeout value
  – TimeOut = μ × EstRTT + φ × Dev
    • where μ = 1 and φ = 4
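
The three update equations and the timeout rule translate directly into code. In this sketch the function name is hypothetical and the default δ = 0.125 is an assumed value; the slide only requires a fraction between 0 and 1. μ = 1 and φ = 4 are the values given above.

```python
def jacobson_karels_update(est_rtt, dev, sample_rtt, delta=0.125, mu=1.0, phi=4.0):
    """Return (EstRTT, Dev, TimeOut) after one SampleRTT measurement."""
    diff = sample_rtt - est_rtt
    est_rtt = est_rtt + delta * diff
    dev = dev + delta * (abs(diff) - dev)
    timeout = mu * est_rtt + phi * dev
    return est_rtt, dev, timeout


# EstRTT = 100 ms, Dev = 10 ms, new sample of 120 ms.
print(jacobson_karels_update(100.0, 10.0, 120.0))  # (102.5, 11.25, 147.5)
```

When samples are steady, Dev shrinks and TimeOut stays close to EstRTT; when they fluctuate, the φ × Dev term dominates.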

TCP Round Trip Time Estimation
EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT
• Exponential weighted moving average
• influence of past sample decreases exponentially fast
• typical value: α = 0.125
Setting the timeout interval
• EstimatedRTT plus “safety margin”
  – large variation in EstimatedRTT -> larger safety margin
• “safety margin”: accommodates variations in EstimatedRTT
DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT|
(typically, β = 0.25)
TimeoutInterval = EstimatedRTT + 4 * DevRTT
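
A worked step with the typical values α = 0.125 and β = 0.25, starting from EstimatedRTT = 100 ms and DevRTT = 10 ms with a new SampleRTT of 120 ms. The starting numbers are made up for illustration, and the slide does not say which quantity is updated first; this example assumes DevRTT is computed with the old EstimatedRTT, which is the order RFC 6298 uses for its corresponding variables.

```latex
\begin{align*}
\text{DevRTT} &= (1-\beta)\,\text{DevRTT} + \beta\,\lvert \text{SampleRTT} - \text{EstimatedRTT} \rvert \\
              &= 0.75 \times 10 + 0.25 \times \lvert 120 - 100 \rvert = 12.5 \text{ ms} \\
\text{EstimatedRTT} &= (1-\alpha)\,\text{EstimatedRTT} + \alpha\,\text{SampleRTT} \\
              &= 0.875 \times 100 + 0.125 \times 120 = 102.5 \text{ ms} \\
\text{TimeoutInterval} &= \text{EstimatedRTT} + 4 \times \text{DevRTT} \\
              &= 102.5 + 4 \times 12.5 = 152.5 \text{ ms}
\end{align*}
```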

Example RTT Estimation: [figure]

Timestamp Extension
• Used to improve timeout mechanism by more accurate measurement of RTT
• When sending a packet, insert current time into option
  – 4 bytes for time, 4 bytes to echo a received timestamp
• Receiver echoes timestamp in ACK
  – Actually will echo whatever is in timestamp
• Removes retransmission ambiguity
  – Can get RTT sample on any packet

Timer Granularity
• Many TCP implementations set RTO (Retransmission Timeout) in multiples of 200, 500, or 1000 ms
• Why?
  – Avoid spurious timeouts
  – RTTs can vary quickly due to cross traffic
  – Make timer interrupts efficient
• What happens for the first couple of packets?
  – Pick a very conservative value (seconds)
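
A coarse-grained timer can be modelled by rounding the computed timeout up to the next tick. The function below is a sketch under that assumption; the name is hypothetical, and real implementations differ in how they round.

```python
import math


def round_up_rto(rto_ms, granularity_ms):
    """Round a computed RTO up to the next multiple of the timer granularity."""
    return math.ceil(rto_ms / granularity_ms) * granularity_ms


print(round_up_rto(152.5, 500))  # 500
print(round_up_rto(700, 200))    # 800
```

With a 500 ms tick, a computed timeout of 152.5 ms becomes 500 ms, which is one reason coarse timers rarely fire spuriously.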

Important Lessons
• TCP state diagram: setup/teardown
• TCP timeout calculation: how is RTT estimated?
• Modern TCP loss recovery
  – Why are timeouts bad?
  – How to avoid them? e.g. fast retransmit