Course on Computer Communication and Networks Lecture 5


Course on Computer Communication and Networks
Lecture 5, Chapter 3: Transport Layer, Part B
EDA 344/DIT 420, CTH/GU
Based on the book Computer Networking: A Top Down Approach, Jim Kurose, Keith Ross, Addison-Wesley.
Marina Papatriantafilou – Transport layer part 2

Roadmap Transport Layer
• transport layer services
• multiplexing/demultiplexing
• connectionless transport: UDP
• principles of reliable data transfer
• connection-oriented transport: TCP
  – reliable transfer
    • Acknowledgements
    • Retransmissions
    • Connection management
    • Flow control and buffer space
  – Congestion control
    • Principles
    • TCP congestion control

TCP: Overview (RFCs: 793, 1122, 1323, 2018, 5681)
• point-to-point:
  – one sender, one receiver
• reliable, in-order byte stream
• pipelined:
  – TCP congestion and flow control set window size
• full duplex data:
  – bi-directional data flow in same connection
  – MSS: maximum segment size
• connection-oriented:
  – handshaking (exchange of control msgs) inits sender & receiver state before data exchange
• flow control:
  – sender will not overwhelm receiver
• congestion control:
  – sender will not flood network (but still try to maximize throughput)

TCP segment structure (each header row is 32 bits wide)
• source port #, dest port #
• sequence number, acknowledgement number: counting by bytes of data (not segments!)
• head len, not used, flag bits U A P R S F:
  – URG: urgent data (generally not used)
  – ACK: ACK # valid
  – PSH: push data now (generally not used)
  – RST, SYN, FIN: connection estab (setup, teardown commands)
• receive window: # bytes rcvr willing to buffer (flow control)
• checksum: Internet checksum (as in UDP)
• Urg data pointer
• options (variable length)
• application data (variable length)
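The header layout on this slide can be made concrete with a short decoding sketch. It is a minimal illustration, not a full TCP parser: it reads only the fixed 20-byte part of the header and ignores options. The function name `parse_tcp_header` and the example segment (ports, window) are made up for illustration.

```python
import struct

# Flag bits in the low 6 bits of the 16-bit "head len / flags" word (U A P R S F).
FLAG_BITS = {"URG": 0x20, "ACK": 0x10, "PSH": 0x08, "RST": 0x04, "SYN": 0x02, "FIN": 0x01}


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are skipped, not parsed)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    (src_port, dst_port, seq, ack, offset_flags,
     rwnd, checksum, urg_ptr) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4  # head len is counted in 32-bit words
    flags = {name for name, bit in FLAG_BITS.items() if offset_flags & bit}
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags,
        "rwnd": rwnd, "checksum": checksum, "urg_ptr": urg_ptr,
        "payload": segment[header_len:],
    }


# Hypothetical segment: Seq=42, ACK=79, one byte of data, ACK and PSH flags set.
example = struct.pack("!HHIIHHHH", 12345, 23, 42, 79, (5 << 12) | 0x18, 4096, 0, 0) + b"C"
fields = parse_tcp_header(example)
```

Network byte order (`!`) is used because all multi-byte header fields are sent most significant byte first.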


TCP seq. numbers, ACKs
sequence numbers:
– “number” of first byte in segment’s data
acknowledgements:
– seq # of next byte expected from other side
– cumulative ACK
[Figure: outgoing segment from sender and incoming segment to sender (fields: source port #, dest port #, sequence number, acknowledgement number, rwnd, checksum); sender sequence number space with window size N, divided into: sent ACKed; sent, not-yet ACKed (“in-flight”); usable but not yet sent; not usable]

TCP seq. numbers, ACKs: simple example scenario, based on telnet msg exchange
Always ack next in-order expected byte.
• User types ‘C’: Host A sends Seq=42, ACK=79, data = ‘C’
• Host B ACKs receipt of ‘C’, echoes back ‘C’: Seq=79, ACK=43, data = ‘C’
• Host A ACKs receipt of echoed ‘C’: Seq=43, ACK=80
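The numbers in the telnet scenario follow from two counters per host. The sketch below reproduces them under simplifying assumptions: no loss, no reordering, and segments modelled as plain dictionaries. The `Endpoint` class is a hypothetical helper, not part of any socket API.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    next_seq: int   # sequence number of the next byte this side will send
    expected: int   # next in-order byte expected from the other side

    def send(self, data: bytes) -> dict:
        """Build a segment; the ACK field always carries the next expected byte."""
        segment = {"seq": self.next_seq, "ack": self.expected, "data": data}
        self.next_seq += len(data)
        return segment

    def receive(self, segment: dict) -> None:
        """Accept in-order data only (out-of-order handling is left out)."""
        if segment["seq"] == self.expected:
            self.expected += len(segment["data"])


host_a = Endpoint(next_seq=42, expected=79)
host_b = Endpoint(next_seq=79, expected=42)

typed = host_a.send(b"C")       # Seq=42, ACK=79
host_b.receive(typed)
echo = host_b.send(b"C")        # Seq=79, ACK=43
host_a.receive(echo)
final_ack = host_a.send(b"")    # Seq=43, ACK=80, no data
```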

TCP: cumulative ACK - retransmission scenarios
Cumulative ACK:
• Host A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data
• ACK=100 from Host B is lost (X), but ACK=120 arrives before the timeout
• ACK=120 covers both segments, so Host A continues with Seq=120, 15 bytes of data
(Premature) timeout:
• Host A sends Seq=92, 8 bytes of data (SendBase=92), then Seq=100, 20 bytes of data
• the timeout fires before ACK=100 and ACK=120 arrive, so Host A resends Seq=92, 8 bytes of data
• ACK=100 arrives (SendBase=100), then ACK=120 (SendBase=120)
• Host B answers the retransmission with ACK=120 again (SendBase=120)
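How the sender advances SendBase on a cumulative ACK can be sketched as follows. This is an illustrative model only: in-flight segments are kept as `(seq, length)` pairs and the function name `apply_cumulative_ack` is made up here.

```python
def apply_cumulative_ack(send_base: int, in_flight: list, ack_num: int) -> tuple:
    """Return (new_send_base, segments_still_unacked) after receiving ack_num.

    A cumulative ACK confirms every byte below ack_num, so one ACK can
    release several in-flight segments at once.
    """
    if ack_num <= send_base:
        return send_base, in_flight          # old or duplicate ACK: nothing new
    remaining = [(seq, n) for seq, n in in_flight if seq + n > ack_num]
    return ack_num, remaining


# Scenario from the slide: Seq=92 (8 bytes) and Seq=100 (20 bytes) are in flight.
in_flight = [(92, 8), (100, 20)]
after_lost_ack = apply_cumulative_ack(92, in_flight, 120)   # ACK=100 lost, ACK=120 arrives
after_first_ack = apply_cumulative_ack(92, in_flight, 100)  # only the first segment is acked
```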


Q: how to set TCP timeout value?
• longer than end-to-end RTT
  – but that varies!!!
• too short timeout: premature, unnecessary retransmissions
• too long: slow reaction to loss
[Figure: a segment travelling between the application/transport/network/link/physical stacks of two end hosts, across intermediate nodes that implement only the network, link and physical layers]

TCP round trip time, timeout estimation
$\text{EstimatedRTT} = (1 - \alpha) \cdot \text{EstimatedRTT} + \alpha \cdot \text{SampleRTT}$
• exponential weighted moving average: influence of past sample decreases exponentially fast
• typical value: $\alpha = 0.125$
$\text{DevRTT} = (1 - \beta) \cdot \text{DevRTT} + \beta \cdot |\text{SampleRTT} - \text{EstimatedRTT}|$
• typically, $\beta = 0.25$
$\text{TimeoutInterval} = \text{EstimatedRTT} + 4 \cdot \text{DevRTT}$
• $4 \cdot \text{DevRTT}$ is the “safety margin”
[Figure: SampleRTT and EstimatedRTT, RTT (milliseconds) versus time (seconds), measured from gaia.cs.umass.edu to fantasia.eurecom.fr]
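The estimator on this slide can be tried out with a few lines of code. The sketch applies the updates in the order the slide lists them and uses the typical weights 0.125 and 0.25. Two choices are assumptions of this example, not taken from the slides: the first sample initialises EstimatedRTT, and DevRTT starts at 0.

```python
class RttEstimator:
    """Exponential weighted moving average of RTT samples (times in milliseconds)."""

    def __init__(self, first_sample: float, alpha: float = 0.125, beta: float = 0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample  # assumption: seed with the first sample
        self.dev_rtt = 0.0                 # assumption: no deviation known yet

    def update(self, sample_rtt: float) -> None:
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))

    def timeout_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt   # 4*DevRTT is the safety margin


estimator = RttEstimator(first_sample=100.0)
estimator.update(120.0)                 # one slower-than-usual sample
timeout = estimator.timeout_interval()
```

With these inputs EstimatedRTT moves only from 100 to 102.5 ms, which shows how little weight a single sample carries, while the timeout rises to 120 ms because of the deviation term.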

TCP fast retransmit (RFC 5681)
• time-out can be long:
  – long delay before resending lost packet
• IMPROVEMENT: detect lost segments via duplicate ACKs
TCP fast retransmit:
• if sender receives 3 duplicate ACKs for same data, resend unacked segment with smallest seq #
  – likely that unacked segment lost, so don’t wait for timeout
• duplicate ACKs act as an implicit NAK!
Q: Why need at least 3?
[Figure: Host A sends Seq=92, 8 bytes of data and Seq=100, 20 bytes of data; the second segment is lost (X); Host B keeps answering ACK=100; Host A resends Seq=100, 20 bytes of data before the timeout expires]
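The duplicate-ACK rule can be sketched as a small counter on the sender side. This is an illustration under simplifying assumptions (ACK numbers only, no timers, no window handling). The class name `FastRetransmitDetector` is made up for this example.

```python
class FastRetransmitDetector:
    """Counts duplicate ACKs and signals when the oldest unacked segment should be resent."""

    DUP_ACK_THRESHOLD = 3

    def __init__(self, send_base: int):
        self.send_base = send_base   # smallest unacked sequence number
        self.dup_acks = 0

    def on_ack(self, ack_num: int) -> bool:
        """Return True exactly when the third duplicate ACK arrives."""
        if ack_num > self.send_base:         # new data acknowledged
            self.send_base = ack_num
            self.dup_acks = 0
            return False
        if ack_num == self.send_base:        # duplicate: same next-expected byte again
            self.dup_acks += 1
            return self.dup_acks == self.DUP_ACK_THRESHOLD
        return False                         # stale ACK below send_base


detector = FastRetransmitDetector(send_base=92)
decisions = [detector.on_ack(100) for _ in range(4)]  # first ACK=100 is new, next three are duplicates
```

The first ACK=100 advances SendBase, the next two duplicates are tolerated (they may be caused by mere reordering), and the third duplicate triggers the retransmission of the segment starting at byte 100.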


Connection Management
before exchanging data, sender/receiver “handshake”:
• agree to establish connection
• agree on connection parameters
After the handshake, client and server each hold:
• connection state: ESTAB
• connection variables:
  – seq # client-to-server, server-to-client
  – rcvBuffer size at server, client
Client: Socket clientSocket = newSocket("hostname", "port number");
Server: Socket connectionSocket = welcomeSocket.accept();

Setting up a connection: TCP 3-way handshake
• server state: LISTEN
• client: choose init seq num, x; send TCP SYN msg (SYN=1, Seq=x); client state: SYNSENT
• server: choose init seq num, y; reserve buffer; send TCP SYN/ACK msg, acking SYN (SYN=1, Seq=y, ACK=1; ACKnum=x+1); server state: SYN RCVD
• client: received SYN/ACK(x) indicates server is live; send ACK for SYN/ACK (ACK=1, ACKnum=y+1); this segment may contain client-to-server data; client state: ESTAB
• server: received ACK(y) indicates client is live; server state: ESTAB
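The three segments of the handshake can be written down as a small sketch. Segments are modelled as dictionaries and the function name `three_way_handshake` is made up for illustration; a real implementation also chooses x and y unpredictably and keeps per-connection state, which is left out here.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit values and wrap around


def three_way_handshake(x: int, y: int) -> list:
    """Return the SYN, SYN/ACK and ACK segments for initial seq nums x (client) and y (server)."""
    syn = {"from": "client", "SYN": 1, "seq": x}
    syn_ack = {"from": "server", "SYN": 1, "ACK": 1, "seq": y,
               "acknum": (x + 1) % SEQ_SPACE}   # the SYN consumes one sequence number
    ack = {"from": "client", "ACK": 1,
           "acknum": (y + 1) % SEQ_SPACE}       # may already carry client-to-server data
    return [syn, syn_ack, ack]


segments = three_way_handshake(x=1000, y=5000)
```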

TCP: closing a connection
• client and server both start in state ESTAB
• client: clientSocket.close(); sends FIN=1, seq=x; client state: FIN_WAIT_1 (can no longer send but can receive data)
• server: replies ACK=1; ACKnum=x+1; server state: CLOSE_WAIT (can still send data)
• client state: FIN_WAIT_2 (wait for server close)
• server: sends FIN=1, seq=y; server state: LAST_ACK (can no longer send data)
• client: replies ACK=1; ACKnum=y+1; client state: TIME_WAIT (timed wait, typically 30 s), then CLOSED
• server: on receiving the ACK, server state: CLOSED
• simultaneous FINs can be handled
• RST: alternative way to close connection immediately, when error occurs


TCP flow control
• application might remove data from TCP socket buffers slower than TCP is delivering (i.e. slower than sender is sending)
• flow control: receiver controls sender, so sender won’t overflow receiver’s buffer by transmitting too much, too fast
[Figure: receiver protocol stack: segments from sender pass through IP code and TCP code (in the OS) into the TCP socket receiver buffers, from which the application process reads]

TCP flow control
• receiver “advertises” free buffer space through rwnd value in header
  – RcvBuffer size set via socket options (typical default 4 Kbytes)
  – OS can autoadjust RcvBuffer
• sender limits unacked (“in-flight”) data to receiver’s rwnd value
  – s.t. receiver’s buffer will not overflow
[Figure: receiver-side buffering: RcvBuffer holds buffered data (TCP segment payloads waiting to go to the application process) and free buffer space, rwnd; the rwnd field of segments sent to the sender carries this value]
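What the receiver advertises can be computed from its buffer bookkeeping. The sketch below is a minimal illustration. The counters `last_byte_rcvd` and `last_byte_read` are names introduced for this example (bytes placed into the buffer and bytes already handed to the application); they do not appear on the slides.

```python
def advertised_rwnd(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Free buffer space the receiver can advertise in the rwnd header field."""
    buffered = last_byte_rcvd - last_byte_read   # data the application has not read yet
    return max(rcv_buffer - buffered, 0)


# Hypothetical receiver with the 4 Kbyte default buffer from the slide.
rwnd_slow_reader = advertised_rwnd(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
rwnd_fast_reader = advertised_rwnd(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=3000)
rwnd_full_buffer = advertised_rwnd(rcv_buffer=4096, last_byte_rcvd=5096, last_byte_read=1000)
```

A slow application leaves more data buffered, so the advertised window shrinks, and a full buffer advertises 0, which stops the sender.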

Q: Is TCP stateful or stateless?


Principles of congestion control
congestion:
• informally: “many sources sending too much data too fast for network to handle”
• Manifestations?
  – lost packets (buffer overflow at routers)
  – long delays (queueing in router buffers)

Distinction between flow control and congestion control
[Figure (from A. Tanenbaum, Computer Networks): need for flow control versus need for congestion control]

Causes/costs of congestion
• Recall queueing behaviour + losses
• Losses => retransmissions => even higher load…
• Ideal per-connection throughput: R/2 (if 2 connections)
[Figure: Host A sends original data at rate λin into link buffers in front of an output link of capacity R; Host B receives throughput λout; plots of λout and of delay versus λin up to R/2, contrasting the ideal with reality]

Approaches towards congestion control
end-end congestion control (approach taken by TCP):
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
network-assisted congestion control (not present in Internet’s network-layer protocols):
• routers collaborate for optimal rates + provide feedback to end-systems, e.g.
  – a single bit indicating congestion
  – explicit rate for sender to send at


TCP congestion control: additive increase multiplicative decrease
end-end control (no network assistance), sender limits transmission
How does sender perceive congestion?
• loss = timeout or 3 duplicate acks
• TCP sender then reduces rate (Congestion Window)
• $\text{rate} \approx \frac{\text{cwnd}}{\text{RTT}}$ bytes/sec
AIMD:
• Additive Increase: increase cwnd by 1 MSS every RTT until loss detected
• Multiplicative Decrease: cut cwnd in half after loss
• To start with: slow start
[Figure: AIMD saw tooth behavior: probing for bandwidth. cwnd (TCP sender congestion window size) versus time: additively increase window size until loss occurs (then cut window in half)]

TCP Slow Start
• when connection begins, increase rate exponentially until first loss event:
  – initially cwnd = 1 MSS
  – double cwnd for every ACKed “batch” (i.e. every RTT)
  – done by incrementing cwnd for every ACK received
• summary: initial rate is slow but ramps up exponentially fast
• then, saw-tooth
[Figure: Host A sends one segment, then two segments, then four segments to Host B, one batch per RTT]
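Why incrementing cwnd on every ACK doubles it every RTT can be checked with a short simulation. The sketch assumes no loss, one ACK per segment, and that a whole batch is sent and acknowledged within one RTT. The function name `slow_start_rounds` is made up for illustration.

```python
def slow_start_rounds(rounds: int, mss: int = 1) -> list:
    """Return the cwnd (in units of mss) used in each RTT round of slow start."""
    cwnd = mss
    batch_sizes = []
    for _ in range(rounds):
        batch_sizes.append(cwnd)
        acks_this_round = cwnd // mss     # one ACK comes back per segment sent
        for _ in range(acks_this_round):
            cwnd += mss                   # +1 MSS for every ACK received
    return batch_sizes


batches = slow_start_rounds(5)
```

Each round returns as many ACKs as segments were sent, so cwnd grows by its own size: 1, 2, 4, 8, 16 segments.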

TCP cwnd: from exponential to linear growth + reacting to loss
• Reno: loss indicated by timeout or 3 duplicate ACKs: cwnd is cut in half; then grows linearly
• Non-optimized: loss indicated by timeout: cwnd set to 1 MSS; window then grows by slow start up to the threshold, then grows linearly
Implementation:
• variable ssthresh (slow start threshold)
• on loss event, ssthresh = ½*cwnd
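The interplay of slow start, linear growth and the two loss reactions can be sketched per RTT round. The model below rests on several assumptions made for this example: cwnd is counted in whole MSS, each round sees exactly one event, three duplicate ACKs halve cwnd, a timeout resets cwnd to 1 MSS, and slow start is capped at ssthresh. The event names and the function `simulate_cwnd` are hypothetical.

```python
def simulate_cwnd(events: list, ssthresh: int = 8) -> list:
    """Return cwnd (in MSS) after each round; events are 'ack', 'dupack3' or 'timeout'."""
    cwnd = 1
    trace = []
    for event in events:
        if event == "ack":                     # the whole window was acknowledged
            if cwnd < ssthresh:
                cwnd = min(cwnd * 2, ssthresh)  # slow start: exponential growth
            else:
                cwnd += 1                       # linear growth: +1 MSS per RTT
        elif event == "dupack3":               # loss signalled by 3 duplicate ACKs
            ssthresh = max(cwnd // 2, 1)
            cwnd = ssthresh                     # cut in half, then grow linearly
        elif event == "timeout":               # loss signalled by timeout
            ssthresh = max(cwnd // 2, 1)
            cwnd = 1                            # back to slow start up to ssthresh
        else:
            raise ValueError(f"unknown event: {event}")
        trace.append(cwnd)
    return trace


trace = simulate_cwnd(["ack"] * 5 + ["dupack3"] + ["ack"] * 2 + ["timeout"] + ["ack"] * 3)
```

The trace shows the exponential phase up to the initial threshold of 8, the halving after the duplicate ACKs, and the restart from 1 MSS after the timeout.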

Q: How many windows does a TCP sender maintain?
[Figure (from A. Tanenbaum, Computer Networks): need for flow control versus need for congestion control]

TCP combined flow-ctrl, congestion ctrl windows
• sender limits transmission: $\text{LastByteSent} - \text{LastByteAcked} \le \min\{\text{cwnd}, \text{rwnd}\}$
• TCP sending rate: send min{cwnd, rwnd} bytes, wait for ACKs, then send more
• cwnd is dynamic, function of perceived network congestion
• rwnd is dynamically limited by receiver’s buffer space
[Figure: sender sequence number space: last byte ACKed; sent, not-yet ACKed (“in-flight”) bytes within min{cwnd, rwnd}; last byte sent]
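The sender-side limit that combines both windows fits in one function. This is a minimal sketch with made-up example values. The function name `usable_window` is introduced here for illustration.

```python
def usable_window(last_byte_sent: int, last_byte_acked: int, cwnd: int, rwnd: int) -> int:
    """Bytes the sender may still put in flight without exceeding min{cwnd, rwnd}."""
    in_flight = last_byte_sent - last_byte_acked
    return max(min(cwnd, rwnd) - in_flight, 0)


# 2000 bytes are in flight in both cases; only the tighter window matters.
receiver_limited = usable_window(5000, 3000, cwnd=8000, rwnd=4096)
network_limited = usable_window(5000, 3000, cwnd=1500, rwnd=4096)
```

In the first call the receiver's buffer is the bottleneck; in the second the congestion window has shrunk below the amount already in flight, so nothing more may be sent until ACKs arrive.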

TCP Fairness
fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K
[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]


Chapter 3: summary
• principles behind transport layer services:
  – addressing
  – reliable data transfer
  – flow control
  – congestion control
• instantiation, implementation in the Internet
  – UDP
  – TCP
next:
• leaving the network “edge” (application, transport layers)
• into the network “core”

Some review questions on this part
• Describe TCP’s flow control
• Why does TCP do fast retransmit upon a 3rd duplicate ACK and not a 2nd?
• Describe TCP’s congestion control: principle, method for detection of congestion, reaction.
• Can a TCP session’s sending rate increase indefinitely?
• Why does TCP need connection management?
• Why does TCP use handshaking at the start and the end of a connection?
• Can an application have reliable data transfer if it uses UDP? How or why not?

Reading instructions chapter 3
• Kurose Ross book
  – Careful: 3.1, 3.2, 3.4-3.7
  – Quick: 3.3
• Other resources (further study)
  – Eddie Kohler, Mark Handley, and Sally Floyd. 2006. Designing DCCP: congestion control without reliability. SIGCOMM Comput. Commun. Rev. 36, 4 (August 2006), 27-38. DOI=10.1145/1151659.1159918 http://doi.acm.org/10.1145/1151659.1159918
  – http://research.microsoft.com/apps/video/default.aspx?id=104005
  – Exercise/throughput analysis TCP in following slides

Extra slides, for further study

TCP throughput
• avg. TCP throughput as function of window size, RTT?
  – ignore slow start, assume always data to send
• W: window size (measured in bytes) where loss occurs
  – avg. window size (# in-flight bytes) is ¾ W
  – avg. throughput is ¾ W per RTT
$\text{avg TCP throughput} = \frac{3}{4} \cdot \frac{W}{\text{RTT}}$ bytes/sec
[Figure: saw tooth of the window size oscillating between W/2 and W]
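The ¾ W figure follows from averaging a window that climbs linearly from W/2 to W and then drops back. A quick numerical check, assuming an idealised saw tooth with one sample per RTT and hypothetical values for W and RTT:

```python
def average_sawtooth_window(w: int) -> float:
    """Average window over one saw-tooth cycle growing linearly from w/2 to w."""
    samples = range(w // 2, w + 1)        # one window value per RTT
    return sum(samples) / len(samples)


def average_throughput(w_bytes: float, rtt_s: float) -> float:
    """Average TCP throughput in bytes/sec according to the 3/4 * W / RTT rule."""
    return 0.75 * w_bytes / rtt_s


avg_window = average_sawtooth_window(100)          # W = 100 (arbitrary units)
throughput = average_throughput(80_000, 0.1)       # W = 80,000 bytes, RTT = 100 ms
```

The midpoint between W/2 and W is ¾ W, so a window peaking at 80,000 bytes with a 100 ms RTT averages 600,000 bytes/sec.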

TCP Futures: TCP over “long, fat pipes”
• example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
• requires W = 83,333 in-flight segments
• throughput in terms of segment loss probability, L [Mathis 1997]:
  $\text{TCP throughput} = \frac{1.22 \cdot \text{MSS}}{\text{RTT} \sqrt{L}}$
➜ to achieve 10 Gbps throughput, need a loss rate of $L = 2 \cdot 10^{-10}$ – a very small loss rate!
• new versions of TCP for high-speed
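Both numbers in this example can be reproduced with a few lines of arithmetic. The sketch assumes MSS is given in bytes and converted to bits, and that throughput is in bits per second. The function names are made up for illustration.

```python
import math


def window_segments(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """In-flight segments needed to sustain the given throughput (bandwidth * delay / MSS)."""
    return throughput_bps * rtt_s / (8 * mss_bytes)


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss: float) -> float:
    """Throughput from the loss-based formula 1.22 * MSS / (RTT * sqrt(L))."""
    return 1.22 * mss_bytes * 8 / (rtt_s * math.sqrt(loss))


def required_loss(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Loss probability L that yields the target throughput (formula solved for L)."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


segments_needed = window_segments(10e9, 0.1, 1500)
loss_needed = required_loss(10e9, 0.1, 1500)
```

The window comes out at about 83,333 segments and the tolerable loss probability at roughly 2.1·10^-10, in line with the slide's rounded values.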

Why is TCP fair?
two competing sessions:
• additive increase gives slope of 1, as throughput increases
• multiplicative decrease reduces throughput proportionally
[Figure: Connection 2 throughput versus Connection 1 throughput, both axes up to R, with the equal bandwidth share line; congestion avoidance: additive increase; loss: decrease window by factor of 2]
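The convergence argument can be watched in a toy simulation. The sketch assumes two flows with the same RTT that see every loss event together, rates in arbitrary units, and a loss whenever their sum exceeds the capacity. The function name `aimd_two_flows` and the starting rates are hypothetical.

```python
def aimd_two_flows(x1: float, x2: float, capacity: float, steps: int) -> tuple:
    """Run AIMD for two synchronised flows and return their final rates."""
    for _ in range(steps):
        if x1 + x2 > capacity:           # bottleneck overloaded: both flows see a loss
            x1, x2 = x1 / 2, x2 / 2      # multiplicative decrease halves the gap too
        else:
            x1, x2 = x1 + 1, x2 + 1      # additive increase keeps the gap unchanged
    return x1, x2


start_gap = abs(10.0 - 70.0)
rate1, rate2 = aimd_two_flows(10.0, 70.0, capacity=100.0, steps=500)
final_gap = abs(rate1 - rate2)
```

Additive increase moves both flows along a line of slope 1, so their difference stays constant, while every halving cuts the difference in half; after enough loss events the two rates are practically equal.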

Fairness (more)
Fairness and UDP
• multimedia apps often do not use TCP
  – do not want rate throttled by congestion control
• instead use UDP:
  – send audio/video at constant rate, tolerate packet loss
Fairness, parallel TCP connections
• application can open multiple parallel connections between two hosts
• web browsers do this
• e.g., link of rate R with 9 existing connections:
  – new app asks for 1 TCP, gets rate R/10
  – new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)

TCP delay modeling (slow start – related)
Q: How long does it take to receive an object from a Web server after sending a request?
• TCP connection establishment
• data transfer delay
Notation, assumptions:
• Assume one link between client and server of rate R
• Assume: fixed congestion window, W segments
• S: MSS (bits)
• O: object size (bits)
• no retransmissions (no loss, no corruption)
• Receiver has unbounded buffer

TCP delay Modeling: simplified, fixed window
K := O/WS
Case 1: WS/R > RTT + S/R: ACK for first segment in window returns before window’s worth of data sent
  delay = 2 RTT + O/R
Case 2: WS/R < RTT + S/R: wait for ACK after sending window’s worth of data
  delay = 2 RTT + O/R + (K-1)[S/R + RTT - WS/R]
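The two cases can be evaluated with a small function. The sketch uses the slide's symbols (O, S in bits, R in bits/sec, W in segments). Rounding K up to a whole number of windows is an assumption of this example (the slide defines K := O/WS), and the numeric values are made up.

```python
import math


def fixed_window_delay(O: float, S: float, R: float, W: int, rtt: float) -> float:
    """Delay to fetch an object of O bits with a fixed window of W segments of S bits."""
    K = math.ceil(O / (W * S))                 # windows needed to cover the object
    if W * S / R >= rtt + S / R:
        # Case 1: ACKs return before the window is exhausted, the server never stalls.
        return 2 * rtt + O / R
    # Case 2: after each window (except the last) the server waits for the first ACK.
    stall = S / R + rtt - W * S / R
    return 2 * rtt + O / R + (K - 1) * stall


# S = 1000 bits, R = 1 Mbps, W = 4 segments, O = 16000 bits (K = 4 windows).
slow_path = fixed_window_delay(O=16_000, S=1_000, R=1e6, W=4, rtt=0.1)     # case 2
fast_path = fixed_window_delay(O=16_000, S=1_000, R=1e6, W=4, rtt=0.002)   # case 1
```

With a 100 ms RTT the server stalls three times for 97 ms each, so the stalls dominate the 16 ms transmission time.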

TCP Delay Modeling: Slow Start
Delay components:
• 2 RTT for connection estab and request
• O/R to transmit object
• time server idles due to slow start
Server idles: P = min{K-1, Q} times, where
– K = # (incremental-sized) congestion-windows that “cover” the object
– Q = # times server stalls until cong. window is larger than a “full-utilization” window (if the object were of unbounded size)
Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• Server idles P = min{K-1, Q} = 2 times

TCP Delay Modeling (slow start - cont)

TCP Delay Modeling
• Recall K = number of windows that cover object
• How do we calculate K?
• Calculation of Q, number of idles for infinite-size object, is similar.
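One way to compute K is to add up slow-start windows of 1, 2, 4, ... segments until they cover the object. The sketch assumes the window doubles every round with no loss. The function name `windows_to_cover` is made up for illustration.

```python
def windows_to_cover(object_bits: int, mss_bits: int) -> int:
    """Smallest K such that windows of 1, 2, 4, ... segments cover the whole object."""
    k = 0
    covered_segments = 0
    window = 1
    while covered_segments * mss_bits < object_bits:
        covered_segments += window   # send the next window
        window *= 2                  # slow start doubles the window each round
        k += 1
    return k


k_example = windows_to_cover(object_bits=15 * 1000, mss_bits=1000)   # O/S = 15 segments
```

For O/S = 15 the windows 1 + 2 + 4 + 8 = 15 segments cover the object exactly, which gives K = 4 as in the slide's example; one more segment would need a fifth window.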