TCP/IP
Jennifer Rexford
Advanced Computer Networks
http://www.cs.princeton.edu/courses/archive/fall08/cos561/
Tuesdays/Thursdays 1:30pm-2:50pm

Goals of Today’s Class
• Cerf/Kahn paper
  – Overview and discussion
  – Separation of IP from TCP
• Brief overview of IP header
• Transport protocols
  – Demultiplexing and error detection
  – Transmission Control Protocol
• TCP congestion control, if time allows

“A Protocol for Packet Network Intercommunication” (IEEE Trans. on Communications, May 1974)
Vint Cerf and Bob Kahn
Written when Vint Cerf was an assistant professor at Stanford, and Bob Kahn was working at ARPA.

Life in the Early 1970s
• Multiple unconnected networks
  – ARPAnet
  – Data-over-cable
  – Packet satellite (Aloha)
  – Packet radio
[Figure: separate networks labeled ARPAnet and satellite net]

Differences Across Packet-Switched Networks
• Addressing
• Maximum packet size
• Timing for handling success/failure of delivery
• Handling of lost or corrupted data
• Routing, fault detection, status information, …
[Figure: ARPAnet, satellite net]

Where to Handle Heterogeneity?
• Application process? End host? Packet switches? Someplace else?
• Compatible process and host conventions
  – Obviate the need to support all combinations
• Retain the unique features of each network
  – Avoid changing the local network components
• Introduce the notion of a gateway

Gateways Between Different Kinds of Networks
Internetwork layer
• Internetwork appears as a single, uniform entity
• Despite the heterogeneity of the local networks
• Network of networks
Gateway
• “Embed internetwork packets in local packet format or extract them”
• Route (at internetwork level) to next gateway
[Figure: ARPAnet, satellite net]

Internetwork Packet Format
[Figure: packet layout: local header | internetwork header (source address, dest. address, seq. #, byte count, flag field) | text | checksum]
• Internetwork header in standard format
  – Interpreted by the gateways and end hosts
• Source and destination addresses
  – Uniformly and uniquely identify every end point
• Ensure proper sequencing of the data
  – Include a sequence number and byte count
• Enable detection of corrupted text
  – Checksum for an end-to-end check on the text

Process-Level Communication
• Enable pairs of processes to communicate
  – Full duplex
  – Unbounded but finite-length messages
  – E.g., keystrokes or a file
• Key ideas
  – Port numbers to (de)multiplex packets
  – Breaking messages into segments
  – Sequence numbers and reassembly
  – Retransmission and duplicate detection
  – Window-based flow control

Differences in Max Packet Size
• Select smallest packet size as the new max?
• Coordinate to determine max size on a path?
• Enable gateway to fragment a large packet?
  – Reassembly by the next gateway? The receiver?
• Design trade-offs
  – Coordination overhead for identifying the max
  – Overhead of sending many small packets
  – Overhead of buffering packets for reassembly
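
The fragmentation option can be illustrated with a minimal sketch. The helper names `fragment` and `reassemble` and the (offset, more-fragments, data) tuple are hypothetical choices for illustration, not the format from the Cerf/Kahn paper; the sketch also assumes reassembly happens at the receiver.

```python
def fragment(payload: bytes, max_size: int):
    """Split payload into (offset, more_fragments, data) pieces of at most max_size bytes."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    pieces = []
    for offset in range(0, len(payload), max_size):
        data = payload[offset:offset + max_size]
        more = offset + max_size < len(payload)  # False only on the last piece
        pieces.append((offset, more, data))
    return pieces


def reassemble(pieces):
    """Rebuild the payload from fragments that may arrive out of order."""
    ordered = sorted(pieces, key=lambda piece: piece[0])
    return b"".join(data for _, _, data in ordered)
```

The receiver must buffer every piece until the last one arrives, which is the buffering overhead listed among the trade-offs.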

Discussion
• What did they get right?
  – Which ideas were key to the Internet’s success?
  – Which decisions still seem right today?
• What did they miss?
  – Which ideas had to be added later?
  – Which decisions seem wrong in hindsight?
• What would you do in a clean-slate design?
  – If your goal wasn’t to support communication between disparate packet-switched networks
  – Would you do anything differently?

Separating IP from TCP
• Original implementation
  – Only supported ordered reliable byte stream
  – Fine for file transfer and remote login
• Less appropriate for other applications
  – Interactive applications like voice
  – Let application decide whether/how to handle loss
• Reorganization of the original TCP
  – IP: addressing/forwarding of individual packets
  – TCP: services such as flow control & loss recovery
• Alternative transport protocols, e.g., UDP

IP Packets

IP Packet Structure (for IPv4 Packets)
| 4-bit Version | 4-bit Header Length | 8-bit Type of Service (TOS) | 16-bit Total Length (Bytes) |
| 16-bit Identification | 3-bit Flags | 13-bit Fragment Offset |
| 8-bit Time to Live (TTL) | 8-bit Protocol | 16-bit Header Checksum |
| 32-bit Source IP Address |
| 32-bit Destination IP Address |
| Options (if any) |
| Payload |
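
As a rough illustration of this layout, the fixed 20-byte part of the header can be decoded with Python's `struct` module. The function name `parse_ipv4_header` and the dictionary keys are illustrative choices; options, if present, are ignored in this sketch.

```python
import socket
import struct


def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options are not parsed)."""
    if len(raw) < 20:
        raise ValueError("need at least 20 bytes")
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,                    # upper 4 bits
        "header_len_bytes": (ver_ihl & 0x0F) * 4,   # lower 4 bits count 32-bit words
        "tos": tos,
        "total_length": total_len,
        "identification": ident,
        "flags": flags_frag >> 13,                  # top 3 bits
        "fragment_offset": flags_frag & 0x1FFF,     # low 13 bits
        "ttl": ttl,
        "protocol": proto,
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }
```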

IP Header: Version, Length, ToS
• Version number (4 bits)
  – Indicates the version of the IP protocol
  – Necessary to know what other fields to expect
  – E.g., “4” (for IPv4), and sometimes “6” (for IPv6)
• Header length (4 bits)
  – Number of 32-bit words in the header
  – Typically “5” (for a 20-byte IPv4 header)
  – Can be more when “IP options” are used
• Type-of-Service (8 bits)
  – Allow differential treatment of packets
  – E.g., low delay versus high bandwidth

IP Header: Length, Fragments, TTL
• Total length (16 bits)
  – Number of bytes in the packet
  – Maximum size is 65,535 bytes (2^16 - 1)
  – … though underlying link may impose harder limits
• Fragmentation information (32 bits)
  – Packet identifier, flags, and fragment offset
  – Supports dividing a large IP packet into fragments
  – … in case a link cannot handle a large IP packet
• Time-To-Live (8 bits)
  – Used to identify packets stuck in forwarding loops
  – … and eventually discard them from the network

IP Header Fields: Transport Protocol
• Protocol (8 bits)
  – Identifies the higher-level protocol
    • E.g., “6” for Transmission Control Protocol
    • E.g., “17” for the User Datagram Protocol
  – Needed for demultiplexing at receiving host
    • Indicates what kind of header to expect next
[Figure: IP header followed by a TCP header (protocol=6) or a UDP header (protocol=17)]

IP Header: Checksum on the Header
• Checksum (16 bits)
  – Sum of all 16-bit words in the IP packet header
  – If any bits of the header are corrupted in transit
  – … the checksum won’t match at receiving host
  – Receiving host discards corrupted packets
    • Sending host will retransmit the packet, if needed
[Figure: sender computes 134 + 212 = 346; receiver computes 134 + 216 = 350: Mismatch!]
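
A minimal sketch of the header checksum follows. In practice the "sum" is a one's-complement sum of the 16-bit words, and the stored value is the complement of that sum, so a receiver that sums the whole header (checksum field included) gets zero when nothing was corrupted. The function name `internet_checksum` is an illustrative choice.

```python
def internet_checksum(data: bytes) -> int:
    """Return the one's complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # fold any carry out of the low 16 bits back in (end-around carry)
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

A sender would compute this over the header with the checksum field set to zero and then store the result in that field.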

IP Header: To and From Addresses
• Two IP addresses
  – Source IP address (32 bits)
  – Destination IP address (32 bits)
• Destination address
  – Unique identifier for the receiving host
  – Each node can make forwarding decisions
• Source address
  – Unique identifier for the sending host
  – Recipient decides whether to accept packet
  – Enables recipient to reply back to source

Transport Protocols

Role of Transport Layer
• Application layer
  – Between applications (e.g., browsers and servers)
  – E.g., HyperText Transfer Protocol, File Transfer Protocol, Network News Transfer Protocol, …
• Transport layer
  – Between processes (e.g., sockets)
  – Relies on network layer, & serves application layer
  – E.g., TCP and UDP
• Network layer
  – Between nodes (e.g., routers and hosts)
  – Hides details of the link technology
  – E.g., IP

Two Basic Transport Features
• Demultiplexing: port numbers
  [Figure: client host sends a service request for 128.2.194.242:80 (i.e., the Web server) to server host 128.2.194.242, where the OS sits below a Web server (port 80) and an echo server (port 7)]
• Error detection: checksums
  [Figure: checksum over the IP payload to detect corruption]
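
Port-based demultiplexing can be pictured as a table lookup. The `Demultiplexer` class below is a toy stand-in for the OS, with hypothetical method names `bind` and `deliver`; it is a sketch of the idea, not how any real network stack is structured.

```python
class Demultiplexer:
    """Maps destination port numbers to handler callables (a toy stand-in for the OS)."""

    def __init__(self):
        self._handlers = {}

    def bind(self, port, handler):
        """Register a handler for a port; each port can have only one listener."""
        if port in self._handlers:
            raise OSError(f"port {port} already in use")
        self._handlers[port] = handler

    def deliver(self, dst_port, payload):
        """Hand the payload to whichever handler is bound to dst_port."""
        handler = self._handlers.get(dst_port)
        if handler is None:
            return None  # no listener; a real stack would signal an error instead
        return handler(payload)
```

With an echo handler bound to port 7 and a Web handler bound to port 80, a request addressed to port 80 reaches only the Web handler.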

User Datagram Protocol (UDP)
• Datagram messaging service
  – Demultiplexing of messages: port numbers
  – Detecting corrupted messages: checksum
• Lightweight communication between processes
  – Send messages to and receive them from a socket
  – Avoid overhead and delays of ordered, reliable delivery
[Figure: UDP datagram layout: SRC port, DST port, length, checksum, then DATA]

Why Would Anyone Use UDP?
• Fine control over whether and when data are sent
  – As soon as an application process writes into the socket
  – … UDP will package the data and send the packet
• No delay for connection establishment
  – UDP just blasts away without any formal preliminaries
  – … which avoids introducing any unnecessary delays
• No connection state
  – No allocation of buffers, parameters, sequence #s, etc.
  – … making it easier to handle many active clients at once
• Small packet header overhead
  – UDP header is only eight bytes long
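
The eight-byte header is small enough to build by hand. The sketch below packs the four 16-bit fields in the order source port, destination port, length, checksum; the helper names are illustrative, and the checksum is left at zero (which in UDP over IPv4 means "no checksum computed") to keep the example short.

```python
import struct

# Four 16-bit fields in network byte order: src port, dst port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port, dst_port, payload, checksum=0):
    """Prepend an 8-byte UDP header; length covers the header plus the payload."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_datagram(datagram):
    """Return (src_port, dst_port, length, checksum, payload)."""
    src, dst, length, checksum = UDP_HEADER.unpack(datagram[:UDP_HEADER.size])
    return src, dst, length, checksum, datagram[UDP_HEADER.size:length]
```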

Transmission Control Protocol (TCP)
• Stream-of-bytes service
  – Sends and receives a stream of bytes, not messages
• Reliable, in-order delivery
  – Checksums to detect corrupted data
  – Sequence numbers to detect losses and reorder data
  – Acknowledgments & retransmissions for reliable delivery
• Connection oriented
  – Explicit set-up and tear-down of TCP session
• Flow control
  – Prevent overflow of the receiver’s buffer space
• Congestion control (came in late 1980s)
  – Adapt to network congestion for the greater good

Transmission Control Protocol (TCP)

TCP Segment
[Figure: IP packet = IP Hdr + IP Data; the IP Data holds the TCP Hdr + TCP Data (segment)]
• IP packet
  – No bigger than Maximum Transmission Unit (MTU)
  – E.g., up to 1500 bytes on an Ethernet
• TCP packet
  – IP packet with a TCP header and data inside
  – TCP header is typically 20 bytes long
• TCP segment
  – No more than Maximum Segment Size (MSS) bytes
  – E.g., up to 1460 consecutive bytes from the stream
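
The 1460-byte figure follows from the Ethernet MTU minus the two typical 20-byte headers. The sketch below works through that arithmetic and cuts a byte stream into segments; `segment_stream` is a hypothetical helper, and both headers are assumed to carry no options.

```python
MTU = 1500         # bytes, Ethernet
IP_HEADER = 20     # bytes, no IP options (assumption)
TCP_HEADER = 20    # bytes, no TCP options (assumption)
MSS = MTU - IP_HEADER - TCP_HEADER  # 1460 bytes of stream data per segment


def segment_stream(stream: bytes, mss: int = MSS, initial_seq: int = 0):
    """Return (sequence_number, data) pairs, each carrying at most mss bytes."""
    return [(initial_seq + off, stream[off:off + mss])
            for off in range(0, len(stream), mss)]
```

Each segment's sequence number is the position of its first byte in the stream, so a 3000-byte stream yields segments starting at 0, 1460, and 2920.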

TCP Header
| Source port | Destination port |
| Sequence number |
| Acknowledgment |
| HdrLen | 0 | Flags | Advertised window |
| Checksum | Urgent pointer |
| Options (variable) |
| Data |
Flags: SYN, FIN, RST, PSH, URG, ACK

TCP Header: Ports and Seq/Ack Numbers
• Identifying the process end-point
  – Source port
  – Destination port
• Delivering ordered reliable byte stream
  – Sequence number: # of first byte in the segment
  – Acknowledgment: # of next expected byte
[Figure: TCP data within the byte stream (e.g., byte 81); sequence number = 1st byte, acknowledgment number = next byte]

TCP Header: Length and Flags
• Header length
  – Size of the TCP header
  – Usually 20 bytes, but larger if options are used
• Flags to piggyback information
  – SYN: open connection
  – FIN: close connection
  – RST: abort connection
  – ACK: acknowledgment (carried in the acknowledgment # field)
  – PSH: not important
  – URG: not important (relates to Urgent pointer)
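
A minimal sketch of reading the header length and flags from the fixed 20-byte TCP header. The function name `parse_tcp_header` and the dictionary keys are illustrative; options are not parsed, and only the six flags named on the slide are decoded.

```python
import struct

# Bit positions of the six flags within the flags byte.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are not parsed)."""
    if len(raw) < 20:
        raise ValueError("need at least 20 bytes")
    (src, dst, seq, ack, off_byte, flag_byte,
     window, checksum, urgent) = struct.unpack("!HHIIBBHHH", raw[:20])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len_bytes": (off_byte >> 4) * 4,  # upper 4 bits count 32-bit words
        "flags": {name for name, bit in FLAG_BITS.items() if flag_byte & bit},
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }
```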

TCP Header: Checksum and Window
• Checksum
  – Detect corruption of the TCP header and segment
• Advertised window
  – Additional data the receiver can receive
[Figure: byte stream divided into data ACK’d, outstanding un-ack’d data, data OK to send (within the window size), and data not OK to send yet]

TCP Support for Reliable Delivery
• Detect missing data: sequence number
  – Used to detect a gap in the stream of bytes
  – … and for putting the data back in order
• Detect bit errors: checksum
  – Used to detect corrupted data at the receiver
  – … leading the receiver to drop the packet
• Recover from lost data: retransmission
  – Sender retransmits lost or corrupted data
  – Two main ways to detect lost packets
    • Retransmission timeout and fast retransmission
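
How sequence numbers expose a gap and restore order can be sketched with a toy receiver. The `Receiver` class is hypothetical and makes simplifying assumptions: segments never partially overlap, and buffer space is unlimited.

```python
class Receiver:
    """Toy receiver: buffers out-of-order segments and returns cumulative ACKs."""

    def __init__(self, initial_seq=0):
        self.next_expected = initial_seq  # sequence number of the next in-order byte
        self.buffer = {}                  # seq -> data for segments past a gap
        self.delivered = bytearray()      # in-order bytes handed to the application

    def on_segment(self, seq, data):
        """Accept a segment and return the ACK number (next expected byte)."""
        if seq >= self.next_expected and seq not in self.buffer:
            self.buffer[seq] = data
        # Deliver every buffered segment that is now contiguous with the stream.
        while self.next_expected in self.buffer:
            chunk = self.buffer.pop(self.next_expected)
            self.delivered += chunk
            self.next_expected += len(chunk)
        return self.next_expected
```

While a gap remains, every arriving segment produces the same ACK number, which is exactly the duplicate-ACK signal that fast retransmission relies on.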

Automatic Repeat reQuest (ARQ)
• Automatic Repeat reQuest
  – Receiver sends acknowledgment (ACK) when it receives packet
  – Sender waits for ACK and times out if it does not arrive within some time period
• Simplest ARQ protocol
  – Stop and wait
  – Send a packet, stop and wait until ACK arrives
[Figure: sender/receiver timeline over time showing Packet, ACK, and the sender’s Timeout interval]
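
Stop-and-wait can be sketched as a small deterministic simulation. The function `stop_and_wait` and its `drop_attempts` parameter are hypothetical: the caller lists which transmission attempts the channel loses, and only data-packet loss is modeled (ACKs are assumed to always arrive).

```python
def stop_and_wait(packets, drop_attempts=frozenset()):
    """Simulate stop-and-wait ARQ over a channel that loses the listed attempts.

    drop_attempts holds 0-based indices of transmissions, counting
    retransmissions, that the channel drops. Returns (delivered, transmissions).
    """
    delivered = []
    attempt = 0
    for pkt in packets:
        while True:
            lost = attempt in drop_attempts
            attempt += 1
            if not lost:
                delivered.append(pkt)  # receiver gets the packet and ACKs it
                break
            # Otherwise no ACK arrives, the timer expires, and the sender retransmits.
    return delivered, attempt
```

Sending three packets with attempts 1 and 2 dropped takes five transmissions in total, since the second packet goes out three times.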

Reasons for Retransmission
[Figure: three sender/receiver timelines: packet lost (timeout, then retransmission); ACK lost (timeout, DUPLICATE PACKET); early timeout (DUPLICATE PACKETS)]

Fast Retransmission
• Better solution possible under sliding window
  – Although packet n might have been lost
  – … packets n+1, n+2, and so on might get through
• Idea: have the receiver send ACK packets
  – ACK says that receiver is still awaiting nth packet
    • And repeated ACKs suggest later packets have arrived
  – Sender can view the “duplicate ACKs” as an early hint
    • … that the nth packet must have been lost
    • … and perform the retransmission early
• Fast retransmission
  – Sender retransmits data after the “triple duplicate ACK”
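
The triple-duplicate-ACK rule amounts to a counter at the sender. The class below is a minimal sketch with hypothetical names; it only decides when to retransmit and leaves out timers and the window itself.

```python
class FastRetransmitSender:
    """Counts duplicate ACKs and signals a retransmission on the third duplicate."""

    DUP_ACK_THRESHOLD = 3

    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack_no):
        """Return the sequence number to retransmit from, or None."""
        if ack_no == self.last_ack:
            self.dup_count += 1
            if self.dup_count == self.DUP_ACK_THRESHOLD:
                return ack_no  # receiver is still waiting for this byte
        else:
            # A new ACK number means progress, so the duplicate count resets.
            self.last_ack = ack_no
            self.dup_count = 0
        return None
```

The first ACK carrying a given number is not a duplicate, so the retransmission fires on the fourth ACK with the same number.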

TCP Congestion Control

Congestion is Unavoidable in IP
• Best-effort delivery
  – Let everybody send
  – Try to deliver what you can
  – … and just drop the rest
• If many packets arrive in short period of time
  – The node cannot keep up with the arriving traffic
  – … and the buffer may eventually overflow

The Problem of Congestion
• What is congestion?
  – Load is higher than capacity
• What do IP routers do?
  – Drop the excess packets
• Why is this bad?
  – Wasted bandwidth for retransmissions
[Figure: goodput versus load, with a “congestion collapse” region: an increase in load that results in a decrease in useful work done]

Many Important Questions
• How does the sender know there is congestion?
  – Explicit feedback from the network?
  – Inference based on network performance?
• How should the sender adapt?
  – Explicit sending rate computed by the network?
  – End host coordinates with other hosts?
  – End host thinks globally but acts locally?
• What is the performance objective?
  – Maximizing goodput, even if some users suffer more?
  – Fairness? (Whatever the heck that means!)
• How fast should new TCP senders send?

Inferring From Implicit Feedback
• What does the end host see?
  – Round-trip loss
  – Round-trip delay

Host Adapts Sending Rate Over Time
• Congestion window
  – Maximum number of bytes to have in transit
  – I.e., # of bytes still awaiting acknowledgments
• Upon detecting congestion
  – Decrease the window size (e.g., divide in half)
  – End host does its part to alleviate the congestion
• Upon not detecting congestion
  – Increase the window size, a little at a time
  – And see if the packets are successfully delivered
  – End host learns whether conditions have changed
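
This increase-a-little, halve-on-congestion rule is commonly called additive increase, multiplicative decrease (AIMD). A minimal sketch, assuming a hypothetical `aimd_step` helper that is called once per round trip, an increase of one MSS per step, and a floor of one MSS:

```python
MSS = 1460  # bytes; assumed segment size for this example


def aimd_step(cwnd, loss_detected, mss=MSS):
    """Return the next congestion window in bytes."""
    if loss_detected:
        return max(mss, cwnd // 2)  # multiplicative decrease, never below one MSS
    return cwnd + mss               # additive increase


def aimd_trace(cwnd, losses, mss=MSS):
    """Apply aimd_step for each entry in losses and return every window value."""
    trace = []
    for loss in losses:
        cwnd = aimd_step(cwnd, loss, mss)
        trace.append(cwnd)
    return trace
```

Plotting the output of `aimd_trace` over many round trips with occasional losses gives a repeating climb-and-halve pattern.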

Leads to the TCP “Sawtooth”
[Figure: window size versus time; the window is halved at each loss]

Receiver Window vs. Congestion Window
• Flow control
  – Keep a fast sender from overwhelming a slow receiver
• Congestion control
  – Keep a set of senders from overloading the network
• Different concepts, but similar mechanisms
  – TCP flow control: receiver window
  – TCP congestion control: congestion window
  – TCP window: min{congestion window, receiver window}
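
The min{} rule translates directly into how much more a sender may transmit at any moment. The helper name `usable_window` and the `bytes_in_flight` parameter (bytes sent but not yet acknowledged) are assumptions made for this sketch.

```python
def usable_window(cwnd, rwnd, bytes_in_flight):
    """Bytes the sender may still transmit under both windows."""
    window = min(cwnd, rwnd)  # the tighter of congestion and flow control
    return max(0, window - bytes_in_flight)
```

For example, with a congestion window of 8000 bytes, a receiver window of 4000 bytes, and 1000 bytes in flight, the receiver is the bottleneck and 3000 more bytes may be sent.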

How Should a New Flow Start?
Need to start with a small CWND to avoid overloading the network.
But, could take a long time to get started!
[Figure: window versus time t]

“Slow Start” Phase
• Start with a small congestion window
  – Initially, CWND is 1 Max Segment Size (MSS)
  – So, initial sending rate is MSS/RTT
• That could be pretty wasteful
  – Might be much less than the actual bandwidth
  – Linear increase takes a long time to accelerate
• Slow-start phase
  – Sender starts at a slow rate (hence the name)
  – … but increases the rate exponentially
  – … until the first loss event
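
The exponential ramp can be sketched as a window that doubles every round trip. The function `slow_start_trace` and its `loss_at_rtt` parameter are hypothetical; the sketch assumes every segment in a round trip is acknowledged before the next one begins.

```python
def slow_start_trace(rtts, mss=1460, loss_at_rtt=None):
    """Return the congestion window at the start of each RTT during slow start."""
    cwnd = mss
    trace = []
    for rtt in range(rtts):
        trace.append(cwnd)
        if loss_at_rtt is not None and rtt == loss_at_rtt:
            break      # first loss event ends the slow-start phase
        cwnd *= 2      # each ACKed segment adds one MSS, doubling cwnd per RTT
    return trace
```

Measured in segments (`mss=1`), the window grows 1, 2, 4, 8, 16 over five round trips, whereas a linear increase would reach only 5.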

Slow Start and the TCP Sawtooth
[Figure: window versus time t, with exponential “slow start” before the first loss]
Why is it called slow-start? Because TCP originally had no congestion control mechanism. The source would just start by sending a whole receiver window’s worth of data.

Two Kinds of Loss in TCP
• Timeout
  – Packet n is lost and detected via a timeout
    • E.g., because all packets in flight were lost
  – After timeout, blasting away for the entire CWND would trigger a very large burst in traffic
  – So, better to start over with a low CWND
• Triple duplicate ACK
  – Packet n is lost, but packets n+1, n+2, etc. arrive
    • Receiver sends duplicate acknowledgments
  – And the sender retransmits packet n quickly
  – Do a multiplicative decrease and keep going

Repeating Slow Start After Timeout
[Figure: window versus time t, with a timeout followed by slow start in operation until it reaches half of the previous cwnd]
Slow-start restart: Go back to CWND of 1, but take advantage of knowing the previous value of CWND.
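
The two loss reactions and the slow-start restart can be combined in one small state machine. This is a simplified sketch, not the behavior of any particular TCP implementation: the class and method names are hypothetical, the window is counted in segments, and "half of the previous cwnd" is remembered in a threshold variable called `ssthresh` here.

```python
class CongestionWindow:
    """Toy congestion window, in segments, updated once per RTT or per loss event."""

    def __init__(self, mss=1, ssthresh=64):
        self.mss = mss
        self.cwnd = mss
        self.ssthresh = ssthresh  # where slow start hands over to linear increase

    def on_rtt_without_loss(self):
        if self.cwnd < self.ssthresh:
            self.cwnd = min(self.cwnd * 2, self.ssthresh)  # slow start
        else:
            self.cwnd += self.mss                          # additive increase

    def on_triple_duplicate_ack(self):
        # Multiplicative decrease and keep going.
        self.ssthresh = max(self.cwnd // 2, self.mss)
        self.cwnd = self.ssthresh

    def on_timeout(self):
        # Remember half the previous window, then restart slow start from 1 MSS.
        self.ssthresh = max(self.cwnd // 2, self.mss)
        self.cwnd = self.mss
```

After a timeout at a window of 16 segments, the window restarts at 1, doubles up to 8 (half the previous value), and only then switches to adding one segment per round trip.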

What About Inefficiency?
• TCP congestion control is not very efficient
  – The sawtooth behavior is wasteful
  – Short flows never ramp up to max rate
  – Poor performance on high-bandwidth paths
  – Poor performance on long-RTT paths
• Ongoing work on improvements to TCP
  – Better information about network conditions
    • Measurement of available bandwidth on a path
    • Explicit feedback from the routers
  – Better performance under high bandwidth-delay product (e.g., bulk data transfer between labs)

What About Cheating?
• Some folks are more fair than others
  – Running multiple TCP connections in parallel
  – Modifying the TCP implementation in the OS
  – Using the User Datagram Protocol (UDP)
• What is the impact?
  – Good guys slow down to make room for you
  – You get an unfair share of the bandwidth
• Possible solutions?
  – Routers detect cheating and drop excess packets?
  – Peer pressure?
  – Move congestion control to the network?