15 744 Computer Networks Background Material 1 Getting
15 -744 Computer Networks Background Material 1: Getting stuff from here to there
Outline • Addressing • Forwarding • Name Lookup • Routing • Other 2
Addressing yak… • Who should receive the message? • Addressing determines which packets are kept and which are packets are thrown away 3
Addressing Alternatives • Packets can be sent to: • Unicast – one destination • Multicast – group of nodes (e. g. “everyone playing Quake”) • Broadcast – everybody (in scope) • Dynamic addresses (e. g. Appletalk) • Pick an address at random • Broadcast “is anyone using address XX? ” • If yes, repeat • Address assignment (e. g. Ethernet, IP) 4 4
Ethernet Frame Structure • Sending adapter encapsulates IP datagram (or other network layer protocol packet) in Ethernet frame 5 5
Ethernet Frame Structure (cont. ) • Addresses: 6 bytes • Each adapter is given a globally unique address at manufacturing time • Address space is allocated to manufacturers • 24 bits identify manufacturer • E. g. , 0: 0: 15: * 3 com adapter • Frame is received by all adapters on a LAN and dropped if address does not match • Special addresses • Broadcast – FF: FF: FF: FF is “everybody” • Range of addresses allocated to multicast • Adapter maintains list of multicast groups node is interested in 6 6
IP Addresses • Fixed length: 32 bits • Initial classful structure (1981) (not relevant now!!!) • Total IP address size: 4 billion • Class A: 128 networks, 16 M hosts • Class B: 16 K networks, 64 K hosts • Class C: 2 M networks, 256 hosts High Order Bits 0 10 110 Format 7 bits of net, 24 bits of host 14 bits of net, 16 bits of host 21 bits of net, 8 bits of host Class A B C 7
IP Address Classes (Some are Obsolete) Network ID Host ID 8 Class A 0 Network ID 16 24 32 Host ID Class B 10 Class C 110 Class D 1110 Multicast Addresses Class E 1111 Reserved for experiments 8
IP Addresses: How to Get One? Network (network portion): • Allocated from ISP’s address space: ISP's block 11001000 00010111 00010000 200. 23. 16. 0/20 Organization 0 11001000 00010111 00010000 200. 23. 16. 0/23 Organization 1 11001000 00010111 00010010 0000 200. 23. 18. 0/23 Organization 2. . . 11001000 00010111 00010100 0000 …. 200. 23. 20. 0/23 …. Organization 7 11001000 00010111 00011110 0000 200. 23. 30. 0/23 9
IP Addresses: How to Get One? • How does an ISP get block of addresses? • From Regional Internet Registries (RIRs) • ARIN (North America, Southern Africa), APNIC (Asia-Pacific), RIPE (Europe, Northern Africa), LACNIC (South America) • How about a single host? • Hard-coded by system admin in a file • DHCP: Dynamic Host Configuration Protocol: dynamically get address: “plug-and-play” • Host broadcasts “DHCP discover” msg • DHCP server responds with “DHCP offer” msg • Host requests IP address: “DHCP request” msg • DHCP server sends address: “DHCP ack” msg 10
Subnet Addressing RFC 917 (1984) • Class A & B networks too big • Very few LANs have close to 64 K hosts • For electrical/LAN limitations, performance or administrative reasons • Need simple way to get multiple “networks” • Use bridging, multiple IP networks or split up single network address ranges (subnet) • CMU case study in RFC • Chose not to adopt – concern that it would not be widely supported 11
Classless Inter-Domain Routing (CIDR) – RFC 1338 • Allows arbitrary split between network & host part of address • Do not use classes to determine network ID • Use common part of address as network number • E. g. , addresses 192. 4. 16 - 192. 4. 31 have the first 20 bits in common. Thus, we use these 20 bits as the network number 192. 4. 16/20 12
Outline • Addressing • Forwarding • Name Lookup • Routing • Other 13
Internet[work] • A collection of interconnected networks • Host: network endpoints (computer, PDA, light switch, …) • Router: node that connects networks Internet[work] • How do we forward messages? 14
Sharing a Medium We need to figure out a few things: • Encoding: how to encode a sequence of bits over the link? • E. g. , NRZ, Manchester, 4 B/5 B • Framing: how to mark the boundaries of a frame? • E. g. , bit-oriented, byte-oriented, or clock-based framing • Error detection • E. g. , parity, checksums, CRC • Error recovery: acknowledgements, timeout • Medium access control 15
Ethernet MAC (CSMA/CD) • Carrier Sense Multiple Access/Collision Detection Packet? Sense Carrier No Send Detect Collision Yes Discard Packet attempts < 16 Jam channel b=Calc. Backoff(); wait(b); attempts++; attempts == 16 16
Scale yak… • What breaks when we keep adding people to the same wire? 17
Scale yak… • What breaks when we keep adding people to the same wire? • Only solution: split up the people onto multiple wires • But how can they talk to each other? 18
Problem 1 – Reconnecting LANs yak… • How do you specify a destination? • When should these boxes forward packets between wires? • How does your packet find its way? 19
1. Global Address Example Packet R Sender R 2 1 R 1 4 R 3 3 2 1 1 R 2 4 3 R 4 R 2 R 3 4 R 3 3 R Receiver 20
Transparent Bridges / Switches • Design goals: • Self-configuring without hardware or software changes • Bridge do not impact the operation of the individual LANs • Three parts to making bridges transparent: 1) Learning addresses/host locations 2) Forwarding frames 3) Spanning tree algorithm 21
Learning and Frame Forwarding Bridge 2 MAC Address A 21032 C 9 A 591 99 A 323 C 90842 8711 C 98900 AA 301 B 2369011 C 695519001190 Port Age 1 2 2 36 2 3 16 01 15 11 1 3 • A table entry: A machine with MAC Address lies in the direction of number port of the bridge • For every packet, the bridge “looks up” the entry for the packets destination MAC address and forwards the packet on that port. • Other packets are broadcast – why? • Timer is used to flush old entries • The switch “learns” what MAC addresses are reachable from each of its ports. 22
Spanning Tree Bridges • More complex topologies can provide redundancy. • But can also create loops. • What is the problem with loops? • Solution: spanning tree host host Bridge host host 23
Original IP Route Lookup • Address would specify prefix forwarding table • Simple lookup • www. cmu. edu address 128. 2. 11. 43 • Class B address – class + network is 128. 2 • Lookup 128. 2 in forwarding table • Prefix – part of address that really matters for routing • Forwarding table contains • List of class+network entries • A few fixed prefix lengths (8/16/24) • Large tables • 2 Million class C networks 24
Classless Inter-Domain Routing (CIDR) – RFC 1338 • Allows arbitrary split between network & host part of address • Do not use classes to determine network ID • Use common part of address as network number • E. g. , addresses 192. 4. 16 - 192. 4. 31 have the first 20 bits in common. Thus, we use these 20 bits as the network number 192. 4. 16/20 • Enables more efficient usage of address space (and router tables) How? • Use single entry for range in forwarding tables • Combined forwarding entries when possible 25
Longest Prefix Match: Search for the routing table entry that has the longest match with the prefix of the destination IP address 128. 143. 71. 21 1. Search for a match on all 32 bits 2. Search for a match for 31 bits …. . 32. Search for a match on 0 bits Host route, loopback entry 32 -bit prefix match Default route is represented as 0. 0/0 0 -bit prefix match The longest prefix match for 128. 143. 71. 21 is for 24 bits with entry 128. 143. 71. 0/24 packet will be sent to R 4 26
Route Aggregation • Longest prefix match algorithm permits the aggregation of prefixes with identical next hop address to a single entry Destination Next Hop 20. 2. 0. 0/16 R 2 20. 1. 1. 0/28 R 2 Destination Next Hop 20. 0/14 R 2 27
Other ways to forward: 2. Source Routing Example Packet R 2, R 3, R R 1, R 2, R 3, R 2 Sender 1 R 1 4 2 3 1 1 R 2 4 3 R 3, R 2 R 3 4 3 R Receiver 28
3. Virtual Circuits Example Packet Sender 5 5 2 1 R 1 4 3 2 1 R 2 4 conn 5 3 1 conn 5 4 3 5 2 R 3 4 3 5 Receiver conn 5 3 29
3. Virtual Circuit IDs/Switching: Label (“tag”) Swapping A 1 3 1 2 R 2 3 4 1 R 1 2 4 B 3 R 4 1 2 R 3 3 2 Dst 4 4 • Global VC ID allocation -- ICK! Solution: Per-link uniqueness. Change VCI each hop. Input Port R 1: 1 R 2: 2 R 4: 1 Input VCI 5 9 2 Output Port Output VCI 3 9 4 2 3 5 30
Comparison of forwarding approaches Source Routing Global Addresses Virtual Circuits Header Size Worst OK – Large address Best Router Table Size None Number of hosts (prefixes) Number of circuits Forward Overhead Best Prefix matching (Worst) Pretty Good Setup Overhead None Connection Setup Tell all routers and Tear down circuit and re-route Error Recovery 9 -20 -07 Tell all hosts Lecture 7: Addressing/Forwarding 31
Outline • Addressing • Forwarding • Name Lookup • Routing • Other 32
Link Layer Address Lookup • How does one find the Ethernet address of an IP host? • ARP (Address Resolution Protocol) • Broadcast search for IP address • E. g. , “who-has 128. 2. 184. 45 tell 128. 2. 206. 138” sent to Ethernet broadcast (all FF address) • Destination responds (only to requester using unicast) with appropriate 48 -bit Ethernet address • E. g, “reply 128. 2. 184. 45 is-at 0: d 0: bc: f 2: 18: 58” sent to 0: c 0: 4 f: d: ed: c 6 33
DNS Design: Zone Definitions • Zone = contiguous section of name space root org edu com uk net gwu ucb cmu cs cmcl ece bu mit ca • E. g. , Complete tree, single node or subtree • A zone has an associated set of name servers • Must store list of names and tree links Subtree Single node Complete Tree 34
Typical Resolution www. cs. cmu. edu Client Local DNS server . edu u. cm s c. www edu. u 1. cm s n NS NS ns 1. cs. cm u. edu Aw ww =IP a dd r root & edu DNS server ns 1. cmu. edu DNS server ns 1. cs. cmu. edu DNS server 35
Workload and Caching • Are all servers/names likely to be equally popular? • Why might this be a problem? How can we solve this problem? • DNS responses are cached • Quick response for repeated translations • Other queries may reuse some parts of lookup • NS records for domains • DNS negative queries are cached • Don’t have to repeat past mistakes • E. g. misspellings, search strings in resolv. conf • Cached data periodically times out • Lifetime (TTL) of data controlled by owner of data • TTL passed with every record 36
Subsequent Lookup Example root & edu DNS server ftp. cs. cmu. edu Client Local DNS server ftp . cs . cm =IP u. e cmu. edu DNS server ad dr du cs. cmu. edu DNS server 37
Outline • Addressing • Forwarding • Name Lookup • Routing • Other 38
1. Global Address Example Packet R Sender R 2 1 R 1 4 R 3 3 2 1 1 R 2 4 3 R 4 R 2 R 3 4 R 3 3 R Receiver 39
Distance-Vector Routing Initial Table for A Dest Cost Next Hop A 0 A B 4 B C – D – E 2 E F 6 F E 3 C 1 1 F 2 6 1 A 3 4 D B • Idea • At any time, have cost/next hop of best known path to destination • Use cost when no path known • Initially • Only have entries for directly connected nodes 40
Distance-Vector Update z d(z, y) c(x, z) y x d(x, y) • Update(x, y, z) d c(x, z) + d(z, y) # Cost of path from x to y with first hop z if d < d(x, y) # Found better path return d, z # Updated cost / next hop else return d(x, y), nexthop(x, y) # Existing cost / next hop 41
Distance Vector: Link Cost Changes Link cost changes: • Good news travels fast • Bad news travels slow “count to infinity” problem! 60 X 4 Y 50 1 Z algorithm continues on! 42
Distance Vector: Split Horizon If Z routes through Y to get to X : • Z does not advertise its route to X back to Y 60 X 4 Y 1 50 Z algorithm terminates ? ? ? 43
Distance Vector: Poison Reverse If Z routes through Y to get to X : l l l Z tells Y its (Z’s) distance to X is infinite (so Y won’t route to X via Z) Eliminates some possible timeouts with split horizon Will this completely solve count to infinity problem? 60 X 4 Y 50 1 Z algorithm terminates 44
Poison Reverse Failures Table for A Table for B Table for D Table for F Dst Cst Hop C 7 F C 8 A C 9 B C 1 C Forced Update Table for A Dst Cst Hop C – Table for F Dst Cst Hop C – u Forced Update d p A Table for A Dst Cst Hop C 13 D Forced Update Better Route 1 Table for B Dst Cst Hop C 14 A Forced Update Dst Cst Hop C 19 D Forced Update • • • C 1 4 B D Table for A F 6 1 Dst Cst Hop C 15 B • • “Count to infinity” Solution • • Make “infinity” smaller What is upper bound on maximum path length? 45
Link State Protocol Concept • Every node gets complete copy of graph • Every node “floods” network with data about its outgoing links • Every node computes routes to every other node • Usingle-source, shortest-path algorithm • Process performed whenever needed • When connections die / reappear 46
Sending Link States by Flooding • X Wants to Send Information • Sends on all outgoing links • When Node B Receives Information from A • Send on all links other than A X A C B D X A C B (a) X A C B (c) D (b) D X A C B D (d) 47
Comparison of LS and DV Algorithms Message complexity • LS: with n nodes, E links, O(n. E) messages • DV: exchange between neighbors only O(E) Space requirements: • LS maintains entire topology • DV maintains only neighbor state Speed of Convergence • LS: Complex computation • But…can forward before computation • may have oscillations • DV: convergence time varies • may be routing loops • count-to-infinity problem • (faster with triggered updates) 48
Routing Hierarchies • Flat routing doesn’t scale • Storage Each node cannot be expected to store routes to every destination network • Convergence times increase • Communication Total message count increases • Key observation • Need less information with increasing distance to destination • Need lower diameters networks • Solution: area hierarchy 49
Routing Hierarchy Area-Border Router Backbone Areas Lower-level Areas • Partition Network into “Areas” • Within area • Each node has routes to every other node • Outside area • Each node has routes for other top-level areas only • Inter-area packets are routed to nearest appropriate border router • Constraint: no path between two sub-areas of an area can exit that area 50
Area Hierarchy Addressing 1 2 2. 1 1. 1 2. 2. 1 1. 2. 2 start end 3 3. 2. 1 3. 2 51
Path Sub-optimality • Can result in sub-optimal paths 1 2 2. 1 1. 1 2. 2. 1 1. 2. 1 start end 3. 2. 1 3 3 hop red path vs. 2 hop green path 3. 1 3. 2 52
Outline • Addressing • Forwarding • Name Lookup • Routing • Other 53
IP Service Model • Best effort service • Datagram • Each packet self-contained • All information needed to get to destination • No advance setup or connection maintenance • Analogous to letter or telegram 0 4 version IPv 4 Packet Format 8 HLen 12 19 TOS Identifier TTL 16 24 28 31 Length Flag Protocol Offset Checksum Header Source Address Destination Address Options (if any) Data 54
IP Fragmentation Example Length = 1500, M=1, Offset = 0 host router IP Header MTU = 1500 Length = 2000, M=1, Offset = 0 IP Header Length = 1840, M=0, Offset = 1980 IP Header 1480 bytes Length = 520, M=1, Offset = 1480 IP Data 1980 bytes IP Data 1820 bytes IP Data IP Header IP Data Length = 1500, M=1, Offset = 1980 IP Header IP Data 1480 bytes 500 bytes Length = 360, M=0, Offset = 3460 IP Header IP Data 340 bytes 55
Hybrid Solutions: MPLS core, IP interface MPLS tag assigned IP IP A MPLS tag stripped IP 1 3 1 2 R 2 3 4 IP C 1 R 1 2 B 4 3 R 4 1 2 R 3 3 2 4 4 D MPLS forwarding in core 56
Important Concepts • IP service • Best effort, simplicity of routers • Allows highly decentralized implementation • Each step involves determining next hop • IP forwarding • global addressing, alternatives, lookup tables • IP addressing • hierarchical, CIDR • IP packets • header fields, fragmentation 57
Dealing with Errors: Stop and Wait • Packets can get lost, corrupted, or duplicated. • Error detection or correction turns corrupted packet in lost or correct packet • Duplicate packet: use sequence numbers. • Lost packet: time outs and acknowledgements. • Window based flow control: more aggressive use of sequence numbers (see transport lectures). Sender Receiver 81
Ethernet Frame Structure • Sending adapter encapsulates IP datagram (or other network layer protocol packet) in Ethernet frame 82 82
Ethernet Frame Structure (cont. ) • Preamble: 8 bytes • 101010… 1011 • Used to synchronize receiver, sender clock rates • CRC: 4 bytes • Checked at receiver, if error is detected, the frame is simply dropped 83 83
Ethernet Frame Structure (cont. ) • Each protocol layer needs to provide some hooks to upper layer protocols • Demultiplexing: identify which upper layer protocol packet belongs to • E. g. , port numbers allow TCP/UDP to identify target application • Ethernet uses Type field • Type: 2 bytes • Indicates the higher layer protocol, mostly IP but others may be supported such as Novell IPX and Apple. Talk 84 84
Ethernet Backoff Calculation • Exponentially increasing random delay • Infer senders from # of collisions • More senders increase wait time • First collision: choose K from {0, 1}; delay is K x 512 bit transmission times • After second collision: choose K from {0, 1, 2, 3}… • After ten or more collisions, choose K from {0, 1, 2, 3, 4, …, 1023} 85
- Slides: 62