Protocol implementation
• Next-hop resolution
• Reliability and graceful restart
What is a next-hop
• The destination of the packets I am sending
  – Not the same as the interface
  – An Ethernet interface will have many nodes behind it
  – A directly connected next hop is 1 hop away
• E.g. RSVP sends a PATH message to the next downstream node
  – Next hop may be directly connected (strict ERO)
  – Or not (loose ERO)
• OSPF sends an LS update to the other end of a link or to a neighbor on an Ethernet
  – Always directly connected
• BGP has an iBGP next hop for each of its paths
  – Not directly connected
Next-hop
• If the next hop is not directly connected, the way to reach it depends on the IGP
  – May change when IGP routing changes
  – Will have to use a different interface to reach it
  – Need to keep track of these changes
• Next-hop resolution (see the sketch below)
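To make the dependency on the IGP concrete, here is a minimal, hypothetical sketch (not from any particular router; all names invented) of recursive next-hop resolution: a protocol next hop that is not directly connected is looked up in the IGP-derived RIB to find the connected next hop and outgoing interface actually used to reach it, and it must be re-resolved whenever that IGP route changes.

```
# Hypothetical sketch of recursive next-hop resolution: an iBGP-style next hop
# is looked up in the IGP-derived RIB to find the directly connected next hop
# and outgoing interface used to reach it.

import ipaddress

# IGP-derived RIB: prefix -> (directly connected next hop, outgoing interface)
igp_rib = {
    "10.0.0.0/24": ("192.168.1.2", "eth0"),
    "10.0.1.0/24": ("192.168.2.2", "eth1"),
}

def resolve(next_hop: str):
    """Return the (connected next hop, interface) used to reach `next_hop`."""
    addr = ipaddress.ip_address(next_hop)
    best = None
    for prefix, via in igp_rib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, via)                 # longest-prefix match
    return best[1] if best else None          # None: currently unreachable

# A path with next hop 10.0.0.5 goes out eth0 today; if the IGP route to
# 10.0.0.0/24 later changes to eth1, the next hop must be re-resolved.
print(resolve("10.0.0.5"))    # ('192.168.1.2', 'eth0')
```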
Next-hop resolution
• Periodic resolution
  – May take a bit more time
  – But next hops will not be too many
    • Or will they? Tunnels, VLANs …
  – Quagga uses this approach
    • Through the IPV4_LOOKUP_NEXTHOP command
• Registration/notification (see the sketch below)
  – RSVP would tell zebra which next hops it is interested in
  – Zebra will notify RSVP when something changes in the IGP path to it
  – Better scaling for RSVP
  – Difficult to ensure good scaling inside zebra
    • Various protocols may register 1000s of next hops
  – More complex code in zebra
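A minimal sketch of the registration/notification alternative, with invented names (NexthopTracker, register_nexthop) rather than the actual zebra API: a client protocol registers the next hops it cares about once, and the RIB process calls back only when the resolution of one of them changes.

```
# Hypothetical registration/notification interface between a protocol
# (e.g. RSVP) and a zebra-like RIB process. All names are invented.

class NexthopTracker:
    def __init__(self):
        self.watchers = {}      # next hop -> list of callbacks to notify
        self.resolution = {}    # next hop -> current resolution result

    def register_nexthop(self, nh, callback):
        """Client protocol asks to be told when the path to `nh` changes."""
        self.watchers.setdefault(nh, []).append(callback)

    def igp_changed(self, resolve):
        """Called after an IGP event; re-resolve only the registered next hops."""
        for nh, callbacks in self.watchers.items():
            new = resolve(nh)
            if new != self.resolution.get(nh):
                self.resolution[nh] = new
                for cb in callbacks:          # notify interested protocols only
                    cb(nh, new)

# Usage: RSVP registers once, then simply reacts to notifications.
tracker = NexthopTracker()
tracker.register_nexthop("10.0.0.5", lambda nh, r: print("re-route LSP via", r))
```

The scaling concern on the zebra side is visible here: with thousands of registered next hops, every IGP event triggers a re-resolution pass over all of them.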
Network reliability
• Availability: how many nines? (worked out below)
  – 99.999% is 5.26 min of down time/year
  – 99.9999% is 31.5 sec of down time/year
• Telephone networks are between 5 and 6 nines
  – The Internet will have to get there
  – Currently at 4 nines? (vendors claim 5)
  – Very important with the new types of traffic
    • VoIP, IPTV
• What can go wrong (% of failures for the US telephone network ca. 1992):
  – Hardware failures (19%)
  – Software failures (14%)
  – Human errors (49%)
  – Vandalism/terrorism
  – Acts of nature (11%)
  – Overload (6%, but had the largest impact on customers)
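The downtime figures follow directly from the fraction of a year the network is allowed to be down; a quick check:

```
# Downtime per year implied by an availability target ("number of nines").
SECONDS_PER_YEAR = 365.25 * 24 * 3600

for availability in (0.99999, 0.999999):      # five and six nines
    downtime = (1 - availability) * SECONDS_PER_YEAR
    print(f"{availability:.6%} -> {downtime / 60:.2f} min = {downtime:.1f} s per year")

# 99.999000% -> 5.26 min = 315.6 s per year
# 99.999900% -> 0.53 min = 31.6 s per year
```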
Hardware failures
• Link failures
  – Protocols can cope with that
    • Re-route, may be slow
    • More aggressive repair methods – we will see them later
• Router failures
  – Cannot do much, just add redundancy
    • Power supplies, fans, disks, etc.
  – Line-card failure is similar to a link failure
  – Control processor failure is more serious
    • Always have two of them
    • Primary and backup
Modern router architectures
• Dual controllers
  – For running the control plane
• Multiple line-cards
  – Can operate without the controllers
  – Router can forward traffic even when the control plane crashes
  – Called non-stop forwarding or head-less operation
Software failures
• When the primary fails, start using the backup
  – Switchover
• Must be as fast as possible
  – Things in the network change in the meanwhile
  – Need to minimize this window
• What happens with the control software?
  – Need to keep the primary and backup instances in sync
  – How tight is this synchronization?
Tight synchronization
• Both primary and backup are active; keep them in sync by:
• Sending them both the same input (i.e. duplicating control packets)
  – Fastest possible switchover
  – Expensive, may need to duplicate packets
  – Does not work for TCP-based protocols
• Or having the primary keep sending state updates to the backup (see the sketch below)
  – May need to send too many messages
• Being totally in sync is not easy
  – Needs transactional communication
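A minimal sketch, with invented names, of the state-update style of tight synchronization: the primary sends each state change to the backup and only considers it committed once the backup has acknowledged it, which is the "transactional" part.

```
# Hypothetical primary-to-backup state replication: every change is sent as
# an update and acknowledged before it is considered committed, so both
# sides agree at the moment of a switchover.

class Backup:
    def __init__(self):
        self.state = {}
    def apply(self, key, value):
        self.state[key] = value      # apply the update ...
        return True                  # ... and acknowledge it

class Primary:
    def __init__(self, backup):
        self.state = {}
        self.backup = backup
    def update(self, key, value):
        # Transactional style: commit locally only after the backup has it.
        if self.backup.apply(key, value):
            self.state[key] = value
        else:
            raise RuntimeError("backup did not acknowledge; cannot commit")

backup = Backup()
primary = Primary(backup)
primary.update("lsp-17", {"label": 1017, "out_if": "eth1"})
assert primary.state == backup.state     # ready for an instant switchover
```

The cost named on the slide is the per-change message and acknowledgement; with many state changes this traffic between the controllers becomes significant.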
Loose synchronization
• Backup is idle
  – But we keep the configuration up to date
  – Each configuration change on the primary is mirrored on the backup (see the sketch below)
• The backup instance is started when the primary fails
  – Switchover will take longer
• Much, much simpler
  – Configuration changes are much less frequent
• Variation:
  – Keep only the RIB process in sync on both primary and backup
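A sketch of configuration mirroring, again with made-up names: each accepted configuration command is replayed to the idle backup, so that after a failure the backup instance can be started from an up-to-date configuration.

```
# Hypothetical config mirroring for loose synchronization: the backup stays
# idle but keeps an identical configuration (command log).

class ConfigMirror:
    def __init__(self):
        self.primary_cfg = []
        self.backup_cfg = []

    def commit(self, command):
        """Apply a config change on the primary and mirror it to the backup."""
        self.primary_cfg.append(command)   # applied and running on the primary
        self.backup_cfg.append(command)    # only stored on the idle backup

mirror = ConfigMirror()
mirror.commit("interface eth0 ip address 192.168.1.1/24")
mirror.commit("router ospf network 192.168.1.0/24 area 0")
# After a primary failure, the backup instance starts from backup_cfg and
# then relearns all dynamic state from the network (the slower switchover).
```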
Non-stop forwarding
• Key concept: forwarding happens in the line cards
  – Even if the control processor fails, forwarding can continue
  – Non-stop forwarding, head-less operation
• Old common sense: when router s/w crashes, do not use the router
  – But with head-less operation it is OK to keep using routers whose s/w crashed
  – Assuming their s/w will be operational again soon
Special case
• Planned restart
  – For s/w upgrade
    • These are a significant percentage of downtime
  – For refresh
    • Memory is leaking but s/w is still operational
    • Restart to get a clean start
  – I can use graceful restart
Graceful restart
• Other routers in the network will keep using a neighbor router
  – Even if it looks like its control plane has failed
  – Assuming it will come back soon
• Needs coordination
  – The failed router needs to do some special processing when it comes back
  – It has to tell its neighbors beforehand that it supports graceful restart
• Zero impact on the network
  – The failed router will have the chance to restart its s/w and come back
  – Nobody in the rest of the network will know that something happened
How does it work
• Used by all protocols by now
  – OSPF, BGP, RSVP-TE…
• The neighbor will discover that the router is dead or has restarted
  – HELLO timeout, different information in the HELLOs, etc.
  – But will ignore it for a certain time period (see the sketch below)
• If the failed router comes back within this period
  – It will re-sync its state (database exchange for OSPF, resend all the LSPs for RSVP, …)
  – And all is back to normal
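A minimal, protocol-agnostic sketch of the neighbor's (helper's) side, with invented names: when the HELLOs stop, the routes learned from that neighbor are held for a grace period instead of being flushed, and are removed only if the neighbor fails to come back and re-sync in time.

```
# Hypothetical helper-side logic for graceful restart: hold the failed
# neighbor's routes for a grace period instead of flushing them immediately.

import time

GRACE_PERIOD = 120.0     # seconds the neighbor gets to restart and re-sync

class NeighborState:
    def __init__(self):
        self.routes_held = True
        self.failed_at = None

    def hello_timeout(self):
        """Neighbor stopped sending HELLOs; start the grace period."""
        self.failed_at = time.monotonic()

    def neighbor_resynced(self):
        """Neighbor came back and re-synced (DB exchange / LSP refresh)."""
        self.failed_at = None          # back to normal, keep the routes

    def tick(self):
        """Periodic check: flush the routes only if the grace period expired."""
        if self.failed_at is not None and \
           time.monotonic() - self.failed_at > GRACE_PERIOD:
            self.routes_held = False   # give up: remove routes and re-route
```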
Example: RSVP
• Use HELLOs
• Special recovery label messages
• The restarting router needs to remember the labels it allocated before the crash
  – Where?
    • Shared memory
    • Recover them from the forwarding plane
  – Why? (see the sketch below)
    • Must use the same labels again
    • Must make sure it does not use an already-allocated label for some other LSP
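A small sketch (invented names, not the actual RSVP-TE code) of why the recovered labels matter: after the restart, the label allocator is seeded with the labels still installed in the forwarding plane, so they are reused for their original LSPs and never handed out to new ones.

```
# Hypothetical label manager for RSVP-TE graceful restart: labels recovered
# from the forwarding plane are reserved before any new allocation happens.

class LabelManager:
    def __init__(self, recovered):
        # recovered: LSP name -> label read back from the forwarding plane
        self.by_lsp = dict(recovered)
        self.in_use = set(recovered.values())
        self.next_free = 1000

    def label_for(self, lsp):
        """Reuse the pre-crash label if we had one, otherwise allocate a new one."""
        if lsp in self.by_lsp:
            return self.by_lsp[lsp]           # same label as before the crash
        while self.next_free in self.in_use:  # never hand out a recovered label
            self.next_free += 1
        self.by_lsp[lsp] = self.next_free
        self.in_use.add(self.next_free)
        return self.by_lsp[lsp]

mgr = LabelManager(recovered={"lsp-A": 1001, "lsp-B": 1002})
assert mgr.label_for("lsp-A") == 1001   # existing LSP keeps its label
assert mgr.label_for("lsp-C") == 1000   # new LSP gets a label that is free
```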
Example: OSPF
• The trick is to re-establish the adjacencies after a failure
• Remember the set of neighbors
  – In shared memory or in the backup controller
• After restart do not originate any LSAs
• Just re-establish adjacencies and re-sync the database (see the sketch below)
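A sketch of the restart sequence on the OSPF side, under the same hedging (the function names are placeholders): the remembered neighbor set is used to bring adjacencies back and exchange databases, and only then does the router resume originating its own LSAs.

```
# Hypothetical OSPF graceful-restart sequence: neighbors remembered across
# the crash are re-contacted, databases are re-synced, and only afterwards
# does the router originate its own LSAs again.

def graceful_ospf_restart(remembered_neighbors, form_adjacency,
                          exchange_database, originate_own_lsas):
    for nbr in remembered_neighbors:      # stored in shared memory / backup controller
        form_adjacency(nbr)               # bring the adjacency back up
        exchange_database(nbr)            # DB description / LS request / LS update
    originate_own_lsas()                  # only after all adjacencies are back
```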
Graceful restart catches
• All routers in the network should implement this for it to work
• Mostly for planned restarts:
  – S/w upgrades
  – Refreshes (if a router runs low on memory)
  – But it is possible to use it for crashes too!
• It cannot work if something changes in the network while the restart is going on
  – There may be routing loops
Router self-monitoring
• Automatically restart failed or stuck processes
• A separate monitor process (see the sketch below)
  – Keeps an eye on the other processes
  – If there is a failure, the failed process is restarted
    • Of course it may fail again
  – Heart-beats to determine liveness
  – Failure may not necessarily be a crash
    • Could be a software bug that causes an infinite loop or very, very slow processing
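A minimal sketch of such a monitor (the supervised daemon and its options are placeholders): each supervised process is expected to report a heart-beat regularly; a process that exits, or whose heart-beats stop arriving because it is hung or pathologically slow, is restarted.

```
# Hypothetical process monitor: restarts a supervised process when it exits
# or when its heart-beats stop (crash, infinite loop, or extreme slowness).

import subprocess
import time

HEARTBEAT_TIMEOUT = 10.0     # seconds without a heart-beat before restarting

class Supervised:
    def __init__(self, cmd):
        self.cmd = cmd
        self.proc = subprocess.Popen(cmd)
        self.last_heartbeat = time.monotonic()

    def heartbeat(self):
        """Called whenever the process reports liveness (e.g. over a pipe or socket)."""
        self.last_heartbeat = time.monotonic()

    def check(self):
        dead = self.proc.poll() is not None                     # it exited/crashed
        stuck = time.monotonic() - self.last_heartbeat > HEARTBEAT_TIMEOUT
        if dead or stuck:
            self.proc.kill()                                    # no-op if already dead
            self.proc.wait()
            self.proc = subprocess.Popen(self.cmd)              # restart it
            self.last_heartbeat = time.monotonic()

# Example usage with a placeholder routing daemon:
# daemon = Supervised(["./ospfd", "--config", "ospfd.conf"])
# while True:
#     daemon.check()
#     time.sleep(1)
```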
Why is it important
• Remember the PoP structure
  – Need dual routers for reliability
  – If I had a single router that was extra-reliable I could save a lot of money
Issues
• Strict isolation
  – VMs
  – Other methods
• Global resource coordination
  – For example, memory