Case Studies on Intra-Domain Routing Instability
Zhang Shu
Communications Research Laboratory, Japan (to be renamed National Institute of Information and Communications Technology)
APAN 17, Engineering Session, 1/30/2004, Hawaii
Overview
• What is routing instability?
• Methodology of the measurement
• Case study 1: WIDE Internet
• Case study 2: APAN Tokyo-XP
• Conclusion and future work
Routing Instability
• Routing instability
    • Also called route flaps
    • Unexpected topology change
• Negative effects
    • Packet loss
    • Increased router load
    • Wasted bandwidth
• Causes
    • Link failure, software bug
• Types of routing instability
    • Inter-domain
    • Intra-domain
Methodology
• Use “tcpdump” to collect link-state routing messages
• Then analyze the routing messages with self-made tools
    • Ospfanaly
    • Some other scripts, including a CGI Perl script for viewing the statistical results on the web
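The analysis step can be illustrated with a minimal sketch. This is not Ospfanaly's actual code; the function name, the record layout, and the definition of a "change" (an LSA whose content differs from the last copy seen, so that periodic refreshes are not counted) are all assumptions made for illustration. It takes records already decoded from a capture and counts LSA changes per month.

```python
from collections import Counter
from datetime import datetime, timezone


def count_lsa_changes(records):
    """Count LSA content changes per "YYYY/MM" bucket.

    records: iterable of (unix_time, lsa_key, body) in capture order.
    lsa_key is a hypothetical identifier such as
    (ls_type, link_state_id, advertising_router); body is the LSA content
    with age, sequence number and checksum left out, so that a plain
    refresh compares equal to the previous copy.
    """
    last_body = {}       # most recent body seen for each LSA
    changes = Counter()  # "YYYY/MM" -> number of changes
    for unix_time, lsa_key, body in records:
        previous = last_body.get(lsa_key)
        if previous is not None and previous != body:
            stamp = datetime.fromtimestamp(unix_time, tz=timezone.utc)
            changes[stamp.strftime("%Y/%m")] += 1
        last_body[lsa_key] = body
    return changes
```

The first sighting of an LSA is not counted as a change here; whether it should be is a design choice that depends on how the capture was started.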
OSPF
• Open Shortest Path First
    • A widely deployed intra-domain link-state routing protocol
    • OSPFv2 and OSPFv3
• Link state advertisements (LSAs)
    • OSPFv2
        • Router-LSA
        • Network-LSA
        • Network-summary-LSA
        • ASBR-summary-LSA
        • AS-external-LSA
    • OSPFv3
        • Seven kinds of LSAs defined in RFC 2740
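As a sketch of how the OSPFv2 LSA types above can be recognized in captured data, the 20-byte LSA header defined in RFC 2328 can be unpacked as follows. The type codes 1 to 5 follow RFC 2328, labelled here with the names used on the slide; the function name and the returned dictionary layout are assumptions for illustration only.

```python
import socket
import struct

# OSPFv2 LS type codes (RFC 2328), labelled with the names used above.
OSPFV2_LSA_TYPES = {
    1: "Router-LSA",
    2: "Network-LSA",
    3: "Network-summary-LSA",
    4: "ASBR-summary-LSA",
    5: "AS-external-LSA",
}

# LS age, options, LS type, link state ID, advertising router,
# LS sequence number, LS checksum, length (network byte order, 20 bytes).
LSA_HEADER = struct.Struct("!HBB4s4sIHH")


def parse_lsa_header(data):
    """Decode the fixed 20-byte OSPFv2 LSA header at the start of data."""
    (age, options, ls_type, ls_id, adv_router,
     seq, checksum, length) = LSA_HEADER.unpack(data[:LSA_HEADER.size])
    return {
        "age": age,
        "options": options,
        "type": OSPFV2_LSA_TYPES.get(ls_type, "unknown (%d)" % ls_type),
        "link_state_id": socket.inet_ntoa(ls_id),
        "advertising_router": socket.inet_ntoa(adv_router),
        "sequence": seq,  # read as unsigned here; RFC 2328 treats it as signed
        "checksum": checksum,
        "length": length,
    }
```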
Case Study One: WIDE Internet
• WIDE Internet
    • WIDE Project: http://www.wide.ad.jp
    • Connecting hundreds of organizations
• NARA-NOC
    • Located in Nara Institute of Science and Technology, Japan
    • The measurement machine is placed on one Ethernet segment of the NARA-NOC network
Measurement Result of WIDE Internet (OSPFv2)
[Figure: number of LSA changes and number of LSAs, by date (year/month)]
The Case of OSPFv3
[Figure: number of LSA changes and number of LSAs, by date (year/month)]
Other Findings during the Analysis
• Serious LSA oscillation sometimes occurred
    • Changes recur at intervals of 10 s to 200 s
    • Usually lasts for hours, sometimes for days
• Oscillation of router-LSA
    • Most of the observed oscillation was the repeated up/down of routers’ interfaces
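The oscillation pattern described above (changes recurring every 10 s to 200 s over a long period) can be detected with a simple pass over the change timestamps of one LSA. This is a hedged sketch and not the tool used in the study; the function name and the default thresholds, including the one-hour minimum duration, are assumptions chosen to match the figures on the slide.

```python
def find_oscillations(change_times, min_gap=10, max_gap=200, min_duration=3600):
    """Return (start, end) pairs, in seconds, for oscillation episodes.

    An episode is a run of consecutive changes whose spacing stays within
    [min_gap, max_gap] seconds and whose total length is at least
    min_duration seconds.
    """
    episodes = []
    run_start = None  # time of the first change in the current run
    prev = None       # time of the previous change
    for t in sorted(change_times):
        if prev is not None and min_gap <= t - prev <= max_gap:
            if run_start is None:
                run_start = prev
        else:
            # The run (if any) ended at the previous change.
            if run_start is not None and prev - run_start >= min_duration:
                episodes.append((run_start, prev))
            run_start = None
        prev = t
    # Close a run that is still open at the end of the data.
    if run_start is not None and prev - run_start >= min_duration:
        episodes.append((run_start, prev))
    return episodes
```

For example, one change per minute for two hours followed by a single isolated change yields one episode covering the two-hour run.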
The Causes of the Flaps
• The isolated causes
    • Congestion
        • DDoS attacks
    • Operational mistakes
        • Misconfiguration of router ID
    • Software/hardware bugs
        • Zebra routing daemon
        • Cisco’s OSPF bug
        • Foundry switch
• The causes of many flaps are still unknown
    • The flaps occur randomly
• Why have the flaps decreased in recent months?
    • A change in routing protocol implementation style
        • Special processing of routing messages
    • Bandwidth
Case Study Two: APAN Tokyo-XP
• APAN Tokyo-XP
    • Located in Otemachi, Tokyo
    • Seven routers in the backbone area
    • Data collected on a FreeBSD box connected to an Ethernet segment
Measurement Result of APAN Tokyo-XP (OSPFv2)
[Figure: number of LSA changes and number of LSAs, by date (year/month)]
Although most of the updates are due to router maintenance, some remain unexplained.
Conclusion
• Our investigation on WIDE Internet
    • OSPF LSA oscillation can at times occur frequently
    • Serious oscillation sometimes occurred
    • It is difficult to determine what caused the flaps
• Similar phenomena may be found on other networks, so it is important to deploy measurement systems on different networks
Future Work
• To do more measurement on other networks
    • Abilene of Internet2
• To improve our monitoring system
• To isolate the causes
    • When oscillation is detected, obtain helpful data for troubleshooting
If you would like to conduct a routing instability measurement on your own network, please contact Zhang Shu: zhang@koganei.wide.ad.jp
Thank you for your attention!