Network Problems and Tools Part 1 ITEC 370

George Vaughan, Franklin University

Sources for Slides
• Material in these slides comes primarily from the course text, Guide to Networking Essentials (Tomsho, Tittel, Johnson, 2007).
• Other sources are cited inline and listed in the References section.

TCP/IP and OSI Models

arp Utility
• arp displays (and can be used to modify) the IP-address-to-Ethernet-address translation tables used by the Address Resolution Protocol (ARP).
• The ARP cache is updated by the ARP protocol.
• Example:

    >arp -a

    Interface: 192.168.1.101 --- 0x2
      Internet Address      Physical Address      Type
      192.168.1.1           00-16-b6-21-71-d1     dynamic   (My router)
      192.168.1.10          00-0d-88-4b-8f-6d     dynamic   (My printer)
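The table printed by arp -a is easy to process in a script. The sketch below is only an illustration: it assumes the Windows output layout shown on the slide (dash-separated MAC addresses), and the function name parse_arp_table is made up for this example.

```python
import re

# Sample text copied from the slide's `arp -a` output (Windows layout).
SAMPLE = """\
Interface: 192.168.1.101 --- 0x2
  Internet Address      Physical Address      Type
  192.168.1.1           00-16-b6-21-71-d1     dynamic
  192.168.1.10          00-0d-88-4b-8f-6d     dynamic
"""

# One table row: IPv4 address, dash-separated MAC address, entry type.
ROW = re.compile(
    r"^[ \t]*(\d{1,3}(?:\.\d{1,3}){3})[ \t]+"
    r"((?:[0-9A-Fa-f]{2}-){5}[0-9A-Fa-f]{2})[ \t]+(\w+)",
    re.MULTILINE,
)


def parse_arp_table(text):
    """Return a dict mapping IP address -> (MAC address, entry type)."""
    return {ip: (mac.lower(), kind) for ip, mac, kind in ROW.findall(text)}


table = parse_arp_table(SAMPLE)
for ip, (mac, kind) in table.items():
    print(f"{ip:15} {mac} {kind}")
```

The "Interface:" line and the column header are skipped because neither starts with an IP address.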

traceroute Utility
• In Unix and Linux, the command is called traceroute.
• In Windows, the command is called tracert.
• Below is a description from the Unix man page for traceroute:
– The Internet is a large and complex aggregation of network hardware, connected together by gateways.
– Tracking the route one's packets follow (or finding the miscreant gateway that's discarding your packets) can be difficult.
– Traceroute utilizes the IP protocol `time to live' field and attempts to elicit an ICMP TIME_EXCEEDED response from each gateway along the path to some host.
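The time-to-live mechanism described above can be illustrated with a small simulation. This is a sketch, not a real traceroute: no packets are sent, the PATH list is a hypothetical route, and the reply name DESTINATION_REPLY stands in for whatever the final host actually returns.

```python
# Hypothetical route: three gateways followed by the destination host.
PATH = ["192.168.1.1", "10.73.0.1", "24.95.81.237", "64.233.167.147"]


def send_probe(path, ttl):
    """Simulate one probe sent with the given TTL.

    Each gateway decrements the TTL; the gateway that brings it to zero
    discards the probe and answers TIME_EXCEEDED. A probe that survives
    every gateway is answered by the destination itself.
    """
    last = len(path) - 1
    for index, hop in enumerate(path):
        if index == last:
            return hop, "DESTINATION_REPLY"
        ttl -= 1
        if ttl == 0:
            return hop, "TIME_EXCEEDED"
    raise ValueError("path must not be empty")


def trace(path, max_hops=30):
    """Probe with TTL = 1, 2, 3, ... until the destination answers."""
    route = []
    for ttl in range(1, max_hops + 1):
        hop, reply = send_probe(path, ttl)
        route.append((ttl, hop, reply))
        if reply == "DESTINATION_REPLY":
            break
    return route


for ttl, hop, reply in trace(PATH):
    print(f"{ttl:2}  {hop:16} {reply}")
```

Each increase in TTL reveals exactly one more gateway, which is why the real tool prints one line per hop.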

traceroute Utility (Cont.)

    C:\Documents and Settings\Compaq_Administrator>tracert www.google.com

    Tracing route to www.l.google.com [64.233.167.147]
    over a maximum of 30 hops:

      1     1 ms    <1 ms    <1 ms  …
      2     8 ms     7 ms     8 ms  10.73.0.1
      3     6 ms     …        9 ms  gig2-2.teleoh1-ybr1.columbus.rr.com [24.95.81.237]
      …
      9    26 ms    27 ms    28 ms  ae-21-56.car1.Chicago1.Level3.net [4.68.101.162]
     10    18 ms    24 ms    19 ms  GOOGLE-INC.car1.Chicago1.Level3.net [4.79.208.18]
     11    21 ms     …       22 ms  66.249.94.133
     12    21 ms    23 ms    28 ms  72.14.232.70
     13    19 ms    20 ms    21 ms  py-in-f147.google.com [64.233.167.147]

    Trace complete.

ping Utility
• From the ping man page:
– Ping uses the ICMP protocol's mandatory ECHO_REQUEST datagram to elicit an ICMP ECHO_RESPONSE from a host or gateway.
– ECHO_REQUEST datagrams (``pings'') have an IP and ICMP header, followed by a ``struct timeval'' and then an arbitrary number of ``pad'' bytes used to fill out the packet.
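To make the datagram layout concrete, here is a minimal sketch that builds the ICMP part of an ECHO_REQUEST (type 8, code 0) with the standard Internet checksum. It only constructs the bytes; actually sending them needs a raw socket and usually administrator privileges, which is left out. The identifier, sequence number, and the 32 zero pad bytes are arbitrary choices for illustration, and the timestamp mentioned in the man page is omitted.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold any carry bits back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes) -> bytes:
    """Return the ICMP header plus payload for an echo request."""
    # Header fields: type, code, checksum, identifier, sequence number.
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, ident, seq) + payload


packet = build_echo_request(ident=0x1234, seq=1, payload=b"\x00" * 32)
print(len(packet), packet[:8].hex())
```

A receiver can validate the packet by running the same checksum over all of its bytes; a correct packet yields zero.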

ping Utility (Cont.)
Example:

    C:\Documents and Settings\Compaq_Administrator>ping www.whitehouse.gov

    Pinging a1289.g.akamai.net [8.15.32.42] with 32 bytes of data:

    Reply from 8.15.32.42: bytes=32 time=35ms TTL=55
    Reply from 8.15.32.42: bytes=32 time=34ms TTL=55
    Reply from 8.15.32.42: bytes=32 time=35ms TTL=55

    Ping statistics for 8.15.32.42:
        Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
    Approximate round trip times in milli-seconds:
        Minimum = 34ms, Maximum = 35ms, Average = 34ms

nslookup Utility
• Knowing the host name of a remote server, nslookup can be used to look up its IP address.
• Example:

    $ nslookup www.yahoo.com
    Server:   dns-cac-lb-01.ohiordc.rr.com
    Address:  65.24.7.3          (address of my local DNS)

    Non-authoritative answer:
    Name:     www.yahoo-ht2.akadns.net
    Address:  69.147.114.210     (IP address of www.yahoo.com)
    Aliases:  www.yahoo.com
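A similar name-to-address lookup can be done from a script. The sketch below uses Python's standard socket.gethostbyname_ex, which asks the operating system's resolver rather than running nslookup itself, so the result may also reflect local sources such as a hosts file. The wrapper name lookup is an assumption for this example.

```python
import socket


def lookup(name):
    """Resolve a host name through the system resolver.

    Returns the canonical name, any aliases, and the IPv4 addresses,
    roughly the same three items nslookup prints.
    """
    canonical, aliases, addresses = socket.gethostbyname_ex(name)
    return {"name": canonical, "aliases": aliases, "addresses": addresses}


try:
    # "localhost" keeps the demo off the network; a name such as
    # www.yahoo.com works the same way when DNS is reachable.
    print(lookup("localhost"))
except OSError as err:
    print("lookup failed:", err)
```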

WHOIS
• http://ws.arin.net/cgi-bin/whois.pl

Solving Network Problems
• Network problem solving is divided into two major areas:
– Pre-emptive Troubleshooting
  • Preventing problems through planning and management.
– Troubleshooting
  • Controlling and repairing damage.

Develop a Backup Plan (Pre-emptive Troubleshooting)
• Identify backup strategies for different data types, such as applications, billing records, etc.
• Develop a backup schedule (full, partial, multi-level backups, etc.)
• Identify the backup team
• Test backups regularly
• Maintain a backup log
• Identify a backup storage strategy (on-site, off-site, etc.)

Determining Backup Needs (Pre-emptive Troubleshooting)
• Can you tolerate the loss of everything?
• Can you tolerate the loss of some filesystems or files? Which ones?
• How often does this critical data change?
• How long can you wait before it is restored?

Determining Backup Needs (Cont.) (Pre-emptive Troubleshooting)
• How old can the restored version be (hours, days, weeks)?
• How much can you afford to spend on a backup strategy?
• Does your system need to be available 24x7?

Backup Strategies (Pre-emptive Troubleshooting)
• Different strategies may be applied to different filesystems.
• Two types of backups:
– Full Backup
– Incremental Backup

Backup Strategies (Pre-emptive Troubleshooting)
• Full Backup:
– Back up everything
– Can take a long time
– Can consume a lot of backup media
– Simplest to restore from

Backup Strategies (Pre-emptive Troubleshooting)
• Incremental Backup:
– Back up only the files that changed since some point in time
– Faster backups
– Less consumption of backup media
– More complicated restore process
– Still need to do a full backup every once in a while

Multi-level backup (Pre-emptive Troubleshooting)
• A popular strategy: multi-level backup
– Level 0: Full backup.
– Level 1: Incremental backup since the last level 0 backup.
– Level 2: Incremental backup since the last level 1 backup.

Multi-level backup
Calendar for the month (backup level in parentheses):

    Sun     Mon     Tue     Wed     Thu     Fri     Sat
    1 (0)   2 (1)   3 (2)   4 (2)   5 (2)   6 (2)   7
    8       9 (1)   10 (2)  11 (2)  12 (2)  13 (2)  14
    15      16 (1)  17 (2)  18 (2)  19 (2)  20 (2)  21
    22      23 (1)  24 (2)  25 (2)  26 (2)  27 (2)  28

• Level 0: First Sunday of the month
• Level 1: Every Monday
• Level 2: Every Tuesday through Friday
If I accidentally deleted my directory on the 25th, which backups do I need?
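The restore question can be worked through in code. The sketch below rests on several assumptions: the schedule follows the three legend rules exactly, a level N backup holds everything changed since the most recent lower-level backup, and the backup for the day of the loss had not yet run. February 2015 is used only because it is a 28-day month whose 1st falls on a Sunday, like the calendar on the slide.

```python
import datetime


def backup_level(day):
    """Backup level scheduled on a date, or None if no backup runs."""
    weekday = day.weekday()  # Monday = 0 ... Sunday = 6
    if weekday == 6 and day.day <= 7:
        return 0  # first Sunday of the month: full backup
    if weekday == 0:
        return 1  # every Monday
    if 1 <= weekday <= 4:
        return 2  # Tuesday through Friday
    return None


def restore_set(loss_day):
    """Backups needed to restore data lost on loss_day, oldest first.

    Walk backward from the day before the loss and keep each backup whose
    level is lower than every level already collected, stopping at level 0.
    """
    needed = []
    lowest = float("inf")
    day = loss_day - datetime.timedelta(days=1)
    while True:
        level = backup_level(day)
        if level is not None and level < lowest:
            needed.append((day, level))
            lowest = level
            if level == 0:
                break
        day -= datetime.timedelta(days=1)
    return list(reversed(needed))


for day, level in restore_set(datetime.date(2015, 2, 25)):
    print(f"level {level} backup from {day:%a %d}")
```

Under these assumptions the restore applies the full backup first, then the latest level 1, then the latest level 2.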

Backup Verification (Pre-emptive Troubleshooting)
• You never know how good your backups are until you need to restore.
• You can’t wait until disaster hits only to find that your tape units were never working.
• Backups need to be checked and verified periodically:
– against original files
– on alternative machines
– for backup media degradation

Storage (Pre-emptive Troubleshooting)
• Where should you store your backups?
• Maybe store level 1 and level 2 in an alternate location?
• What about archived data?

Organization of Backups (Pre-emptive Troubleshooting)
• Labels
– Color coded
– Printed
• Dedicated shelf location
– By day of week?
– By week of month?
• 3rd-party software
– Stored separately?

Define Internal HW and SW Standards (Pre-emptive Troubleshooting)
• All network components should follow established standards:
– Desktop configuration (maybe several for different classes of users)
– HW manufacturers
– OS types and versions
– Networking protocols
• Standards should be reviewed and updated quarterly
– Will help simplify purchasing and config decisions
• Document and publish HW and SW standards

Establishing Upgrade Guidelines (Pre-emptive Troubleshooting)
• Disruptive updates should occur outside of business hours.
• Set up an isolated test system
– Server can be small
– Software should match the production system
• Consider a ‘pilot’ program with a select group of users.
• Always formulate a rollback plan in case things go wrong.

Maintain Documentation (Pre-emptive Troubleshooting)
• ALL DOCUMENTATION SHOULD BE UP TO DATE AND ON REMOVABLE MEDIA SUCH AS READ/WRITE DVD
• Network Address List (for all HW)
– MAC address, IP address, physical location
• Cable Map
– Cable type, wall-jack #, location, ports on patch panels and concentrators
• Contact List
– Network administrators, vendors
• Equipment List
– Vendor, SN, purchase date, warranty info

Maintain Documentation - Cont. (Pre-emptive Troubleshooting)
• Network History
– Log of major changes and problems
• Network Map
– Hardcopy of server and router config files and protocol info
• Policies and Procedures
– Operations and troubleshooting
• Server Configuration
– List of software and drivers, along with config info

Practicing Good Customer-Relation Skills (Pre-emptive Troubleshooting)
• Define guidelines for interacting with users.
• Users are a great source of information.
• Define questions to ask.

Network Monitoring Utilities (Pre-emptive Troubleshooting)
• Network monitoring utilities gather the following information:
– Events: alarms
– System usage statistics: who used what and when
– System performance statistics: throughput
• Network statistics can be used to:
– Identify bottlenecks
– Analyze trends
– Monitor events caused by network changes

ISO Pre-emptive Network Management (Pre-emptive Troubleshooting)
• Often implemented in software
• Account Management
– Record and report network usage
• Configuration Management
• Fault Management
– Detect and isolate network problems
• Performance Management
– Monitor and analyze network statistics
• Security Management

Network Baselines (Pre-emptive Troubleshooting)
• Network statistics should be collected regularly while the system is behaving normally
• Can be used to observe changes in network usage over time
• Can be used to observe changes in network behavior after a change in configuration
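One simple way to use a baseline is to summarize the normal readings and flag later readings that fall far outside them. The sketch below is an illustration only: the throughput samples are invented, and the three-standard-deviation threshold is a common rule of thumb rather than anything prescribed by the course text.

```python
import statistics

# Hypothetical throughput readings (Mbps) collected while the network was normal.
NORMAL_SAMPLES = [42, 45, 40, 44, 43, 41, 46]


def build_baseline(samples):
    """Summarize normal behavior as (mean, sample standard deviation)."""
    return statistics.mean(samples), statistics.stdev(samples)


def is_anomalous(value, baseline, threshold=3.0):
    """True when a reading is more than `threshold` deviations from the mean."""
    mean, stdev = baseline
    return abs(value - mean) > threshold * stdev


baseline = build_baseline(NORMAL_SAMPLES)
for reading in (44, 95):
    status = "outside baseline" if is_anomalous(reading, baseline) else "normal"
    print(reading, status)
```

Re-collecting the samples after a configuration change and comparing the two summaries shows whether the change shifted normal behavior.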

SNMP (Pre-emptive Troubleshooting)
• SNMP: Simple Network Management Protocol
– Part of the TCP/IP protocol suite
– Supported by many vendors
• SNMP-based network elements contain:
– MIB (Management Information Base): a tree-like database of config, performance, and fault info
– Software agent: supports remote queries to the MIB
• Allows for remote config and fault management
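The tree-like structure of a MIB can be pictured with a toy model. The sketch below is not an SNMP implementation and speaks no network protocol; the ToyAgent class and the stored values are hypothetical. The object identifiers used are the standard MIB-II ones for sysDescr (1.3.6.1.2.1.1.1) and sysName (1.3.6.1.2.1.1.5), each with instance suffix .0.

```python
class ToyAgent:
    """Toy stand-in for an SNMP agent: a nested-dict tree keyed by OID parts."""

    def __init__(self):
        self._tree = {}

    def set(self, oid, value):
        """Store a value at the leaf named by a dotted OID string."""
        parts = [int(p) for p in oid.split(".")]
        node = self._tree
        for part in parts[:-1]:
            node = node.setdefault(part, {})  # create branches as needed
        node[parts[-1]] = value

    def get(self, oid):
        """Return the leaf value for an OID, or None if there is no such leaf."""
        node = self._tree
        for part in (int(p) for p in oid.split(".")):
            if not isinstance(node, dict) or part not in node:
                return None
            node = node[part]
        return None if isinstance(node, dict) else node


agent = ToyAgent()
agent.set("1.3.6.1.2.1.1.1.0", "Example 24-port switch")  # sysDescr.0 (made-up value)
agent.set("1.3.6.1.2.1.1.5.0", "core-switch-1")           # sysName.0 (made-up value)

print(agent.get("1.3.6.1.2.1.1.5.0"))
```

A real management station would send the same OID to the device's agent over the network and receive the value in the reply.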

RMON (Pre-emptive Troubleshooting)
• RMON = Remote Monitoring
• Extends SNMP
• Supports multiple MIB types
• RMON1: designed to monitor OSI Layers 1 and 2
• RMON2: designed to monitor OSI Layers 3 and higher

Approaches to Network Troubleshooting (Troubleshooting)
• Different problems require different approaches
– Trial and error
– Sometimes you can use a similar system as a working model, or you might have to buckle down and research the problem thoroughly
• In this section, you learn about different methods and the circumstances in which some methods work and others do not

Trial and Error (Troubleshooting)
• Can be used under the following conditions:
– The system is newly configured (no data can be lost)
– The system is not attached to a live network
– You can easily undo changes
– Other approaches would take considerably more time than a few trial-and-error attempts
– There are few possible causes of the problem (helps you make a good educated guess at the solution)
– No documentation or other resources are available to draw on to arrive at a solution more scientifically

Trial and Error (continued)
• If you determine that trial and error is the right approach for your problem, you should follow some guidelines:
– Make one change at a time before testing the results
– Avoid making changes that might affect the operation of a live network
– Document the original settings of HW and SW before making changes
– Avoid making a change that can destroy user data unless a known good backup exists
– If possible, avoid making changes you can’t undo

Solve by Example
• Solving by example: the process of comparing something that doesn’t work with something that does, and then making modifications to the nonfunctioning item until it performs like the model
– Easy and fast way to solve a problem; requires no special knowledge or problem-solving skills
– General rules to follow:
  • Use only when the working sample has an environment similar to that of the problem machine
  • Don’t make configuration changes that cause conflicts
  • Don’t make changes that could destroy data that cannot be restored

The Replacement Method
• Favorite among PC technicians
• Follow these rules:
1. Narrow the list of potentially defective parts down to one or two possibilities
2. Make sure you have the correct replacement part
3. Replace only one part at a time
4. If your first replacement doesn’t fix the problem, reinstall the original part before replacing another part
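Rules 3 and 4 amount to a small loop, sketched below. Everything here is hypothetical: the part names, the dictionary representation of a machine, and the system_works check (which in practice is a person testing the system, not a function call).

```python
def replacement_method(installed, spares, suspects, system_works):
    """Swap suspect parts one at a time until the system works.

    installed    -- dict of slot -> part currently in the machine
    spares       -- dict of slot -> known-good replacement part
    suspects     -- short list of slots believed to be defective (rule 1)
    system_works -- callable that tests the machine in its current state
    Returns the slot whose replacement fixed the problem, or None.
    """
    for slot in suspects:
        original = installed[slot]
        installed[slot] = spares[slot]  # rule 3: replace only one part
        if system_works(installed):
            return slot
        installed[slot] = original      # rule 4: put the original back
    return None


# Hypothetical machine in which the patch cable is the defective part.
machine = {"nic": "nic-old", "cable": "cable-old"}
spares = {"nic": "nic-new", "cable": "cable-new"}


def works(parts):
    return parts["cable"] != "cable-old"


fixed = replacement_method(machine, spares, ["nic", "cable"], works)
print(fixed, machine)
```

Because the original NIC is reinstalled before the cable is tried, the machine ends up with exactly one changed part, which makes the faulty component unambiguous.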

Step by Step with the OSI Model

The Problem-Solving Process
• General framework for approaching problems:
1. Determine the problem definition and scope
2. Gather information
3. Consider possible causes
4. Devise a solution
5. Implement the solution
6. Test the solution
7. Document the solution
8. Devise preventive measures

The Problem-Solving Process (continued)

Step 1: Identify Problem Definition and Scope (Problem-Solving Process)
• Understand the scope of the problem:
– Is anyone else having the same problem?
– What about other areas of the building?
– Is the problem occurring with all applications or just one?
– Does the problem occur on different computers?
• The questions above may reveal multiple problems.
• Solve only one problem at a time.

Step 2: Gather Information (Problem-Solving Process)
• Most of the initial information about a problem comes from users
• Know what questions to ask:
– Did it ever work?
– When did it stop working?
– Has anything changed?
– Never ignore the obvious
– Define how it’s supposed to work

Step 3: Consider Possible Causes (Problem-Solving Process)
• From the symptoms and other information gathered, consider what could be the cause of the problem
• Create a checklist of possible things that could cause the problem
• This step will probably reveal more information

Step 4: Devise a Solution (Problem-Solving Process)
• Before devising a solution, consider the following:
– Is the identified cause of the problem truly the cause, or is it just another symptom of the true cause?
– Is there a way to adequately test the proposed solution?
– What results should the proposed solution produce?
– What are the ramifications of the proposed solution for the rest of the network?
– Do you need additional help to answer some of these questions?

Step 4: Devise a Solution (continued) (Problem-Solving Process)
• Before implementing the solution, prepare for the possibility that the solution could make things worse
• Depending on the scope of the problem and solution, you might need to do the following:
– Save all network device configuration files
– Document and back up workstation configurations
– Document wiring closet configurations, including device locations and patch cable connections
– Conduct a final baseline to compare new and old results if a rollback becomes necessary

Step 5: Implement the Solution (Problem-Solving Process)
• Design the implementation so that you can stop and test it at critical points
• Inform users of your intentions
– Give your users time to schedule around network downtime
• Put the plan into action
– Take notes about every change you make

Step 6: Test the Solution (Problem-Solving Process)
• It’s 3:00 a.m. and you’re finished with the upgrade. Time to go home, right?
– Wrong. It’s time to test your implementation as a whole
• Testing should emulate a real-world situation
• Test end-to-end connectivity
• Put some stress on the network
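An end-to-end check can be as simple as confirming that a TCP connection to a service opens. The sketch below is one possible approach, not a complete test plan: the helper name can_connect and the 2-second timeout are assumptions, and the demo uses a throwaway listener on the local machine so that it runs without any real servers.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Demo against a temporary local listener; port 0 lets the OS pick a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

up = can_connect("127.0.0.1", port)    # service is listening
listener.close()
down = can_connect("127.0.0.1", port)  # service has gone away

print("while listening:", up, "after close:", down)
```

In a real test, the host and port would be the servers and services users depend on, checked from the users' own network segments.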

Step 7: Document the Solution (Problem-Solving Process)
• Put notes from implementation and testing into a cohesive document
• Documentation should include everything pertinent to the problem, such as:
– Problem definition
– Solution
– Implementation
– Testing

Step 8: Devise Preventive Measures (Problem-Solving Process)
• Understand the root cause of the problem (Root Cause Analysis).
• Develop procedures that prevent the root cause, or at least result in its early detection.

References
Tomsho, Tittel, Johnson (2007). Guide to Networking Essentials. Boston: Thomson Course Technology.
Odom, Knott (2006). Networking Basics: CCNA 1 Companion Guide. Indianapolis: Cisco Press.
Wikipedia (n.d.). OSI Model. Retrieved 09/12/2006 from http://en.wikipedia.org/wiki/OSI_Model