Keeping Network Monitoring Current using Automated Nagios Configurations

Keeping Network Monitoring Current using Automated Nagios Configurations (WIP)
Greg Wickham, APAN, July 2005

Is the network being monitored correctly?

Contents
  • Background
  • Monitoring Overview / Requirements
  • Solution Architecture
  • Monitoring Verification
  • Conclusion

GrangeNet Architecture

GrangeNet Monitoring

  Device Type   Quantity   Probes
  Routers              6      310
  Servers              6        6
  Switches             4        7
  Total               16      323

  Nagios lines (services.cfg): 3172

GrangeNet Monitoring (ACT Edge)

  Probe Type       Quantity   Notes
  Fan                     3
  Hardware               17
  Ping                    1
  Power                   2
  Temperature             1
  Interfaces             16
  MSDP Peerings           8
  BGP Peerings           15
  OSPF                    2
  Total Probes:          65   (39)

GrangeNet Monitoring (ACT Edge)
  • Is that everything that can be monitored? No!
  • What else?
    – BGP address family peerings
      • Multicast / Unicast / IPv6
    – Software versions
    – Hardware versions
    – Latency (of links)
    – Usage (of links)
    – …

Monitoring Solution
  • Solution Goals:
    – Verify that the network is correctly monitored
    – Minimise replication of data
    – Simple integration with existing systems
    – Easy to maintain
    – Extensible
    – Flexible
    – Efficient

Monitoring Overview
  • Facts:
    – Networks change
    – Updating is tedious
    – Monitoring is difficult to audit
  • Answers Required:
    – Is the network performing optimally?
    – Has a change occurred?
    – What is the status of the network?
    – Is the monitoring accurate?

Solution Architecture: Monitoring Configuration
  • Configuration data stored as XML
  • Describes:
    – Devices to monitor
    – How to monitor
    – Nagios templates
    – Device templates

Solution Architecture: Monitoring Daemon
  • Daemon reads configuration data
  • Verifies devices are monitored correctly
  • Generates Nagios configurations
  • Performs device probes
  • Runs periodically

Solution Architecture: Nagios Configuration
  • The Nagios configuration is generated automatically by the monitoring daemon

Solution Architecture: Nagios Daemon
  • Nagios uses the configuration supplied by the monitoring daemon
  • Nagios is configured to use 'passive' checks

Solution Architecture: Network Devices
  • The monitoring daemon queries all devices using SNMP
  • Device telemetry is checked against known configurations

Solution Architecture
  • The monitoring daemon sends probe status directly to Nagios (Nagios running passive checks)
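
The slides do not say how probe status reaches Nagios. One common route for passive results is the Nagios external command file, which accepts lines of the form `[timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output`. The sketch below assumes that route. It is written in Python for illustration (the talk lists Perl as the implementation language), and the function names are hypothetical. The path of the command file depends on the installation and is set by `command_file` in nagios.cfg.

```python
import time

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def format_passive_result(host, service, code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    timestamp = int(time.time() if now is None else now)
    # The command must fit on one line, so fold newlines in the output.
    clean = output.replace("\n", " ")
    return f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{code};{clean}\n"


def submit(command_file, line):
    """Append a command line to the Nagios external command file."""
    with open(command_file, "a") as pipe:
        pipe.write(line)
```

The host and service names in each line must match the `host_name` and `service_description` of a service Nagios already knows about, which is one reason to generate both the configuration and the results from the same source data.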

Solution Architecture
  • Nagios reports on network health as usual but does no active checking of its own
  • Outputs shown in the diagram: e-mail, SMS, web

Solution Architecture: Report
  • A device monitoring comparison report is generated

Solution Architecture: RRDtool
  • Collected data is fed to optional sub-systems (RRDtool is the one shown in the diagram)

Solution Architecture
  • Result
    – Only one process communicates with all devices
      • Very efficient: query time for 34 devices is < 10 seconds
    – As only one daemon communicates with the devices, the load on each network device is minimised (collected data is distributed as necessary)
    – As Nagios does less work, the monitoring server is less loaded (Nagios is heavy)
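
The "query once, distribute as necessary" idea can be sketched as a small collector that polls each device a single time per cycle and hands the same sample to every interested sub-system. This is an illustrative Python outline, not the daemon described in the talk. The `Collector` class, the `poll` callable (a stand-in for the SNMP query), and the consumer callbacks are all hypothetical.

```python
from typing import Callable, Dict, List

Sample = Dict[str, str]                    # field name -> value from one device
Consumer = Callable[[str, Sample], None]   # receives (device, sample)


class Collector:
    """Polls each device once per cycle and shares the result."""

    def __init__(self, poll: Callable[[str], Sample]):
        self.poll = poll                   # hypothetical stand-in for the SNMP query
        self.consumers: List[Consumer] = []

    def subscribe(self, consumer: Consumer) -> None:
        """Register a sub-system that wants a copy of every sample."""
        self.consumers.append(consumer)

    def run_cycle(self, devices: List[str]) -> int:
        """Query every device exactly once; return the number of polls made."""
        polls = 0
        for device in devices:
            sample = self.poll(device)     # the only contact with the device
            polls += 1
            for consumer in self.consumers:
                consumer(device, sample)   # fan out the same data
        return polls
```

With this shape, adding another consumer (a grapher, a report writer) increases work on the monitoring server only; the number of queries each network device has to answer stays the same.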

Monitoring Verification
  • Templates are used to define prerequisite monitoring probes
  • Devices are attached to templates

Monitoring Verification: Device Description

    <device>
      <alias>edge1.vic</alias>
      <address>202.0.98.68</address>
      …
      <module type="nagios">
        <template>ibgp-mesh</template>
        <template>ebgp-peerings</template>
        <template>ospf</template>
        <template>system</template>
        …
        <probe type="ibgp-mesh" description="AS 18062 - edge1.nsw" arg="202.0.98.13" />
        <probe type="ebgp-peering" description="AS 64670" arg="202.0.98.190" />
        …
      </module>
    </device>
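
To show how a device description of this shape could drive Nagios configuration generation, here is a minimal Python sketch (the talk's own tooling is Perl). It uses a trimmed, well-formed copy of the device XML, with the elided parts left out. The function names and the Nagios template name `passive-service` are assumptions made for the example. The emitted blocks are fragments: any other directives Nagios requires are assumed to be inherited from that template.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the device description; elided parts are omitted.
DEVICE_XML = """
<device>
  <alias>edge1.vic</alias>
  <address>202.0.98.68</address>
  <module type="nagios">
    <template>ibgp-mesh</template>
    <probe type="ibgp-mesh" description="AS 18062 - edge1.nsw" arg="202.0.98.13" />
    <probe type="ebgp-peering" description="AS 64670" arg="202.0.98.190" />
  </module>
</device>
"""


def service_block(directives):
    """Format (name, value) pairs as one Nagios service definition."""
    lines = [f"    {key:<24}{value}" for key, value in directives]
    return "define service{\n" + "\n".join(lines) + "\n}\n"


def services_for(device_xml, nagios_template="passive-service"):
    """Return one passive-check service definition per probe on a device."""
    device = ET.fromstring(device_xml)
    host = device.findtext("alias")
    blocks = []
    for probe in device.iterfind("module[@type='nagios']/probe"):
        description = f"{probe.get('type')} {probe.get('description')}"
        blocks.append(service_block([
            ("use", nagios_template),          # hypothetical Nagios template
            ("host_name", host),
            ("service_description", description),
            ("active_checks_enabled", "0"),    # Nagios does not poll
            ("passive_checks_enabled", "1"),   # results are submitted to it
        ]))
    return blocks
```

Calling `"".join(services_for(DEVICE_XML))` yields text that could be written to a file such as services.cfg, so the hand-maintained Nagios lines are replaced by output derived from the XML.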

Monitoring Verification: Template Description

    <template name="ibgp-mesh">
      <template>system-health</template>
      <probe name="ibgp-mesh" inherit="bgp-standard">
        <attribute type="field">bgpPeerState</attribute>
        <attribute type="notify">gn-noc</attribute>
        <attribute type="level">level1-service</attribute>
        <match>
          <field name="bgpPeerRemoteAs" value="18062" />
          <field name="bgpPeerState" value="up" />
        </match>
      </probe>
    </template>
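
The slides do not spell out how the `<match>` element is evaluated. A plausible reading, assumed here, is that a probe is satisfied when one row of collected data carries every listed field with the expected value. The Python sketch below follows that reading. The function names are hypothetical, the field names are written as the standard BGP4-MIB object names (`bgpPeerRemoteAs`, `bgpPeerState`), and the value `up` is taken from the slide as-is.

```python
import xml.etree.ElementTree as ET

# Reduced copy of the template, keeping only the parts used for matching.
TEMPLATE_XML = """
<template name="ibgp-mesh">
  <probe name="ibgp-mesh">
    <match>
      <field name="bgpPeerRemoteAs" value="18062" />
      <field name="bgpPeerState" value="up" />
    </match>
  </probe>
</template>
"""


def match_criteria(template_xml, probe_name):
    """Return {field name: expected value} for one probe in a template."""
    template = ET.fromstring(template_xml)
    for probe in template.iterfind("probe"):
        if probe.get("name") == probe_name:
            return {f.get("name"): f.get("value")
                    for f in probe.iterfind("match/field")}
    raise KeyError(probe_name)


def row_matches(row, criteria):
    """True when the row has every expected field with the expected value."""
    return all(row.get(name) == value for name, value in criteria.items())
```

A row such as `{"bgpPeerRemoteAs": "18062", "bgpPeerState": "down"}` fails the match, which under this assumption is the condition that would be reported to Nagios as a problem.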

Monitoring Verification
  • From the:
    – Device template; and
    – Monitoring template
    an accurate report can be generated of the status of monitoring.
  • All probe details are stored in XML, so they can be easily verified
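
One way to picture the verification report: gather the probes that a device's templates require (including templates nested inside other templates) and list any that the device does not define. The Python sketch below works on plain dictionaries rather than the XML to stay short. The data layout, the function names, and the probe names given for `system-health` are assumptions made for the example.

```python
def required_probes(template, templates, seen=None):
    """Collect probe names required by a template and the templates it includes."""
    seen = set() if seen is None else seen
    if template in seen:               # guard against circular includes
        return set()
    seen.add(template)
    entry = templates[template]
    needed = set(entry["probes"])
    for parent in entry["includes"]:
        needed |= required_probes(parent, templates, seen)
    return needed


def audit(device, templates):
    """Return the sorted probe types a device's templates require but it lacks."""
    needed = set()
    for name in device["templates"]:
        needed |= required_probes(name, templates)
    return sorted(needed - set(device["probes"]))


# Hypothetical data: the probes listed for system-health are invented.
TEMPLATES = {
    "ibgp-mesh": {"includes": ["system-health"], "probes": ["ibgp-mesh"]},
    "system-health": {"includes": [], "probes": ["fan", "temperature"]},
}
DEVICE = {"alias": "edge1.vic", "templates": ["ibgp-mesh"],
          "probes": ["ibgp-mesh", "fan"]}
```

Here `audit(DEVICE, TEMPLATES)` reports `temperature` as the one required probe the device is missing; an empty list means the device is monitored as its templates demand.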

Conclusion
  • Due to efficiencies in the monitoring daemon:
    – Nagios doesn't load the server
    – Other applications can share the SNMP data
    – The network devices are not loaded
    – Device probing is very quick
  • Reduces the complexity of the Nagios configuration
  • Generates reports identifying inaccuracies in existing monitoring
  • Unified configuration data

This is a Work in Progress

Status (Work in Progress)
  • Current functionality:
    – Separate applications:
      • Collecting data from devices and feeding it into Nagios
      • Generating Nagios configurations
  • To Do
    – Integrate applications
    – Complete implementation of Nagios templates
    – Documentation!
  • Software
    – Perl
    – net-snmp
    – Nagios