Use of Nagios in Central European ROC
- Slides: 38
Use of Nagios in Central European ROC
Emir Imamagic
University Computing Centre (SRCE), Croatia
Grid Monitoring WG core group meeting / Use of Nagios in Central European ROC region

Overview
- Motivation
- Nagios
- Grid monitoring with Nagios
  - Sensors
  - Configuration management
  - GOCDB integration
- Demo slides
- Future work

Motivation
- Achieve better availability
  - get notifications as soon as a problem appears
- Simplify maintenance of grid resources
- Complex sensor dependencies enable isolating the problem
  - only relevant notifications are issued
- Report generation
  - availability, problem history
- Visualization & management interface

Nagios

Nagios
- Open source monitoring system
- Widely used & actively developed
- Host and service problem detection and recovery
- Provides a set of basic plugins (sensors)
  - easy to develop custom sensors
- No components required on monitored entities
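
To illustrate why custom sensors are easy to develop: a Nagios plugin is any executable that prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The sketch below is a minimal TCP connect test written under that contract. It is not one of the sensors from this talk; the function names and the message wording are assumptions made for the example.

```python
import socket

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_tcp_port(host, port, timeout=5.0):
    """Try to open a TCP connection; return (exit_code, status_line)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"TCP OK - {host}:{port} accepts connections"
    except OSError as exc:  # covers refused connections and timeouts
        return CRITICAL, f"TCP CRITICAL - {host}:{port}: {exc}"


def main(argv):
    """Parse 'plugin HOST PORT', print the status line, return the exit code."""
    if len(argv) != 3 or not argv[2].isdigit():
        print("TCP UNKNOWN - usage: check_tcp_sketch HOST PORT")
        return UNKNOWN
    code, line = check_tcp_port(argv[1], int(argv[2]))
    print(line)
    return code

# A real plugin script would end with: sys.exit(main(sys.argv))
```

Nagios only looks at the exit code and the printed line, so the same skeleton can wrap any grid client command.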

Objects
- Host
  - physical server, workstation
  - network device (e.g. switch, router)
  - other devices connected to the network
- Service
  - service running on a host
  - metric associated with the host
- A service must be associated with a host
- Objects can be aggregated in groups

Sensor Execution
- Per-object sensor arguments
  - adaptive monitoring
  - e.g. timeout
- Per-object checking interval
  - each sensor has an individual check interval
  - normal vs. problem check interval
- Per-object number of rechecks
  - determines state type
- Advanced check scheduling
  - avoiding server overload
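
The per-object settings and the idea of spreading checks out can be sketched as follows. This is a simplified illustration, not Nagios' actual scheduler: the class, the function names and the default values (15 min normal interval, 1 min problem interval, 30 s timeout) are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class CheckSettings:
    """Hypothetical per-object sensor settings."""
    normal_interval_s: int = 900   # used while the object is OK
    problem_interval_s: int = 60   # used while a problem is being rechecked
    timeout_s: int = 30            # passed to the sensor as an argument


def next_check_delay(settings, last_check_ok):
    """Pick the normal or the problem interval for the next check."""
    if last_check_ok:
        return settings.normal_interval_s
    return settings.problem_interval_s


def spread_initial_checks(services, interval_s):
    """Give every service an evenly spaced start offset within one interval,
    so that all sensors do not fire at the same moment."""
    if not services:
        return {}
    step = interval_s / len(services)
    return {name: round(i * step, 3) for i, name in enumerate(services)}
```

With four services and a 900 s interval, the offsets come out as 0, 225, 450 and 675 seconds.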

Notifications
- Per-object configuration
  - list of contacts
  - notification period, states & repeat interval
  - used for authorization
- Contact configuration
  - name and alias
  - host and service notification period, states & mechanism
  - email address
  - pager number
- Notification escalations
  - if the problem does not get solved, notifications escalate to the next contact levels
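
A minimal sketch of how escalation can be modelled: each escalation covers a range of notification numbers, and when one or more ranges match, their contacts are used instead of the object's default contacts. The field names mirror the Nagios escalation directives `first_notification` and `last_notification` (where 0 means no upper bound); the contact names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Escalation:
    first_notification: int
    last_notification: int  # 0 means "no upper bound"
    contacts: tuple


def contacts_for(notification_number, default_contacts, escalations):
    """Return the contacts for the n-th notification of an unsolved problem."""
    matched = []
    for esc in escalations:
        upper_ok = (esc.last_notification == 0
                    or notification_number <= esc.last_notification)
        if notification_number >= esc.first_notification and upper_ok:
            # Merge contacts of all matching escalations, without duplicates.
            matched.extend(c for c in esc.contacts if c not in matched)
    # No matching escalation: fall back to the object's own contact list.
    return matched or list(default_contacts)


ESCALATIONS = [
    Escalation(3, 5, ("site-admin", "roc-support")),
    Escalation(6, 0, ("roc-manager",)),
]
```

Here notifications 1 and 2 go to the default contacts, 3 to 5 also reach `roc-support`, and from the 6th on only `roc-manager` is notified.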

States
- Host states
  - Up, Down, Unreachable
- Service states
  - Ok, Warning, Unknown, Critical
- State types
  - soft
    - object has not been rechecked the specified number of times
  - hard
    - object has been rechecked the specified number of times
    - object recovers from problem state
    - causes notifications & event handlers
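
The soft/hard distinction can be sketched as a small state machine. This is a simplified model of the behaviour listed on the slide, not Nagios' implementation: the class name is hypothetical, `max_check_attempts` stands for the per-object number of rechecks, and recovery from a soft problem is treated as silent.

```python
class ServiceStateTracker:
    """Tracks one service and tells when a notification is due."""

    def __init__(self, max_check_attempts=3):
        self.max_check_attempts = max_check_attempts
        self.status = "OK"
        self.state_type = "HARD"
        self.attempt = 1

    def process(self, status):
        """Feed one check result; return True if a notification should be sent."""
        previous, was_hard = self.status, self.state_type == "HARD"
        if status == "OK":
            self.status, self.state_type, self.attempt = "OK", "HARD", 1
            # Only a recovery from a hard problem is worth a notification.
            return previous != "OK" and was_hard
        if previous == "OK":
            self.attempt = 1          # first failed check
        elif not was_hard:
            self.attempt += 1         # recheck while still in soft state
        self.status = status
        if was_hard and previous != "OK":
            # Already a hard problem: notify only if the status changed.
            return status != previous
        if self.attempt >= self.max_check_attempts:
            self.state_type = "HARD"  # rechecked the specified number of times
            return True
        self.state_type = "SOFT"
        return False
```

With `max_check_attempts=3`, three consecutive CRITICAL results produce a single notification on the third one, and the following OK result produces the recovery notification.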

Object Hierarchy
- Implicit dependency
  - service depends on associated host
- Host hierarchy
  - if parent is not OK, don't send notifications for children (hosts and services)
  - Unreachable state
  - e.g. router is parent for all hosts on a specific site
- Service dependency
  - defines in which cases checks & notifications are performed
  - one host/service can be dependent on multiple hosts/services
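
A sketch of how the host hierarchy separates Down from Unreachable: a host that does not respond is Down if at least one of its parents is Up (or if it has no parents), and Unreachable if all of its parents are themselves Down or Unreachable. The function names and hostnames are hypothetical, and the parent graph is assumed to be acyclic.

```python
def classify_hosts(parents, responding):
    """Map every host to UP, DOWN or UNREACHABLE.

    parents:    dict host -> list of parent hosts (every host is a key)
    responding: set of hosts whose own check succeeded
    """
    states = {}

    def resolve(host):
        if host in states:
            return states[host]
        if host in responding:
            state = "UP"
        else:
            host_parents = parents.get(host, [])
            if host_parents and all(resolve(p) != "UP" for p in host_parents):
                state = "UNREACHABLE"  # the problem lies upstream
            else:
                state = "DOWN"         # reachable in principle, but not answering
        states[host] = state
        return state

    for host in parents:
        resolve(host)
    return states


def hosts_to_notify(states):
    """Only DOWN hosts produce notifications; UNREACHABLE ones are suppressed."""
    return sorted(h for h, s in states.items() if s == "DOWN")
```

If the site router fails, only the router is reported; the hosts behind it become Unreachable and stay silent.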

Dynamic Operations
- Modifying monitoring & notification behavior
  - acknowledging problems
  - enabling/disabling notifications
  - enabling/disabling active checks
- Executing sensors
  - individual service
  - all services on single host
- Scheduling downtimes
- Achieved via web interface or pipeline
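
The "pipeline" here presumably refers to Nagios' external command file, a named pipe that accepts lines of the form `[timestamp] COMMAND;arg1;arg2;...`. The sketch below formats three such commands. The helper names, hostnames and the command file path are assumptions; the path in particular depends on the installation.

```python
import time


def external_command(name, *args, timestamp=None):
    """Format one line for the Nagios external command file."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"[{ts}] {name};" + ";".join(str(a) for a in args) + "\n"


def force_service_check(host, service, when, timestamp=None):
    """Execute a sensor for an individual service at the given time."""
    return external_command("SCHEDULE_FORCED_SVC_CHECK", host, service, when,
                            timestamp=timestamp)


def disable_service_notifications(host, service, timestamp=None):
    return external_command("DISABLE_SVC_NOTIFICATIONS", host, service,
                            timestamp=timestamp)


def schedule_host_downtime(host, start, end, author, comment, timestamp=None):
    """Fixed downtime (fixed=1), not triggered by another downtime (trigger_id=0)."""
    return external_command("SCHEDULE_HOST_DOWNTIME", host, start, end,
                            1, 0, end - start, author, comment,
                            timestamp=timestamp)


def submit(command_line, command_file):
    """Write the command to the pipe; command_file is installation-specific."""
    with open(command_file, "w") as pipe:
        pipe.write(command_line)
```

For example, `force_service_check("ce.example.org", "Gatekeeper ping", 1000, timestamp=1000)` yields the line `[1000] SCHEDULE_FORCED_SVC_CHECK;ce.example.org;Gatekeeper ping;1000`.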

Web Interface
- Viewing current information, history and reports
- Performing dynamic operations
- Generating reports
  - availability, problem trends & history
- Supports authentication & authorization (AA)
  - per host/service authorization

Other Features
- Event handling
  - enables automatic failure recovery
- Active vs. passive checks
  - active – controlled by Nagios
  - passive – submitted by other systems or another Nagios instance
- Distributed deployment
  - multiple Nagios servers
  - individual instance submits results as passive checks to central
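
A passive result is handed to Nagios as a `PROCESS_SERVICE_CHECK_RESULT` external command carrying the host, the service, the numeric return code and the sensor output. The sketch below formats such a line, as a remote instance or another monitoring system might before forwarding it to the central server. The function name and hostnames are hypothetical, and the transport to the central server is left out.

```python
import time

STATE_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def passive_result_line(host, service, state, output, timestamp=None):
    """Format one passive service check result for the central Nagios."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    # The command is line-based, so only the first line of output is kept.
    lines = output.splitlines()
    first_line = lines[0] if lines else ""
    return (f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{STATE_CODES[state]};{first_line}\n")
```

The host and service named in the line must already be defined on the central server, since passive results are matched against existing objects.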

Grid monitoring with Nagios

History
- CRO-GRID Infrastructure
  - since mid 2005
  - covers several grid middleware (Globus Toolkit Pre-WS & WS, UNICORE, NWS, etc.)
  - event handlers for automatic recovery
- Monitoring Central European (CE) core services
  - since mid 2006
- Monitoring all CE sites for 1st line support
  - since September 2006
- Also used for certification
  - with forced check

Deployment
- Centralized deployment
- Single Nagios server deployed @ SRCE
  - URL: http://cs-egee.srce.hr/nagios
- Monitoring statistics
  - 65 hosts
  - 480 services
- Nagios server statistics (last month)

Supported Node Types

| Node type | Number of services |
|---|---|
| BDII | 1 |
| CE | 8 |
| LFC | 2 |
| MON | 3 |
| PROX | 2 |
| RB | 7 |
| SE | 9 |
| VOMS | 4 |
| WMS | 7 |

Nagios Basic Sensors

| Sensor | Description | Used interval |
|---|---|---|
| check_ftp | checks FTP server; used for GridFTP ping | 15 min |
| check_http | checks HTTP server; used for checking Tomcat on MON and VOMS | 15 min |
| check_ldap | checks LDAP server for defined base DN; used for checking BDII, Globus MDS and GridICE | 15 min |
| check_tcp | checks defined TCP port; used for DPNS ping | 15 min |

Developed Sensors

| Sensor | Description | Used interval |
|---|---|---|
| CA distribution | checks CA distribution version | 1 day |
| Certificate lifetime | uses GridFTP or HTTPS to fetch server certificate & verifies lifetime | 1 day |
| DPNS | lists /dpm directory and looks for the remote server's domain | 1 hour |
| EDG Broker | submits a test job, waits for the job to finish, fetches and verifies the output | 1 hour |
| Gatekeeper ping | performs authorization only | 15 min |
| Gatekeeper hostname | executes hostname and verifies the output | 1 hour |
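
The lifetime verification step of the certificate sensor can be sketched as a pure function that maps the remaining validity to a Nagios state. Fetching the certificate over GridFTP or HTTPS is left out, and the warning and critical thresholds (30 and 7 days) are assumptions, not the values used in the CE deployment.

```python
from datetime import datetime, timedelta, timezone


def certificate_state(not_after, now, warn_days=30, crit_days=7):
    """Return (exit_code, status_line) for a certificate expiring at not_after."""
    remaining = not_after - now
    if remaining <= timedelta(0):
        return 2, "CERT CRITICAL - certificate has expired"
    if remaining < timedelta(days=crit_days):
        return 2, f"CERT CRITICAL - {remaining.days} full days left"
    if remaining < timedelta(days=warn_days):
        return 1, f"CERT WARNING - {remaining.days} full days left"
    return 0, f"CERT OK - {remaining.days} full days left"
```

A daily interval, as listed in the table, is sufficient for this kind of check because the result changes slowly.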

Developed Sensors

| Sensor | Description | Used interval |
|---|---|---|
| Gatekeeper LRMS | executes command through LRMS | 2 hours |
| GridFTP | transfers file to remote computer and back and compares copies | 1 hour |
| LFC | lists /grid directory | 15 min |
| Match list | CE: matches CE against multiple RBs; RB: compares number of matches with data from BDII | 1 hour |
| MyProxy | creates proxy certificate, gets the proxy info and destroys it | 15 min |
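
The "transfer there and back, then compare" pattern used by the GridFTP sensor can be sketched independently of the transfer tool. In the sketch below, `upload` and `download` are caller-supplied callables standing in for the real client commands; they, the payload and the status messages are assumptions made for the example.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def round_trip_check(upload, download, payload=b"nagios round-trip probe\n"):
    """Send a file, fetch it back and compare both copies.

    upload(local_path)   copies the local file to the remote server
    download(local_path) copies the remote file back to local_path
    """
    with tempfile.TemporaryDirectory() as tmp:
        sent = os.path.join(tmp, "sent")
        received = os.path.join(tmp, "received")
        with open(sent, "wb") as handle:
            handle.write(payload)
        try:
            upload(sent)
            download(received)
            identical = sha256_of(sent) == sha256_of(received)
        except Exception as exc:  # any transfer failure is a failed check
            return 2, f"TRANSFER CRITICAL - {exc}"
        if not identical:
            return 2, "TRANSFER CRITICAL - copies differ"
        return 0, "TRANSFER OK - copies are identical"
```

Comparing digests rather than file sizes also catches corruption that leaves the length unchanged.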

Developed Sensors

| Sensor | Description | Used interval |
|---|---|---|
| SRM ping | performs SRM ping with glite-srm-ping | 15 min |
| SRM | transfers file to remote computer and back and compares copies | 1 hour |
| VOMS Proxy | creates VOMS proxy for given VO | 15 min |
| VOMS Gridmap | creates gridmap file for given VO and reports number of users | 1 hour |
| WMS | same as EDG Broker, uses glite-job-* | 1 hour |
| WMProxy delegation | delegates proxy to WMProxy | 15 min |
| WMProxy | same as EDG Broker, uses glite-job-wms-* | 1 hour |

Sensor Hierarchy
- Host hierarchy
  - router @ SRCE is parent to all hosts
- Parent services
  - lightweight
  - more frequent (15 min)
- Child services
  - heavyweight & complex
  - less frequent (1 hour)
- Less overhead on monitored objects!

| Parent Service | Child Service |
|---|---|
| DPNS ping | DPNS list |
| Gatekeeper ping | Gatekeeper hostname, Gatekeeper LRMS |
| GridFTP ping | GridFTP transfer, CA Distribution |
| GridFTP transfer, Tomcat | Certificate lifetime |
| SRM ping | SRM transfer |
| VOMS Tomcat | VOMS Gridmap |
| WMProxy delegation | WMProxy |
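
The effect of the parent/child split can be sketched as a filter over the checks that are due: a heavyweight child sensor runs only if its lightweight parent was last seen OK. The mapping below uses a few of the parent/child pairs named on this slide; the function name and the state dictionary are assumptions for the example, and in Nagios itself this behaviour is configured through service dependencies rather than coded.

```python
# Child service -> parent service (subset of the pairs on the slide).
PARENT_OF = {
    "DPNS list": "DPNS ping",
    "Gatekeeper hostname": "Gatekeeper ping",
    "Gatekeeper LRMS": "Gatekeeper ping",
    "SRM transfer": "SRM ping",
    "WMProxy": "WMProxy delegation",
}


def checks_to_run(due, last_state):
    """Keep the due checks that have no parent or whose parent is OK."""
    runnable = []
    for service in due:
        parent = PARENT_OF.get(service)
        if parent is None or last_state.get(parent) == "OK":
            runnable.append(service)
    return runnable
```

When `SRM ping` is CRITICAL, the hourly `SRM transfer` is skipped, which spares the storage element a pointless file transfer and avoids a second notification for the same root cause.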

Complex Sensors
- Case when one service (target) depends on another service (mediator)
  - e.g. submitting job through grid scheduler to a specific CE, storing file through LFC to SE
- Sensor can use any available mediator service
- We developed Nagios interface for retrieving list of available mediators
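
A sketch of the mediator idea, assuming the list of mediators and their current states is already available as a dictionary (the talk's actual Nagios interface for retrieving it is not described here). The function names, the hostnames and the choice of UNKNOWN when no mediator is usable are assumptions.

```python
def available_mediators(mediator_states):
    """Return the mediators that are currently OK, in a stable order."""
    return sorted(name for name, state in mediator_states.items()
                  if state == "OK")


def check_target(target, mediator_states, probe):
    """Test target through any available mediator.

    probe(mediator, target) performs the real operation (e.g. a job
    submission) and returns True on success.
    """
    candidates = available_mediators(mediator_states)
    if not candidates:
        # Without a working mediator the target cannot be judged at all.
        return 3, f"UNKNOWN - no mediator available to test {target}"
    mediator = candidates[0]
    if probe(mediator, target):
        return 0, f"OK - {target} reached through {mediator}"
    return 2, f"CRITICAL - {target} failed through {mediator}"
```

Because only mediators in OK state are considered, a failure can be attributed to the target rather than to a broken scheduler or catalogue in between.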

Configuration Management
- GOCDB
- Static configuration
  - e.g. nodes which are not in GOCDB, special contacts
- BDII
  - retrieving site-specific data (e.g. queue names, ports)
- Commands
  - more site-specific data
  - e.g. check_ping, check_ldap

GOCDB Integration
- Site information
  - nodes
  - node types
  - site BDII
  - site contact
- Site Admins (for web interface authorization)
- Scheduled downtimes
  - data pulled 3 times a day
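
Turning site information into Nagios configuration can be sketched as a small generator of `define host` objects. The input record layout is hypothetical and does not reflect the real GOCDB export; the `generic-host` template and the `<site>-admins` contact group naming are likewise assumptions that would have to exist in the target configuration.

```python
def host_definition(node, template="generic-host"):
    """Render one Nagios host object from a node record."""
    # Put the host into a group for its site and one per node type.
    groups = ",".join([node["site"]] + sorted(node["node_types"]))
    return "\n".join([
        "define host {",
        f"    use             {template}",
        f"    host_name       {node['hostname']}",
        f"    address         {node['hostname']}",
        f"    hostgroups      {groups}",
        f"    contact_groups  {node['site']}-admins",
        "}",
    ])


def render_config(nodes):
    """Render all hosts in a stable order so regenerated files diff cleanly."""
    ordered = sorted(nodes, key=lambda node: node["hostname"])
    return "\n\n".join(host_definition(node) for node in ordered) + "\n"
```

Regenerating the file on each pull (three times a day in this deployment) and reloading Nagios only when the output changes keeps the monitored set in step with GOCDB.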

Web Interface
- Authentication
  - we added certificate-based authentication
- Authorization
  - admins can perform operations on their own sites only
  - region admin can perform operations on all sites
  - super admin can perform global Nagios operations
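
The three authorization levels can be sketched as a single decision function. The role labels (`site`, `region`, `super`) and the convention that a target of `None` denotes a global Nagios operation are assumptions made for the example.

```python
def is_allowed(role, own_sites, target_site):
    """Decide whether a user may perform an operation.

    role:        "site", "region" or "super"
    own_sites:   set of sites the user administers
    target_site: site the operation affects, or None for a global operation
    """
    if role == "super":
        return True                      # global and per-site operations
    if target_site is None:
        return False                     # global operations need super admin
    if role == "region":
        return True                      # any site in the region
    if role == "site":
        return target_site in own_sites  # own sites only
    return False
```

For instance, `is_allowed("site", {"EXAMPLE-SITE"}, "OTHER-SITE")` is rejected, while the same call with role `"region"` is accepted.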

Demo slides

Future Work
- Further sensor development
- Passive checks
  - using other monitoring systems (e.g. Ganglia, Gstat)
- Distributed deployment
  - Nagios per region/country
  - redundant servers
  - cluster for sensor execution

Thank You
Questions?

Links
- CE Nagios monitoring site: http://cs-egee.srce.hr/nagios
- CE Nagios documentation: http://egee.grid.cyfronet.pl/core-services/nagios
- Nagios official web page: http://www.nagios.org