ONLamp.com

Nagios, Part 2
Pages: 1, 2, 3



Contents of contacts.cfg

define contact{
 contact_name                    oktay
 alias                           Oktay Altunergil
 service_notification_period     24x7
 host_notification_period        24x7
 service_notification_options    w,u,c,r
 host_notification_options       d,u,r
 service_notification_commands   notify-by-email,notify-by-epager
 host_notification_commands      host-notify-by-email,host-notify-by-epager
 email                           oktay@example.com
 pager                           dummypagenagios-admin@localhost.localdomain
 }

define contact{
 contact_name                    Verty
 alias                           David 'Verty' Ky
 service_notification_period     24x7
 host_notification_period        24x7
 service_notification_options    w,u,c,r
 host_notification_options       d,u,r
 service_notification_commands   notify-by-email,notify-by-epager
 host_notification_commands      host-notify-by-email
 email                           verty@example.com
 }

In addition to providing contact details for a particular user, the 'contact_name' in contacts.cfg is also used by the CGI scripts (i.e., the Web interface) to determine whether a user is allowed to access a particular resource. You will need to configure .htaccess-based basic HTTP authentication before you can use the Web interface, but that alone is not enough: each of those usernames must also be defined as a contact, as seen above, or the user will not be able to access any of the resources even after logging in with a username and password. Now that we have our hosts and contacts configured, we can start configuring the individual services to be monitored on our server.
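Because the login names and the contact names have to agree, it can help to cross-check the two files before anyone tries to log in. The following is only a minimal sketch in Python: the function names are hypothetical, it assumes the password file uses the usual one "user:hash" entry per line layout, and it assumes names are compared exactly, including case.

```python
import re


def contact_names(contacts_cfg):
    """Collect every contact_name value found in contacts.cfg-style text."""
    names = set()
    for line in contacts_cfg.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue  # skip comment lines
        match = re.match(r"contact_name\s+(\S+)", line)
        if match:
            names.add(match.group(1))
    return names


def htpasswd_users(htpasswd):
    """Collect the usernames from password-file text ("user:hash" per line)."""
    users = set()
    for line in htpasswd.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and ":" in line:
            users.add(line.split(":", 1)[0])
    return users


def users_without_contact(contacts_cfg, htpasswd):
    """Return login names that have no matching contact_name (exact match)."""
    return sorted(htpasswd_users(htpasswd) - contact_names(contacts_cfg))


# Example with the two contacts from this article plus a made-up "guest" login.
sample_contacts = """
define contact{
 contact_name                    oktay
 }
define contact{
 contact_name                    Verty
 }
"""
sample_htpasswd = "oktay:HASH\nVerty:HASH\nguest:HASH\n"
print(users_without_contact(sample_contacts, sample_htpasswd))  # ['guest']
```

In this sample, "guest" could authenticate against the Web server but would not be matched to any contact.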

Contents of services.cfg

# Generic service definition template
define service{
 # The 'name' of this service template, referenced in other service definitions
 name                            generic-service
 # Active service checks are enabled
 active_checks_enabled           1
 # Passive service checks are enabled/accepted
 passive_checks_enabled          1
 # Active service checks should be parallelized
 # (disabling this can lead to major performance problems)
 parallelize_check               1
 # We should obsess over this service (if necessary)
 obsess_over_service             1
 # Default is to NOT check service 'freshness'
 check_freshness                 0
 # Service notifications are enabled
 notifications_enabled           1
 # Service event handler is enabled
 event_handler_enabled           1
 # Flap detection is enabled
 flap_detection_enabled          1
 # Process performance data
 process_perf_data               1
 # Retain status information across program restarts
 retain_status_information       1
 # Retain non-status information across program restarts
 retain_nonstatus_information    1
 # DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
 register                        0
 }

# Service definition
define service{
 # Name of service template to use
 use                             generic-service

 host_name                       example.com
 service_description             HTTP
 is_volatile                     0
 check_period                    24x7
 max_check_attempts              3
 normal_check_interval           5
 retry_check_interval            1
 contact_groups                  flcd-admins
 notification_interval           120
 notification_period             24x7
 notification_options            w,u,c,r
 check_command                   check_http
 }


# Service definition
define service{
 # Name of service template to use
 use                             generic-service

 host_name                       example.com
 service_description             PING
 is_volatile                     0
 check_period                    24x7
 max_check_attempts              3
 normal_check_interval           5
 retry_check_interval            1
 contact_groups                  flcd-admins
 notification_interval           120
 notification_period             24x7
 notification_options            c,r
 check_command                   check_ping!100.0,20%!500.0,60%
 }

Using the above setup, we are configuring two services to be monitored. The first service definition, which we have called HTTP, monitors whether the Web server is up and notifies us if there's a problem. The second definition monitors the ping statistics from the server and notifies us if the response time climbs too high or there is too much packet loss, either of which is a sign of network trouble. The commands we use to accomplish this are 'check_http' and 'check_ping', which were installed into the 'libexec' directory when we installed the plugins. Take your time to get familiar with the other available plugins and configure them along the lines of the definitions above.

You can also write your own plugins to do custom monitoring. For instance, there's no plugin to check if Tomcat is up or down. You could simply write a script that loads a default JSP page on a remote Tomcat server and returns a success or failure status based on the presence or absence of a predefined text value (e.g., "Tomcat is up") on the page. (In such a case you would need to add a definition for this custom command to your checkcommands.cfg file, which we have not touched.)
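
As a rough illustration of the Tomcat idea, here is a minimal sketch of such a script in Python. It follows the standard plugin convention of one line of output plus an exit status (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). Everything else is an assumption made for the example: the script name check_tomcat.py, the function names, the message wording, and the choice to treat any failure to load the page as CRITICAL.

```python
#!/usr/bin/env python3
import sys
import urllib.request

# Exit statuses that Nagios plugins use to report their result.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_page(body, expected_text):
    """Decide the status from the page text alone."""
    if expected_text in body:
        return OK, "TOMCAT OK - found %r" % expected_text
    return CRITICAL, "TOMCAT CRITICAL - %r not found in page" % expected_text


def check_tomcat(url, expected_text, timeout=10.0):
    """Load the page and look for the marker text."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError) as exc:
        # Connection refused, timeout, HTTP error, or malformed URL.
        return CRITICAL, "TOMCAT CRITICAL - cannot load %s: %s" % (url, exc)
    return evaluate_page(body, expected_text)


def main(argv=None):
    """Print one status line and return the plugin exit status."""
    argv = sys.argv if argv is None else argv
    if len(argv) != 3:
        print("TOMCAT UNKNOWN - usage: check_tomcat.py URL EXPECTED_TEXT")
        return UNKNOWN
    status, message = check_tomcat(argv[1], argv[2])
    print(message)
    return status


# When installed as a plugin, end the script with:
#     sys.exit(main())
```

The script would then be placed alongside the other plugins and given its own command definition in the check command file mentioned above, after which a service definition could reference it through check_command just as the HTTP and PING services reference theirs. The URL and marker text would be passed as arguments, for example a default JSP page and the string "Tomcat is up".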
