Contents of contacts.cfg
define contact{
contact_name oktay
alias Oktay Altunergil
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,u,r
service_notification_commands notify-by-email,notify-by-epager
host_notification_commands host-notify-by-email,host-notify-by-epager
email oktay@example.com
pager dummypagenagios-admin@localhost.localdomain
}
define contact{
contact_name Verty
alias David 'Verty' Ky
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,u,r
service_notification_commands notify-by-email,notify-by-epager
host_notification_commands host-notify-by-email
email verty@example.com
}
In addition to providing contact details for a particular user, the 'contact_name' in the contacts.cfg is also used by the cgi scripts (i.e the Web interface) to determine whether a particular user is allowed to access a particular resource. Although you will need to configure .htaccess based basic http authentication in order to be able to use the Web interface, you still need to define those same usernames as seen above, before the users can access any of the resources even after they are logged in with their username and passwords. Now that we have our hosts and contacts configured, we can start configuring individual services on our server to be monitored.
Contents of services.cfg
# Generic service definition template
define service{
# The 'name' of this service template, referenced in other service definitions
name generic-service
# Active service checks are enabled
active_checks_enabled 1
# Passive service checks are enabled/accepted
passive_checks_enabled 1
# Active service checks should be parallelized
# (disabling this can lead to major performance problems)
parallelize_check 1
# We should obsess over this service (if necessary)
obsess_over_service 1
# Default is to NOT check service 'freshness'
check_freshness 0
# Service notifications are enabled
notifications_enabled 1
# Service event handler is enabled
event_handler_enabled 1
# Flap detection is enabled
flap_detection_enabled 1
# Process performance data
process_perf_data 1
# Retain status information across program restarts
retain_status_information 1
# Retain non-status information across program restarts
retain_nonstatus_information 1
# DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
register 0
}
# Service definition
define service{
# Name of service template to use
use generic-service
host_name example.com
service_description HTTP
is_volatile 0
check_period 24x7
max_check_attempts 3
normal_check_interval 5
retry_check_interval 1
contact_groups flcd-admins
notification_interval 120
notification_period 24x7
notification_options w,u,c,r
check_command check_http
}
# Service definition
define service{
# Name of service template to use
use generic-service
host_name example.com
service_description PING
is_volatile 0
check_period 24x7
max_check_attempts 3
normal_check_interval 5
retry_check_interval 1
contact_groups flcd-admins
notification_interval 120
notification_period 24x7
notification_options c,r
check_command check_ping!100.0,20%!500.0,60%
}
Using the above setup, we are configuring two services to be monitored. The first service definition, which we have called HTTP, will be monitoring whether the Web server is up and notifies us if there's a problem. The second definition monitors the ping statistics from the server and notifies us if the response time increases too much and if there's too much packet loss which is a sign of network trouble. The commands we use to accomplish this are 'check_http' and 'check_ping' which were installed into the 'libexec' directory when we installed the plugins. Please take your time to get familiar with all other plugins that are available and configure them similarly to the above definitions. You can also write your own plugins to do custom monitoring. For instance, there's no plugin to check if Tomcat is up or down. You could simply write a script that loads a default jsp page on a remote Tomcat server and returns a success or failure status based on the presence or lack of a predefined text value (i.e "Tomcat is up") on the page. (In such a case you would need to add a definition for this custom command in your checkcommand.cfg file which we have not touched)