Installing and checking /proc/mounts using Nagios plugins on CentOS 5

Most of this blog entry is taken from “Checking /proc/mounts on remote server” on NagiosWiki. The Nagios version is 3.x and the OS platform is CentOS 5.x.

We will need two Nagios plugins, “check_disk” and “check_nrpe”, to use this excellent NRPE check.

On the remote server, install the NRPE package and the disk plugin:

# yum install nagios-nrpe nagios-plugins-disk

On the remote server, open the NRPE configuration file and add the following command to nrpe.cfg:

# vim /etc/nagios/nrpe.cfg
command[check_disks_proc_mounts]=/usr/lib/nagios/plugins/check_disk -w 15% -c 10% $(for x in $(cat /proc/mounts |awk '{print $2}')\; do echo -n " -p $x "\; done)
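
The command above builds one "-p <mountpoint>" argument for every entry in /proc/mounts. As a rough sketch of what that expansion produces, the snippet below runs the same idea against a small sample file. The helper name build_mount_args and the sample data are assumptions made for illustration only; on a real host you would pass /proc/mounts.

```shell
# Build the " -p <mountpoint>" argument list from a mounts-style file.
# build_mount_args is a hypothetical helper, not part of check_disk or NRPE.
build_mount_args() {
    # Field 2 of each line is the mount point
    awk '{ printf " -p %s", $2 }' "$1"
}

# Sample data standing in for /proc/mounts (illustrative only)
sample=$(mktemp)
printf '%s\n' \
    '/dev/sda1 / ext3 rw 0 0' \
    '/dev/sda2 /boot ext3 rw 0 0' \
    'tmpfs /dev/shm tmpfs rw 0 0' > "$sample"

mount_args=$(build_mount_args "$sample")

# Show the command line that check_disk would end up receiving
echo "check_disk -w 15% -c 10%$mount_args"
rm -f "$sample"
```

With the sample data this prints "check_disk -w 15% -c 10% -p / -p /boot -p /dev/shm".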

On the Nagios server, ensure the check_nrpe plugin is installed, then test it:

# yum install nagios-nrpe
# cd /usr/lib64/nagios/plugins
# ./check_nrpe -H monitored-server -c check_disks_proc_mounts

DISK OK - free space: / 28106 MB (53% inode=98%); /boot 81 MB (86% inode=99%);
/dev/shm 1887 MB (100% inode=99%);| /=24543MB;47188;49964;0;55516
/boot=12MB;83;88;0;98 /dev/shm=0MB;1603;1698;0;1887

Add the following definition in your commands.cfg file

define  command {
        command_name    check_nrpe_disk_procs
        command_line    $USER1$/check_nrpe -H $HOSTNAME$ -c check_disks_proc_mounts -t 20
        }

Add a service check like the following (assuming, of course, that your host is already in your config)

define service{
        use                    local-service
        host_name              monitored_server
        service_description    check_disk on proc mounts
        check_command          check_nrpe_disk_procs
}

Hooray, it is done.

Incorporating PNP 0.4.x (PNP is not Perfparse) with Nagios 3 and CentOS 5

This blog entry is taken in part from “Integrating PNP (PNP is not Perfparse) with CentOS 4.x / Nagios 2.x” on NagiosWiki and the book Nagios, 2nd Edition from No Starch Press.

1. What is PNP4Nagios?
PNP4Nagios is an addon to Nagios which analyzes the performance data provided by plugins and stores it automatically in RRD databases.
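
To make "performance data" concrete: a plugin prints its status text, then a "|", then one or more label=value;warn;crit;min;max tokens. The sketch below splits a line abbreviated from the check_disk output in the previous entry. The helper name parse_perfdata is an assumption for illustration; this is not how PNP itself parses the data.

```shell
# Abbreviated plugin output: status text, then "|", then perfdata tokens
plugin_output='DISK OK - free space: / 28106 MB (53% inode=98%);| /=24543MB;47188;49964;0;55516 /boot=12MB;83;88;0;98'

perfdata=${plugin_output#*|}   # keep everything after the first "|"
perfdata=${perfdata# }         # drop the single leading space

# parse_perfdata is a hypothetical helper: prints "<label> <value>" per token
parse_perfdata() {
    for token in $1; do
        label=${token%%=*}     # text before the first "="
        fields=${token#*=}     # value;warn;crit;min;max
        value=${fields%%;*}    # first field only (value with unit)
        echo "$label $value"
    done
}

parse_perfdata "$perfdata"
```

For the sample line this prints "/ 24543MB" and "/boot 12MB", which are the values PNP would graph.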

2. Which version will you be covering?
I’ll be using pnp4nagios 0.4.x, which fits into CentOS 5.x quite well because it does not need newer components that might break existing dependencies.
Download pnp4nagios 0.4.x from the download website.

3. What prerequisites do I need?
Install rrdtool.
Make sure you have the RPMForge Repository installed. For more information, see LinuxToolkit (Red Hat Enterprise Linux / CentOS Linux Enable EPEL (Extra Packages for Enterprise Linux) Repository).

# yum install rrdtool


4. Download and configure

# wget http://sourceforge.net/projects/pnp4nagios/files/PNP/pnp-0.4.14/pnp-0.4.14.tar.gz/download
# tar -zxvf pnp-0.4.14.tar.gz
# cd pnp-0.4.14
# ./configure --sysconfdir=/etc/pnp --prefix=/usr/share/nagios

(For more ./configure options, see ./configure --help)
The output is as follows:

*** Configuration summary for pnp 0.4.14 05-02-2009 ***
General Options:
------------------------- -------------------
Nagios user/group:  nagios nagios 
Install directory:  /usr/share/nagios 
HTML Dir:           /usr/share/nagios/share
Config Dir:          /etc/pnp 
Location of rrdtool binary: /usr/bin/rrdtool Version 1.4.4
RRDs Perl Modules:  FOUND (Version 1.4004)
RRD Files stored in:   /usr/share/nagios/share/perfdata 
process_perfdata.pl Logfile: /usr/share/nagios/var/perfdata.log
Perfdata files (NPCD) stored in: /usr/share/nagios/var/spool/perfdata/

Review the options above for accuracy. If they look okay,
type 'make all' to compile.
# make all
# make install

If make or make install fails, you may not have the gcc-c++ compiler installed.

# yum install gcc-c++

Create a soft link:

# ln -s /usr/share/nagios/share /usr/share/nagios/pnp


5. Passing Performance Data to the PNP data collector.
To switch on performance data processing, set the following in nagios.cfg:

# PROCESS PERFORMANCE DATA OPTION
# This determines whether or not Nagios will process performance
# data returned from service and host checks.  If this option is
# enabled, host performance data will be processed using the
# host_perfdata_command (defined below) and service performance
# data will be processed using the service_perfdata_command (also
# defined below).  Read the HTML docs for more information on
# performance data.
# Values: 1 = process performance data, 0 = do not process performance data

process_performance_data=1
# HOST AND SERVICE PERFORMANCE DATA PROCESSING COMMANDS
# These commands are run after every host and service check is
# performed.  These commands are executed only if the
# enable_performance_data option (above) is set to 1.  The command
# argument is the short name of a command definition that you
# define in your host configuration file.  Read the HTML docs for
# more information on performance data.

host_perfdata_command=process-host-perfdata
service_perfdata_command=process-service-perfdata

5a. Switching on the process-service-perfdata

(From inside Nagios configuration directory usually /etc/nagios/)

# cd objects
# vim commands.cfg
define command {
  command_name    process-service-perfdata
  command_line    /usr/bin/perl /usr/share/nagios/libexec/process_perfdata.pl
}

define command {
  command_name    process-host-perfdata
  command_line    /usr/bin/perl /usr/share/nagios/libexec/process_perfdata.pl -d HOSTPERFDATA
}

6. Final check

# cd /etc/nagios
# nagios -v nagios.cfg
(Check for any error. If there is no error, restart the service)
# service nagios restart
(Restart httpd)
# service httpd restart

7. Take a look at the graph!

http://YourServerIPAddres.org/nagios/share/

Installing Check Disk IO Plugins via NRPE on CentOS 5.x

This blog entry is adapted from “Check Disk IO via NRPE” on Nagios Wiki.

1. What is check_diskio?
check_diskio is a simple Nagios plugin for monitoring disk I/O on Linux 2.4 and 2.6 systems.

2. Where do I get information and download check_diskio?
Go to http://freshmeat.net/projects/check_diskio/

3. Installation Guide
A. Ensure the perl package is installed. You will need Perl 5.8.x and the required modules. You also need to install the RPMforge Repository (see the Linux Toolkit Blog).

On the client machine:

# yum install perl
# tar -zxvf check_diskio-3.2.2.tar.gz
# cd check_diskio-3.2.2
# less INSTALL
(Read the INSTALL file)
# perl Makefile.PL INSTALLSCRIPT=/usr/lib64/nagios/plugins
(You will see a list of warnings of prerequisites)

B. Install the prerequisite Perl modules. This list may not be complete.

# yum install perl-Nagios*
# yum install perl-List*
# yum install perl-Readonly*
# yum install perl-Number-Format
# yum install perl-File-Slurp*
# yum install perl-Array-Unique

C. Finally compile

# perl Makefile.PL INSTALLSCRIPT=/usr/lib64/nagios/plugins
# make
# make install
(You will see the check_diskio at /usr/lib64/nagios/plugins)

D. Edit the nrpe.cfg file on the client machine. If you have not installed the Nagios NRPE package yet, just do a

# yum install nagios-nrpe

D1. Edit /etc/nagios/nrpe.cfg on the client machine,

# vim /etc/nagios/nrpe.cfg
command[check_diskio]=/usr/lib64/nagios/plugins/check_diskio --device=/dev/sda -w 200 -c 300

On the server, make sure the Nagios NRPE package is installed as in step D. Finally, to confirm that the plugin on the remote server works, run a test from the Nagios server:

# cd /usr/lib64/nagios/plugins
# ./check_nrpe -H remote-server -c check_diskio
CHECK_DISKIO OK - sda 194 sectors/s | WRITE=194;200;300 READ=0;200;300 TOTAL=194;200;300
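
In the sample output, 194 sectors/s stays below the warning threshold of 200, so the state is OK. A minimal sketch of that comparison follows, assuming simple "alert when above" thresholds and the usual Nagios states (OK = 0, WARNING = 1, CRITICAL = 2). The helper name classify is hypothetical and is not part of check_diskio.

```shell
# Map a measured value to a Nagios state using simple upper thresholds.
# classify is a hypothetical helper used only to illustrate -w and -c.
classify() {
    value=$1; warn=$2; crit=$3
    if [ "$value" -gt "$crit" ]; then
        echo CRITICAL; return 2
    elif [ "$value" -gt "$warn" ]; then
        echo WARNING; return 1
    fi
    echo OK; return 0
}

# TOTAL value from the sample output, with -w 200 -c 300
classify 194 200 300
```

This prints OK. A value of 250 would give WARNING, and anything above 300 would give CRITICAL.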

Hooray, it is done!

Installing and configuring Ganglia on CentOS 5.4

Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and grids. Ganglia will help you spot trends that might point to problems such as under-capacity hardware or runaway processes. Ganglia requires very little CPU, memory and network resources to run. According to the official Ganglia website, it can scale easily to 2000 nodes.

For this blog entry, I’m assuming you are building an HPC cluster with a head node and several compute nodes.

Ganglia has two daemons, gmetad and gmond. It also requires other prerequisites such as PHP, RRDtool and Apache. First things first:

  1. gmond – the Ganglia monitoring daemon. Its job is to gather performance metrics and keep track of the status of the other gmond daemons running in the cluster. If one gmond daemon fails because its node fails, all the remaining gmond daemons know about it. It is required on every node.
  2. gmetad – gmetad only needs to run on the cluster head node. Its job is to poll the gmond daemons for performance metric information every 15 seconds, consolidate it, and store it in the RRDtool round-robin database (a round-robin database never fills up, because the newest data overwrites the oldest). The stored information is then displayed through the Apache web server.
  3. The Ganglia Web package requires PHP on the cluster head node to display the information through Apache.
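
The round-robin idea mentioned above can be sketched in a few lines: keep a fixed number of samples and discard the oldest when a new one arrives. This is only a toy illustration with a hypothetical helper rrd_push and made-up sample values. RRDtool itself also consolidates data into several archives, which is not modelled here.

```shell
# Keep only the newest CAPACITY samples, discarding the oldest on overflow.
CAPACITY=3

# rrd_push is a hypothetical helper: $1 = new sample, the remaining
# arguments = current samples (oldest first). Prints the updated list.
rrd_push() {
    new=$1; shift
    set -- "$@" "$new"
    while [ "$#" -gt "$CAPACITY" ]; do
        shift                  # drop the oldest sample
    done
    echo "$*"
}

samples=""
for s in 10 20 30 40 50; do
    samples=$(rrd_push "$s" $samples)
done
echo "$samples"
```

After five samples with a capacity of three, only the newest three remain: "30 40 50".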

Part I: To install Ganglia on CentOS 5.4 on the cluster head node, do the following:

  1. Make sure you have the RPMForge Repository installed. For more information, see LinuxToolkit (Red Hat Enterprise Linux / CentOS Linux Enable EPEL (Extra Packages for Enterprise Linux) Repository).
  2. # yum install rrdtool ganglia ganglia-gmetad ganglia-gmond ganglia-web httpd php
  3. At the time of writing, however, you might get the following error: “Error: Missing Dependency: rrdtool = 1.2.27-3.el5 is needed by package rrdtool-perl”. To resolve the issue, see LinuxToolkit (Error: Missing Dependency: librrd.so.2()(64bit) is needed by package ganglia-gmetad (epel)).
  4. By default, Ganglia uses UDP multicast to pass information. I prefer to use UDP unicast, as it gives me better control.
  5. Assuming 192.168.1.5 is our head node and port number 8649. Edit /etc/gmond.conf and start the gmond service.
    cluster {
    name = "My Cluster"
    owner = "kittycool"
    latlong = "unspecified"
    url = "unspecified"
    }
    udp_send_channel {
    host = 192.168.1.5
    port = 8649
    ttl = 1
    }
    udp_recv_channel {
    port = 8649
    }
  6. Configure the runlevel startup and start the service for gmond
    chkconfig --levels 235 gmond on
    service gmond start
  7. Configure the /etc/gmetad.conf to define the datasource
    data_source "my cluster" 192.168.1.5:8649
  8. Configure the runlevel startup and start the service for gmetad
    chkconfig --levels 235 gmetad on
    service gmetad start
  9. Configure the runlevel startup and start the service for httpd
    chkconfig --levels 235 httpd on
    service httpd start

Part II: Installing on the compute nodes. I’m assuming the compute nodes are on a private network and do not have access to the internet. Only the head node has internet access. I’m also assuming there is no routing from the compute nodes via the head node for internet access.

  1. Read the blog entry “Using yum to download the rpm package that have already been installed”.
  2. Copy the rpm to all the compute nodes
  3. Install the package on each compute node
    yum install ganglia-gmond
  4. Configure the service startup for each compute node
    chkconfig --levels 235 gmond on
  5. Copy the gmond /etc/gmond.conf configuration file from head node to the compute node.

Part III: If you wish to create custom metrics that are not included in the standard Ganglia distribution, you can write your own performance monitoring scripts and use gmetric to report the results to the gmond running on the compute nodes. Sample gmetric scripts can be found at:

  1. Gmetric Script Repository. For example, you can use the gmetric script for NFS, “Linux NFS client GETATTR, READ and WRITE calls”.
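
As a minimal sketch of a custom metric script, the example below counts mounted filesystems and reports the number with gmetric. The metric name mount_count, the units label and the helper count_mounts are illustrative assumptions, not taken from the script repository. The gmetric call is skipped when gmetric is not installed.

```shell
# count_mounts is a hypothetical helper: counts the lines (one per mounted
# filesystem) in a mounts-style file, defaulting to /proc/mounts.
count_mounts() {
    awk 'END { print NR }' "${1:-/proc/mounts}"
}

mounts_file=/proc/mounts

# Only report when the data source and gmetric are both available
if [ -r "$mounts_file" ] && command -v gmetric >/dev/null 2>&1; then
    gmetric --name mount_count \
            --value "$(count_mounts "$mounts_file")" \
            --type uint32 \
            --units mounts
fi
```

A script like this would typically be run from cron on each compute node, so that gmond receives a fresh value at regular intervals.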

Part IV: Using Command line gstat

  1. You can use gstat to list information about the cluster nodes.  Some useful commands are:
  2. # gstat -h
    (To show help for all commands)
  3. # gstat -a
    (List all nodes)
  4. # gstat -l
    (Print ONLY the host list)