Graphite – highly scalable real-time graphing system

Graphite is an interesting project. If you wish to look at the project in more depth, the official Graphite Documentation is very comprehensive.

But some pointers could be useful.

Point 1: What is Graphite?

Graphite is a highly scalable real-time graphing system. As a user, you write an application that collects numeric time-series data that you are interested in graphing, and send it to Graphite’s processing backend, carbon, which stores the data in Graphite’s specialized database. The data can then be visualized through graphite’s web interfaces.

Graphite 1.2.0 Documentation

Point 2: Architecture

Graphite consists of 3 software components:

  1. carbon – a Twisted daemon that listens for time-series data
  2. whisper – a simple database library for storing time-series data (similar in design to RRD)
  3. graphite webapp – A Django webapp that renders graphs on-demand using Cairo

Point 3: Who should be using Graphite?

Anybody who would want to track values of anything over time. If you have a number that could potentially change over time, and you might want to represent the value over time on a graph, then Graphite can probably meet your needs.

Specifically, Graphite is designed to handle numeric time-series data. For example, Graphite would be good at graphing stock prices because they are numbers that change over time. Whether it’s a few data points, or dozens of performance metrics from thousands of servers, then Graphite is for you. As a bonus, you don’t necessarily know the names of those things in advance (who wants to maintain such huge configuration?); you simply send a metric name, a timestamp, and a value, and Graphite takes care of the rest!

Graphite 1.2.0 Documentation
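
To make Point 1 concrete, here is a minimal sketch of feeding a data point to carbon using its plaintext protocol, which listens on TCP port 2003 by default. The hostname graphite.example.com, the metric path and the use of nc (netcat) are assumptions for illustration only:

# graphite.example.com and the metric path are placeholders; each line is "<metric path> <value> <unix timestamp>"
echo "servers.web01.load.1min 0.42 $(date +%s)" | nc graphite.example.com 2003

Carbon creates the corresponding whisper file automatically the first time a metric name is seen.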

Point 4: Tools

Ganglia, a tool used by many High Performance Computing (HPC) clusters worldwide, can be integrated with Graphite. Other tools that work with Graphite can be found here.

Point 5: Get the book…..

3 Tenets of Monitoring and Approach to IT Monitoring

I read the book Monitoring with Graphite from O'Reilly. It is a good read, and I encourage you to read it in full; I'm just penning my own thoughts here.

The author mentioned something quite interesting that I had not really thought about: monitoring can be divided into 3 main categories:

  1. Fault Detection
  2. Alerting
  3. Capacity Planning

Fault Detection

Fault Detection identifies when a resource becomes unavailable or starts to perform poorly. Traditionally, system administrators employ thresholds to recognise a delta in a system's behaviour.

Alerting

Alerting is what happens the moment the monitoring system identifies a fault: the recipient(s) are alerted through some means, such as email or SMS, so that further action can be taken.

Capacity Planning

Capacity planning is the ability to study trends in the data and use that knowledge to make informed decisions about adding capacity now or in the near future. You can use Graphite to work on this time-series data.
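
As a sketch of how Graphite helps here, its render API can return the raw time-series for a metric over a long window, which you can then trend against. The host and metric name below are placeholders:

# placeholder host and metric; from=-90days asks for the last 90 days, format=json returns raw datapoints
curl "http://graphite.example.com/render?target=servers.web01.df.root.used&from=-90days&format=json"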

Pull and Push Model

Pull Model – The traditional approach to IT monitoring centres around a polling agent spending resources to connect to remote hosts or appliances to determine their current status. However, the traditional pull method has limitations in integrating trending and monitoring, and often requires different software stacks.

Push Model – Metrics are pushed from the sources to a unified storage repository, providing a consolidated set of data to drive both IT responses and business decisions. The advantage is that collection tasks are decentralised, so we no longer need to scale a central collection system as the architecture grows. One of the interesting aspects of the push model is that we can isolate the functional responsibilities of the monitoring system.
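
As a small sketch of the push model, each source host can run its own lightweight collector (for example from cron) and push straight to the metrics repository. The Graphite host and the metric prefix below are assumptions:

# run from cron on every source host; graphite.example.com and the "servers." prefix are placeholders
echo "servers.$(hostname -s).loadavg.1min $(cut -d' ' -f1 /proc/loadavg) $(date +%s)" | nc -w 1 graphite.example.com 2003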

Using iostat to report system input and output

The iostat command is used to monitor system input/output device loading by observing the time the devices are active in relation to their average transfer rates.

For disk statistics

# iostat -m 2 10 -x /dev/sda1

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
0.31    0.00    0.50    0.19    0.00   99.00

Device:         rrqm/s   wrqm/s   r/s   w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda1              0.00    30.00  0.00 14.00     0.00     0.17    25.14     0.85   60.71   5.50   7.70

where

“-m” = Display statistics in megabytes per second
“2 10” = 2 seconds for 10 times
-x =  Display  extended  statistics.

AVG-CPU Statistics

  • “%user” = % of CPU utilisation that occurred while executing at the user level (application)
  • “%nice” = % of CPU utilisation that occurred while executing at the user level with nice priority
  • “%system” = % of CPU utilisation that occurred while executing at the system level (kernel)
  • “%iowait” = % of time CPU were idle during which the system had an outstanding disk I/O request
  • “%steal” = % of time spent in involuntary wait by the virtual CPU or CPUs while the hypervisor was servicing another virtual processor.
  • “%idle” = % of time that the CPU or CPUS were idle and the system does not have an outstanding disk I/O request

DEVICE Statistics

  • “rrqm/s” =  The number of read requests merged per second  that  were queued to the device.
  • “wrqm/s” = The  number of write requests merged per second that were queued to the device
  • “r/s” =  The number of read  requests  that  were  issued  to  the device per second.
  • “w/s” = The number of write requests that was issued to the device per second.
  • “rMB/s” = The number of megabytes read from the device per  second.
  • “wMB/s” = The number of megabytes written to the device per second.
  • “avgrq-sz” = The average size (in sectors) of the requests  that  were issued to the device.
  • “avgqu-sz” = The average queue length of the requests that were issued to the device.
  • “await” = The average  time  (in  milliseconds)  for  I/O  requests issued to the device to be served.
  • “svctm” = This field will be depreciated
  • “util” = Percentage of CPU time during  which  I/O  requests  were issued  to  the  device  (bandwidth  utilization  for the device).

Other uses of iostat

1. Total Device Utilisation in MB

# iostat -d -m
Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda              53.72         0.01         0.24     330249    8097912
sda1             53.66         0.01         0.24     307361    7997449
sda2              0.06         0.00         0.00      22046     100253
sda3              0.01         0.00         0.00        841        208
hda               0.00         0.00         0.00       1200          0
  • tps = transfers per second. A transfer is an I/O request to the device
  • MB_read = The total number of megabytes read
  • MB_wrtn = The total number of megabytes written
  • MB_read/s = Indicate the amount of data read from the device expressed in megabytes per second.
  • MB_wrtn/s = Indicate the amount of data written to the device expressed in megabytes per second.
# iostat -n -m
.....
.....
Device:              rMB_nor/s wMB_nor/s rMB_dir/s wMB_dir/s rMB_svr/s wMB_svr/s ops/s rops/s wops/s
NFS_Server:/vol/vol1 0.26      0.16      0.00      0.00      0.17      0.16      14.82 2.96   3.49
  • ops/s = Indicate the number of operations that were issued to the mount point per second
  • rops/s = Indicate the number of read operations that was issued to the mount point per second
  • wops/s = Indicate the number of write operations that were issued to the mount point per second

Installing and Configuring Ganglia on CentOS 5.8

For a basic understanding of Ganglia and more general information, do look at Installing and configuring Ganglia on CentOS 5.4.

Part I: To install ganglia on CentOS 5.8 on the Cluster Head Node, do the following:

  1. Make sure you have the RPM Repositories installed. For more information, see Useful Repositories for CentOS 5
  2. Make sure the libdbi-0.8.1-2.1.x86_64.rpm package for CentOS 5.9 is installed on CentOS 5.8. Apparently, there are no conflict or dependency issues. See RPM Resource libdbi.so.0()(64bit)
    # wget ftp://rpmfind.net/linux/centos/5.9/os/x86_64/CentOS/libdbi-0.8.1-2.1.x86_64.rpm
    # rpm -ivh libdbi-0.8.1-2.1.x86_64.rpm
  3. Install PHP 5.4. See Installing PHP 5.4 on CentOS 5
  4. Install the Ganglia Components
    # yum install rrdtool ganglia ganglia-gmetad ganglia-gmond ganglia-web httpd
  5. By default, Ganglia uses multicast UDP to pass information. I prefer to use unicast UDP as I can have better control
  6. Assuming 192.168.1.5 is our head node and port 8649 is used, edit /etc/ganglia/gmond.conf:
    cluster {
    name = "My Cluster"
    owner = "kittycool"
    latlong = "unspecified"
    url = "unspecified"
    }
    udp_send_channel {
    host = 192.168.1.5
    port = 8649
    ttl = 1
    }
    udp_recv_channel {
    port = 8649
    }
  7. Configure the service run levels and start the gmond service (a quick verification sketch follows this list)
    chkconfig --levels 235 gmond on
    service gmond start
  8. Configure /etc/ganglia/gmetad.conf to define the data source
    data_source "my cluster" 192.168.1.5:8649
  9. Configure the service run levels and start the gmetad service
    chkconfig --levels 235 gmetad on
    service gmetad start
  10. Configure the service run levels and start the httpd service
    chkconfig --levels 235 httpd on
    service httpd start
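
Once gmond is running, a quick sanity check is to pull its XML state directly. This assumes the default tcp_accept_channel on port 8649 is still present in your gmond.conf:

# gmond dumps the cluster state as XML and then closes the connection
telnet 192.168.1.5 8649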

Part II: To install on the compute nodes, I'm assuming the compute nodes are on a private network and do not have access to the internet; only the head node has internet access. I'm also assuming there is no routing from the compute nodes through the head node for internet access.

  1. Read the following blog: Using yum to download the rpm package that have already been installed.
  2. Copy the rpm to all the compute nodes (see the sketch after this list)
  3. Install the package on each compute node
    yum install ganglia-gmond
  4. Configure the service startup for each compute node
    chkconfig --levels 235 gmond on
  5. Copy the gmond configuration file /etc/ganglia/gmond.conf from the head node to the compute nodes.
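
Steps 2 and 5 can be scripted from the head node. A rough sketch, assuming the compute nodes are named node01 to node04 and root SSH access is in place:

# node names are placeholders; the rpm and gmond.conf still need to be installed/moved into place on each node
for n in node01 node02 node03 node04; do
    scp ganglia-gmond-*.rpm root@$n:/tmp/
    scp /etc/ganglia/gmond.conf root@$n:/tmp/
done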

For more information, do look at

  1. Installing and configuring Ganglia on CentOS 5.4

Brief overview of Valgrind usage

This write-up covers some very basic commands, but I will try to list some other tutorials and readings to complement it. I'm assuming that you have compiled Valgrind as written in Compiling Valgrind on CentOS 5.

One of the most commonly used commands in Valgrind is

# valgrind --tool=memcheck --leak-check=full ./my_program

Commonly-used Options

  • --leak-check=<no|summary|yes|full> [default: summary] – When enabled, search for memory leaks when the client program finishes. If set to summary, it says how many leaks occurred. If set to full or yes, it also gives details of each individual leak.
  • --show-reachable=<yes|no> [default: no] – When disabled, the memory leak detector only shows "definitely lost" and "possibly lost" blocks. When enabled, the leak detector also shows "reachable" and "indirectly lost" blocks (in other words, it shows all blocks, except suppressed ones).
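
Combining the two options above, a typical run might look like the following sketch; ./my_program is a placeholder for your own binary, and --log-file simply writes Valgrind's report to a file:

# a sketch only; ./my_program is a placeholder binary, and the report goes to valgrind.log
valgrind --tool=memcheck --leak-check=full --show-reachable=yes --log-file=valgrind.log ./my_program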

For more information on the detailed usage of Valgrind options and how to use them, see:

  1. Valgrind Manual – 4.3 Memcheck Command Options
  2. Using Valgrind to Find Memory Leaks and Invalid Memory Use
  3. Using Valgrind to debug memory leaks

Compiling Valgrind on CentOS 5

Valgrind tools automatically detect many memory management and threading bugs, and can profile your programs in detail. Valgrind runs on the following platforms: X86/Linux, AMD64/Linux, ARM/Linux, PPC32/Linux, PPC64/Linux, S390X/Linux, ARM/Android (2.3.x), X86/Darwin and AMD64/Darwin (Mac OS X 10.6 and 10.7).

According to Valgrind, a number of useful tools are supplied as standard.

  1. Memcheck is a memory error detector. It helps you make your programs, particularly those written in C and C++, more correct.
  2. Cachegrind is a cache and branch-prediction profiler. It helps you make your programs run faster.
  3. Callgrind is a call-graph generating cache profiler. It has some overlap with Cachegrind, but also gathers some information that Cachegrind does not.
  4. Helgrind is a thread error detector. It helps you make your multi-threaded programs more correct.
  5. DRD is also a thread error detector. It is similar to Helgrind but uses different analysis techniques and so may find different problems.
  6. Massif is a heap profiler. It helps you make your programs use less memory.
  7. DHAT is a different kind of heap profiler. It helps you understand issues of block lifetimes, block utilisation, and layout inefficiencies.
  8. SGcheck is an experimental tool that can detect overruns of stack and global arrays. Its functionality is complementary to that of Memcheck: SGcheck finds problems that Memcheck can’t, and vice versa.
  9. BBV is an experimental SimPoint basic block vector generator. It is useful to people doing computer architecture research and development.

Compilation of Valgrind

Compilation is very straightforward……

# tar -xvjpf valgrind-3.7.0.tar.bz2
# cd valgrind-3.7.0
# ./configure --prefix=/usr/local/valgrind-3.7.0
# make; make install
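
Since this installs into a versioned prefix, you may want to put the Valgrind binaries on your PATH. A small sketch (adjust the prefix if yours differs):

# add this line to your shell profile if you want it to persist across logins
export PATH=/usr/local/valgrind-3.7.0/bin:$PATH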

Testing Valgrind

# /usr/local/valgrind-3.7.0/bin/valgrind ls -l

Either this works, or it bombs out with some complaint.

Using strace as a troubleshooting tool

strace, when run in conjunction with a program, outputs all the system calls made to the kernel by that program.

One quick way to find out what is going on in your program is:

$ strace -c ./my_hello_world_program
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
74.80    0.002998        1499         2           wait4
21.91    0.000878           4       221           read
0.95    0.000038           0       237         2 mmap
0.77    0.000031          10         3         1 mkdir
0.67    0.000027           0       566       361 open
0.35    0.000014           0        81           mprotect
0.30    0.000012           0        62        37 stat
0.25    0.000010           0       225           close
0.00    0.000000           0        37         1 write
0.00    0.000000           0       132           fstat
0.00    0.000000           0         8           poll
0.00    0.000000           0         2           lseek
0.00    0.000000           0       120           munmap
0.00    0.000000           0        15           brk
0.00    0.000000           0        16           rt_sigaction
................

................

------ ----------- ----------- --------- --------- ----------------
100.00    0.004008                  1990       411 total

If you wish to do a full trace, just run strace on the program directly; you can easily spot the error if there was one.

$ strace ./my_hello_world_program
............

............

open("/tmp/openmpi-sessions-root@starfruit-h00.cluster.spms.ntu.edu.sg_0/25979/1/0",
O_RDONLY|O_NONBLOCK|O_DIRECTORY) = -1 ENOENT (No such file or directory)
munmap(0x2b46e05ef000, 2111200)         = 0
munmap(0x2b46dffe5000, 2102312)         = 0
munmap(0x2b46dfdde000, 2123264)         = 0
munmap(0x2b46e103f000, 2106960)         = 0
munmap(0x2b46e1242000, 2104560)         = 0
munmap(0x2b46e269d000, 2114912)         = 0
munmap(0x2b46e41c9000, 2145008)         = 0
munmap(0x2b46e43d5000, 2162608)         = 0

If you wish to send the output of strace to a file instead, use the -o argument:

$ strace -o strace_output_file ./my_hello_world_program

If you wish to trace only file, process, or network related system calls, you can use "-e trace=file", "-e trace=process", or "-e trace=network". You can also trace a specific list of system calls:

$ strace -e trace=open,close,read,write ./my_hello_w0rld_program
$ strace -e trace=stat,chmod,unlink ./my_hello_world_program
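
strace can also attach to a process that is already running, which is handy for troubleshooting a live daemon. The PID 1234 below is just a placeholder; -f additionally follows child processes and -o writes the trace to a file as before:

# replace 1234 with the PID of the process you want to trace
strace -f -p 1234 -o strace_attach.log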

Further Information:

  1. Solutions for tracing UNIX applications (IBM DeveloperWorks)
  2. strace – A very powerful troubleshooting tool for all Linux users (linuxhelp.blogspot.com)
  3. Ten commands every linux developer should know (Linux Journal)

Basic Overview and use of NMON on CentOS 5

nmon for Linux – Nigel’s performance Monitor for Linux – is a wonderful Swiss Army knife for performance information. You can display multiple sections on the same screen and get information on CPU, memory, NFS, network, disks, resources, kernel, etc.

nmon has a single binary for each operating system, including Red Hat, SUSE, Ubuntu, OpenSUSE, Fedora, etc. Using the binary is as simple as starting the executable:

$ ./nmon_x86_64_rhel54

Using nmon in basic mode (for more details, do read the nmon for Linux Getting Started guide):

  1. To quit, just hit “q”
  2. Most of the rest are toggled commands i.e. hit c to see the CPU stats and hit c again to remove CPU stats.
  3. For disk graphs hit d and you will see a 50 column graph of the read and write busy percentages
  4. For disk numbers hit D; hit D again to see different information, and eventually hitting D will close this section

Using nmon for Linux in data capture mode

  1. Capturing a small sample file: nmon -f -s 2 -c 30
  2. -f means the data will be saved and not displayed on the screen
  3. -s 2 means data capture every 2 seconds
  4. -c 30 means 30 data points or snap shots
  5. Do note that nmon runs like a daemon process in the background; it will continue to run until completion whether you stay connected or log off.
  6. You can check whether nmon is running with "ps -ef | grep nmon"
  7. The resulting file is xxx.nmon. A longer capture example is shown below.
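
Building on the flags above, a longer capture, say one snapshot every 5 minutes for a full day, would look like this (the binary name is the same RHEL build used earlier):

# 288 snapshots x 300 seconds = 24 hours of data in a single .nmon file
./nmon_x86_64_rhel54 -f -s 300 -c 288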

Using mpstat to display SMP CPU statistics

mpstat is a command-line utility for reporting CPU-related statistics.

For CentOS, to install mpstat, you have to install the sysstat package (http://sebastien.godard.pagesperso-orange.fr/)

# yum install sysstat

1. mpstat is very straightforward. Use the command below. On my 32-core machine:

# mpstat -P ALL
11:10:11 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
11:10:13 PM  all   40.75    0.00    0.03    0.00    0.00    0.00    0.00   59.22   1027.50
11:10:13 PM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1000.50
11:10:13 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM    2  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM    4  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM    5  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM    6    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM    7  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM    8    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     16.50
11:10:13 PM    9    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   10    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   11    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   12  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00     10.50
11:10:13 PM   13    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   14   99.50    0.00    0.50    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM   15  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM   16    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   17    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   18    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   19  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM   20  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM   21    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   22    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   23    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   24  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM   25  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM   26    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:10:13 PM   27  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
11:10:13 PM   28    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00

where

CPU – Processor number. The keyword all indicates that statistics are calculated as averages among all processors.

%user – Show the percentage of CPU utilization that occurred while executing at the user level (application).

%nice – Show the percentage of CPU utilization that occurred while executing at the user level with nice priority.

%sys – Show the percentage of CPU utilization that occurred while executing at the system level (kernel). Note that this does not include time spent servicing interrupts or softirqs.

%iowait – Show the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.

%irq – Show the percentage of time spent by the CPU or CPUs to service interrupts.

%soft – Show the percentage of time spent by the CPU or CPUs to service softirqs. A softirq (software interrupt) is one of up to 32 enumerated software interrupts which can run on multiple CPUs at once.

%steal – Show the percentage of time spent in involuntary wait by the virtual CPU or CPUs while the hypervisor was servicing another virtual processor.

%idle – Show the percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request.

intr/s – Show the total number of interrupts received per second by the CPU or CPUs.

2. Getting average from mpstat

To get an average, you have to invoke the interval and count arguments. In this example, the interval is 2 seconds and the count is 5.

# mpstat -P ALL 2 5

At the end of the statistics report, you will see an average

Average:     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
Average:     all   40.76    0.00    0.03    0.00    0.00    0.00    0.00   59.21   1047.50
Average:       0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1000.60
Average:       1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:       2  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:       3    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:       4   99.90    0.00    0.10    0.00    0.00    0.00    0.00    0.00      0.00
Average:       5  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:       6    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:       7  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:       8    0.00    0.00    0.10    0.00    0.00    0.00    0.00   99.90     17.30
Average:       9    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      10    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      11    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      12   99.90    0.00    0.00    0.00    0.00    0.10    0.00    0.00     29.70
Average:      13    0.00    0.00    0.10    0.00    0.00    0.00    0.00   99.90      0.00
Average:      14   99.50    0.00    0.50    0.00    0.00    0.00    0.00    0.00      0.00
Average:      15  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:      16    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      17    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      18    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      19  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:      20  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:      21    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      22    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      23    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      24  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:      25   99.90    0.00    0.10    0.00    0.00    0.00    0.00    0.00      0.00
Average:      26    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      27  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:      28    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      29    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
Average:      30  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
Average:      31    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
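
If you only want to watch a single processor, -P also accepts a CPU number instead of ALL. A quick sketch sampling CPU 0 every 2 seconds, 5 times:

# report statistics for processor 0 only, then print the average
mpstat -P 0 2 5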

Installing check_mk for Nagios on CentOS 5

check_mk describes itself as “a new general purpose Nagios-plugin for retrieving data”. This wonderful plugin is a good replacement for NRPE, NSClient++, etc. I have successfully used check_mk in place of NSClient++ to monitor my Windows machines.

Installing Nagios is straightforward. You may want to see the blog entry Using Nagios 2.x/3.x on CentOS. In a nutshell, do this in sequence to avoid dependency issues:

# yum install nagios nagios-devel
# yum install nagios-plugins-all

Downloading and unpacking check_mk

# wget http://mathias-kettner.de/download/check_mk-1.1.8.tar.gz
# tar -zxvf check_mk-1.1.8.tar.gz
# cd check_mk-1.1.8
# ./setup.sh --yes

Restart the Service

# service nagios restart
# service httpd restart

Making the agent accessible through xinetd

# cp -p /usr/share/check_mk/agents/check_mk_agent.linux /usr/bin/check_mk_agent
# cp -p /usr/share/check_mk/agents/xinetd.conf /etc/xinetd.d/check_mk

Restart xinetd service.

# service xinetd restart
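
To verify that xinetd is now serving the check_mk agent, you can query it directly. This assumes the shipped xinetd.conf keeps the default agent port of 6556:

# the agent should print its <<<check_mk>>> sections and then close the connection
telnet localhost 6556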

For more information on check_mk on Debian derivatives, do look at the excellent write-up “HOWTO: How to install Nagios with check_mk, PNP and NagVis”.