Installing pdsh to issue commands to a group of nodes in parallel in CentOS

1. What is pdsh?

Pdsh is a high-performance, parallel remote shell utility. It uses a sliding window of threads to execute remote commands, conserving socket resources while allowing some connections to time out if needed. It was originally written as a replacement for IBM's DSH on clusters at LLNL. More information can be found at the PDSH web site.

2. Set up the EPEL yum repository on CentOS 6.

For more information, see Repository of CentOS 6 and Scientific Linux 6

3. Do a yum install

# yum install pdsh

To confirm the installation:

# which pdsh

4. Configure user environment for PDSH

# vim /etc/profile.d/pdsh.sh

Add the following:

# setup pdsh for cluster users
export PDSH_RCMD_TYPE='ssh'
export WCOLL='/etc/pdsh/machines'

5. Put the host names of the compute nodes in the machines file

# vim /etc/pdsh/machines
node1
node2
node3
.......
.......

6. Make sure the nodes have exchanged SSH keys. For more information, see Auto SSH Login without Password
7. Repeat Install Steps 1 to 3 on ALL the client nodes. A quick smoke test is shown below.
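
Once the environment file and machines file are in place, a quick smoke test can confirm that pdsh reaches the nodes. A minimal sketch, using the example hostnames node1 and node2 from Step 5 (substitute your own):

# source /etc/profile.d/pdsh.sh
# pdsh -w node1,node2 uptime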


B. USING PDSH

The general usage is: pdsh [options]… command

1. To target all the nodes listed in /etc/pdsh/machines, assuming the RPM file has already been transferred to each node. Do note that a parallel copy utility also comes with pdsh (see the pdcp example below).

# pdsh -a "rpm -Uvh /root/htop-1.0.2-1.el6.rf.x86_64.rpm"

2. To exclude specific nodes, use the -x option

# pdsh -x host1,host2 "rpm -Uvh /root/htop-1.0.2-1.el6.rf.x86_64.rpm"
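
3. The parallel copy mentioned in item 1 is provided by pdcp, which is distributed with pdsh (pdcp needs pdsh installed on the remote nodes as well, which Step 7 above covers). To target only a specific set of nodes rather than excluding some, use the -w option. A minimal sketch, assuming the RPM sits in /root on the admin node and node1, node2 come from your machines file:

# pdcp -a /root/htop-1.0.2-1.el6.rf.x86_64.rpm /root/
# pdsh -w node1,node2 "rpm -Uvh /root/htop-1.0.2-1.el6.rf.x86_64.rpm"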

References

  1. Install and setup pdsh on IBM Platform Cluster Manager
  2. PDSH Project Site
  3. PDSH Download Site (Sourceforge)

Using nvidia-smi to get information on GPU Cards

NVIDIA's System Management Interface (nvidia-smi) is a useful tool for monitoring and managing GPU cards. A few use cases are listed here.

1. Listing of NVIDIA GPU Cards

# nvidia-smi -L

GPU 0: Tesla M2070 (S/N: 03212xxxxxxxx)
GPU 1: Tesla M2070 (S/N: 03212yyyyyyyy)

2. Display GPU information

# nvidia-smi -i 0 -q

==============NVSMI LOG==============

Timestamp : Sun Jul 28 23:49:20 2013

Driver Version : 295.41

Attached GPUs : 2

GPU 0000:19:00.0
    Product Name : Tesla M2070
    Display Mode : Disabled
    Persistence Mode : Disabled
    Driver Model
        Current : N/A
        Pending : N/A
    Serial Number : 03212xxxxxxxx
    GPU UUID : GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx
    VBIOS Version : 70.00.3E.00.03
    Inforom Version
        OEM Object : 1.0
        ECC Object : 1.0
        Power Management Object : 1.0
    PCI
        Bus : 0x19
        Device : 0x00
        Domain : 0x0000
        Device Id : 0xxxxxxxxx
        Bus Id : 0000:19:00.0
        Sub System Id : 0x083010DE
        GPU Link Info
            PCIe Generation
                Max : 2
                Current : 2
            Link Width
                Max : 16x
                Current : 16x
    Fan Speed : N/A
    Performance State : P0
    Memory Usage
        Total : 6143 MB
        Used : 10 MB
        Free : 6132 MB
    Compute Mode : Exclusive_Thread
    Utilization
        Gpu : 0 %
        Memory : 0 %
    Ecc Mode
        Current : Disabled
        Pending : Disabled
    ECC Errors
        Volatile
            Single Bit
                Device Memory : N/A
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Total : N/A
            Double Bit
                Device Memory : N/A
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Total : N/A
        Aggregate
            Single Bit
                Device Memory : N/A
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Total : N/A
            Double Bit
                Device Memory : N/A
                Register File : N/A
                L1 Cache : N/A
                L2 Cache : N/A
                Total : N/A
    Temperature
        Gpu : N/A
    Power Readings
        Power Management : N/A
        Power Draw : N/A
        Power Limit : N/A
    Clocks
        Graphics : 573 MHz
        SM : 1147 MHz
        Memory : 1566 MHz
    Max Clocks
        Graphics : 573 MHz
        SM : 1147 MHz
        Memory : 1566 MHz
    Compute Processes : None

3. Display selected GPU Information (MEMORY, UTILIZATION, ECC, TEMPERATURE, POWER, CLOCK, COMPUTE, PIDS, PERFORMANCE)

# nvidia-smi -i 0 -q -d MEMORY,ECC

==============NVSMI LOG==============

Timestamp                       : Mon Jul 29 00:04:36 2013

Driver Version                  : 295.41

Attached GPUs                   : 2

GPU 0000:19:00.0
    Memory Usage
        Total                   : 6143 MB
        Used                    : 10 MB
        Free                    : 6132 MB
    Ecc Mode
        Current                 : Disabled
        Pending                 : Disabled
    ECC Errors
        Volatile
            Single Bit
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
            Double Bit
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
        Aggregate
            Single Bit
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
            Double Bit
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
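
4. The -d flag accepts any comma-separated combination of the sections listed in item 3. For example, a quick health check across all attached GPUs (dropping -i so every card is shown) might look like the following; the particular combination is only an illustration:

# nvidia-smi -q -d UTILIZATION,TEMPERATURE,POWER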

Turning off and on ECC RAM for NVIDIA GP-GPU Cards

From NVIDIA Developer site.

Turn off ECC (C2050 and later). ECC can cost you up to 10% in performance and hurts parallel scaling. Before attempting this, you should verify that your GPUs are working correctly and are not giving ECC errors. You can turn ECC off on Fermi-based cards and later by running the following command for each GPU ID as root, followed by a reboot.

Extensive testing of AMBER on a wide range of hardware has established that ECC has little to no benefit on the reliability of AMBER simulations. This is part of the reason it is acceptable (see recommended hardware) to use the GeForce gaming cards for AMBER simulations.

To turn off the ECC RAM, run:

# nvidia-smi -g 0 --ecc-config=0
(repeat with -g x for each GPU ID)

To turn the ECC RAM back on, run:

# nvidia-smi -g 0 --ecc-config=1
(repeat with -g x for each GPU ID)
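
On a node with several GPUs, a simple shell loop avoids repeating the command by hand. A minimal sketch, assuming GPU IDs 0 and 1 as in the listing earlier (adjust the list for your system), followed by the reboot the instructions call for:

# for id in 0 1; do nvidia-smi -g $id --ecc-config=0; done
# reboot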

Compiling MVAPICH2-1.9 with Intel and CUDA

Step 1: Download MVAPICH2 1.9 from http://mvapich.cse.ohio-state.edu/. The current version at the point of writing is MVAPICH2 1.9.

Step 2: Compile MVAPICH2 with Intel and CUDA.

# tar -zxvf mvapich2-1.9.gz
# cd mvapich2-1.9
# mkdir buildmpi
# cd buildmpi
# ../configure --prefix=/usr/local/mvapich2-1.9-intel-cuda CC=icc CXX=icpc F77=ifort FC=ifort \
--with-cuda=/opt/cuda/ --with-cuda-include=/opt/cuda/include --with-cuda-libpath=/opt/cuda/lib64
# make -j8
# make install
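
To use the new MPI stack, you will probably want its bin and lib directories in your environment. A minimal sketch for ~/.bashrc, assuming the install prefix chosen above:

export PATH=/usr/local/mvapich2-1.9-intel-cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/mvapich2-1.9-intel-cuda/lib:$LD_LIBRARY_PATH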

Installing cmake 2.8 on CentOS 5

If you are intending to compile and install cmake 2.8 on CentOS 5, you have to use a slightly older version; cmake 2.8.10.2 is compatible with CentOS 5. You can get it from this link:

http://www.cmake.org/files/v2.8/cmake-2.8.10.2.tar.gz

The most recent version, cmake 2.8.11, can only be compiled easily on CentOS 6, unless you wish to fix dependency issues such as GLIBC 2.7.

Compiling cmake 2.8.10.2 is quite easy.

Step 1: You can use the bootstrap script, which installs cmake to the default location, i.e. /usr/local/. If you are using bootstrap:

# tar -zxvf cmake-2.8.10.2.tar.gz
# cd cmake-2.8.10.2
# ./bootstrap
# make
# make install

Alternatively, I use the configure command, which I am more accustomed to:

# tar -zxvf cmake-2.8.10.2.tar.gz
# cd cmake-2.8.10.2
# ./configure --prefix=/usr/local/cmake-2.8.10.2
# make
# make install
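
Since the second method installs to a non-default prefix, you may want to put it on your PATH. A minimal sketch for ~/.bashrc, assuming the prefix above:

export PATH=/usr/local/cmake-2.8.10.2/bin:$PATH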

Compiling OpenMPI 1.7.2 with CUDA and Intel Compilers 13

If you are intending to compile OpenMPI with CUDA support, do note that you have to download the feature series of OpenMPI. The version I used for compiling OpenMPI with CUDA is 1.7.2. The current stable version, OpenMPI 1.6.5, does not have CUDA support.

1. Download and unpack OpenMPI 1.7.2 (features)

# wget http://www.open-mpi.org/software/ompi/v1.7/downloads/openmpi-1.7.2.tar.gz
# tar -zxvf openmpi-1.7.2.tar.gz
# cd openmpi-1.7.2

2. Configure OpenMPI with CUDA support

# ./configure --prefix=/usr/local/openmpi-1.7.2-intel-cuda CC=icc CXX=icpc F77=ifort FC=ifort --with-cuda=/opt/cuda --with-cuda-libdir=/usr/lib64
# make -j 8
# make install
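
After installation, you can add the new build to your environment and do a rough check that CUDA support was compiled in. A minimal sketch, assuming the install prefix chosen above; the grep is only a quick sanity check:

export PATH=/usr/local/openmpi-1.7.2-intel-cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/openmpi-1.7.2-intel-cuda/lib:$LD_LIBRARY_PATH

# ompi_info | grep -i cuda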

References:

  1. How do I build Open MPI with support for sending CUDA device memory? (Open MPI FAQ)

Compiling GNU 4.8.1 on CentOS 6

I encountered an error when compiling GCC 4.8.1 on CentOS 6. I have the prerequisites installed (via yum):

  1. gmp 4.3.1-7.el6_2.2
  2. mpfr  2.4.1-6.el6
  3. mpc 0.19-1.el6.rf

But I still encountered the error:

configure: error: Building GCC requires GMP 4.2+, MPFR 2.4.0+ and MPC 0.8.0+. 
Try the --with-gmp, --with-mpfr and/or --with-mpc options to specify their locations.  
Source code for these libraries can be found at their respective hosting sites as well as 
at ftp://gcc.gnu.org/pub/gcc/infrastructure/

To resolve the issue, download the following packages from ftp://gcc.gnu.org/pub/gcc/infrastructure/ and compile them:

  1. gmp-4.3.2.tar.bz2
  2. mpfr-2.4.2.tar.bz2
  3. mpc-0.8.1.tar.gz

1. Install gmp-4.3.2

# bunzip2 gmp-4.3.2.tar.bz2
# tar -xvf gmp-4.3.2.tar
# cd gmp-4.3.2
# ./configure --prefix=/usr/local/gmp-4.3.2
# make
# make install

2. Install mpfr-2.4.2 (requires gmp-4.3.2 as prerequisites)

# bunzip2 mpfr-2.4.2.tar.bz2
# tar -xvf mpfr-2.4.2.tar
# cd mpfr-2.4.2
# ./configure --prefix=/usr/local/mpfr-2.4.2 --with-gmp=/usr/local/gmp-4.3.2/
# make
# make install

3. Install mpc-0.8.1 (requires gmp-4.3.2 and mpfr-2.4.2 as prerequisites )

# tar -zxvf mpc-0.8.1.tar.gz
# cd mpc-0.8.1
# ./configure --prefix=/usr/local/mpc-0.8.1/ --with-gmp=/usr/local/gmp-4.3.2/ --with-mpfr=/usr/local/mpfr-2.4.2
# make
# make install

4. Update your LD_LIBRARY_PATH to reflect /usr/local/mpc-0.8.1/lib. In your .bash_profile, include the following:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/mpc-0.8.1/lib
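
Depending on how the libraries were linked, the gmp and mpfr library directories built in steps 1 and 2 may be needed at run time as well; a minimal sketch, assuming the prefixes above:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/gmp-4.3.2/lib:/usr/local/mpfr-2.4.2/lib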

5. Install glibc-devel.i686. For more information, do look at Error when compiling GCC 4.8.1 (linuxtoolkit.blogspot.com)
6. Finally, install GCC 4.8.1

# tar -zxvf gcc-4.8.1.tar.gz
# cd gcc-4.8.1
# mkdir build-gcc
# cd build-gcc
# ../configure --prefix=/usr/local/gcc-4.8.1 --with-mpfr=/usr/local/mpfr-2.4.2 --with-mpc=/usr/local/mpc-0.8.1 --with-gmp=/usr/local/gmp-4.3.2
# make
# make install
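
Since GCC is installed to a non-default prefix, a quick way to confirm the build and start using it is to check the version from that prefix and, if desired, put it first in your environment. A minimal sketch, assuming the prefix above (on x86_64 the libraries land in lib64):

# /usr/local/gcc-4.8.1/bin/gcc --version

export PATH=/usr/local/gcc-4.8.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/gcc-4.8.1/lib64:$LD_LIBRARY_PATH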

Compiling and Installing Gromacs 4.6.2 with OpenMPI and Intel on CentOS 6

Step 1: Compiling cmake
To compile Gromacs 4.6.2, first you need to compile cmake 2.8 or above. At the point of writing, the current version is cmake 2.8.11.2.

# tar -zxvf cmake-2.8.11.2.tar.gz
# cd cmake-2.8.11.2
# ./configure --prefix=/usr/local/cmake-2.8.11.2
# make
# make install

Step 2: Installing OpenMPI
See Blog Entry Building OpenMPI with Intel Compilers

Step 3: Installing FFTW (Single and Double Precision)
See Blog Entry Compiling and installing FFTW 3.3.3.

Step 4: Compiling Gromacs 4.6.2 (Single-Precision with Intel and OpenMPI)

# tar -zxvf gromacs-4.6.2.tar.gz
# cd gromacs-4.6.2
# mkdir build-cmake
# cd build-cmake
# CC=icc CMAKE_PREFIX_PATH=/usr/local/fftw-3.3.3-single/ /usr/local/cmake-2.8.11.2/bin/cmake .. \
-DCMAKE_INSTALL_PREFIX=/usr/local/gromacs-4.6.2-single-intel -DGMX_X11=OFF -DGMX_MPI=ON \
-DGMX_PREFER_STATIC_LIBS=ON -DBUILD_SHARED_LIBS=OFF
# make
# make install

Step 5: Compiling Gromacs 4.6.2 (Double Precision with Intel and OpenMPI)

# tar -zxvf gromacs-4.6.2.tar.gz
# cd gromacs-4.6.2
# mkdir build-cmake
# cd build-cmake
# CC=icc CMAKE_PREFIX_PATH=/usr/local/fftw-3.3.3-double/ /usr/local/cmake-2.8.11.2/bin/cmake .. \
-DCMAKE_INSTALL_PREFIX=/usr/local/gromacs-4.6.2-double-intel -DGMX_X11=OFF -DGMX_MPI=ON \
-DGMX_PREFER_STATIC_LIBS=ON -DBUILD_SHARED_LIBS=OFF -DGMX_DOUBLE=ON
# make 
# make install
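
To use a build, source the GMXRC script that Gromacs installs under its bin directory; it sets PATH and the related Gromacs environment variables. A minimal sketch, assuming the single-precision prefix from Step 4 (use the double-precision prefix for the Step 5 build):

# source /usr/local/gromacs-4.6.2-single-intel/bin/GMXRC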

For more information,

  1. Compiling Gromacs 4.6 and PLUMED from Source
  2. Gromacs Installation instructions

Compiling and installing FFTW 3.3.3

1. FFTW 3.3.3 (Single Precision)

# ./configure --enable-float --enable-threads
# make
# make install

2. FFTW 3.3.3 (Double Precision)

# ./configure --enable-threads
# make
# make install

3. Important Note.

If you are using FFTW with Gromacs, you may want to compile it without threading and with SSE2 enabled, according to the documentation at Gromacs Installation Instructions:

…..On x86 hardware, compile only with --enable-sse2 (regardless of precision) even if your processors can take advantage of AVX extensions. Since GROMACS uses fairly short transform lengths we do not benefit from the FFTW AVX acceleration, and because of memory system performance limitations, it can even degrade GROMACS performance by around 20%……

3a. FFTW 3.3.3 (Single Precision)

# ./configure --enable-float --enable-sse2 
# make 
# make install

3b. FFTW 3.3.3 (Double Precision)

# ./configure --enable-sse2 
# make 
# make install
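
Note that the configure lines above install to the default prefix (/usr/local). If you intend to feed these builds into the Gromacs recipe earlier in this guide, separate prefixes keep the single- and double-precision libraries apart; the paths below are only an assumption chosen to match that section:

# ./configure --enable-float --enable-sse2 --prefix=/usr/local/fftw-3.3.3-single
# make && make install

# ./configure --enable-sse2 --prefix=/usr/local/fftw-3.3.3-double
# make && make install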

References:

  1. FFTW Home Page
  2. Installation of FFTW (http://www.uvm.edu/~smanchu/fftw_installation.html)