Ganglia and Gmond Python module for GPUs

If you are running a cluster with NVIDIA GPUs, there is now a Python module for monitoring them through Ganglia's gmond. It uses the newly released Python bindings for NVML (NVIDIA Management Library). These bindings are released under the BSD license and give simplified access to GPU metrics such as temperature, memory usage, and utilization.
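Before wiring the module into gmond, the same kind of metrics can be spot-checked from the shell. The sketch below does not use the Python bindings; it swaps in nvidia-smi, which is also built on NVML. It assumes nvidia-smi is on the PATH of the GPU node, and the function name gpu_metrics is hypothetical.

```shell
#!/bin/sh
# gpu_metrics (hypothetical helper): print temperature, memory use and
# utilization for every GPU on this node in CSV form.
gpu_metrics() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; is the NVIDIA driver installed?" >&2
        return 1
    fi
    nvidia-smi \
        --query-gpu=index,name,temperature.gpu,memory.used,memory.total,utilization.gpu \
        --format=csv
}

# Run the query, but do not abort on nodes without a GPU.
gpu_metrics || echo "skipping GPU query" >&2
```

If this prints sensible values, the driver and NVML are working, which is what the gmond module depends on.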

Nvidia Developer – Ganglia Monitoring System

To install the Ganglia plug-in on your Ganglia installation, see these download links:

For more information see:

Acknowledgements:

Compiling VASP.6.3.0 with GPGPU Capability using Nvidia HPC-SDK on Rocky Linux 8.5

This post covers compiling VASP 6.3.0 with GPGPU capability using the Nvidia HPC-SDK. For more information, take a look at VASP – Install VASP.6.X.X

VASP supports several compilers, but this post focuses on the Nvidia HPC-SDK only. Download the NVIDIA HPC-SDK first.

To install the Nvidia HPC SDK, take a look at the HPC SDK Documentation. Start by unpacking the tarball:

% tar -xpzf <tarfile>.tar.gz

If you are using Environment Modules, you may want to use the modulefiles provided with the HPC SDK

% module use /usr/local/nvidia/hpc_sdk/modulefiles

After that, module avail should show something like

------------------- /usr/local/nvidia/hpc_sdk/modulefiles ---------------
nvhpc-byo-compiler/22.5  nvhpc-nompi/22.5  nvhpc/22.5
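Once a module is loaded, it helps to confirm that the compilers really are on the PATH before starting a long build. A minimal sketch: require_tools is a hypothetical helper, and the tool names in the usage comment (nvfortran, nvc, mpif90) are what I would expect nvhpc/22.5 to provide, so adjust them to your installation.

```shell
#!/bin/sh
# require_tools (hypothetical helper): report which of the named commands
# can be found on PATH. Returns non-zero if any of them is missing.
require_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage, after "module load nvhpc/22.5" (tool names are assumptions):
#   require_tools nvfortran nvc mpif90
```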

Untar vasp.6.3.0 and potpaw_PBE.54

% tar -xvf vasp.6.3.0.tar
% tar -xvf potpaw_PBE.54.tar 

From the top-level directory of vasp.6.3.0, copy the makefile template for this toolchain

% cp arch/makefile.include.nvhpc_ompi_mkl_omp_acc ./makefile.include

Load the Nvidia HPC SDK and MKL modules, then compile. MKL is taken from the Intel oneAPI modulefiles here, which is why module use points at the oneAPI directory. Using the oneAPI Intel compilers themselves is not covered in this write-up.

% module use /usr/local/intel/oneapi-2022/modulefiles
% module load nvhpc/22.5
% module load mkl/latest
% make veryclean
% make DEPS=1 -j
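Note that a bare -j tells GNU make to run an unlimited number of jobs at once, which can overwhelm a shared build node. A minimal sketch for choosing a bounded job count instead; the variable name njobs and the fallback value of 4 are arbitrary choices for illustration.

```shell
#!/bin/sh
# Use the number of online CPUs as the job count; fall back to 4 if neither
# nproc nor getconf can report it.
njobs=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 4)

# Show the command rather than running it, so this is safe to try anywhere.
echo "would run: make DEPS=1 -j$njobs"
```

Replace the echo with the real make call once the number looks right for your node.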

If you encounter the following error during make

/usr/local/nvidia/hpc_sdk/Linux_x86_64/22.5/comm_libs/openmpi/openmpi-3.1.5/bin/.bin/mpif90: error while loading shared libraries: libatomic.so.1: cannot open shared object file: No such file or directory

Install libatomic with dnf

% dnf install libatomic -y
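To check whether the library is visible to the dynamic linker, before or after installing it, you can query the linker cache. A sketch assuming a glibc system where ldconfig -p is available (as on Rocky Linux); has_shared_lib is a hypothetical helper name.

```shell
#!/bin/sh
# Locate ldconfig even when /sbin is not on an ordinary user's PATH.
ldconfig_bin=$(command -v ldconfig || echo /sbin/ldconfig)

# has_shared_lib (hypothetical helper): succeed if the linker cache lists a
# library whose name contains the given string.
has_shared_lib() {
    "$ldconfig_bin" -p 2>/dev/null | grep -F -q -- "$1"
}

if has_shared_lib libatomic.so.1; then
    echo "libatomic.so.1 is available"
else
    echo "libatomic.so.1 is missing; install the libatomic package"
fi
```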

Then try compiling again.

References:

  1. Installing VASP.6.X.X

Compiling Gromacs-2019.3 with Intel MKL and CUDA

Prerequisites

GCC-6.5 compilers and associated libraries
m4-1.4.18
mpfr-3.1.4
cmake-3.15.1
gmp-6.1.0
mpc-1.0.3

Intel Compilers and Prerequisites

% source /usr/local/intel/2018u3/bin/compilervars.sh intel64
% source /usr/local/intel/2018u3/impi/2018.3.222/bin64/mpivars.sh intel64
% source /usr/local/intel/2018u3/mkl/bin/mklvars.sh intel64
% source /usr/local/intel/2018u3/parallel_studio_xe_2018/bin/psxevars.sh intel64
% export MKLROOT=/usr/local/intel/2018u3/mkl
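cmake only sees MKLROOT if the variable is exported to child processes, so it can be worth checking that before configuring. A minimal sketch: check_mklroot is a hypothetical helper, and the test for an include directory under MKLROOT is an assumption about the usual MKL layout.

```shell
#!/bin/sh
# check_mklroot (hypothetical helper): verify that MKLROOT is exported and
# points at a directory that looks like an MKL installation.
check_mklroot() {
    # A child shell only sees exported variables, just as cmake would.
    value=$(sh -c 'printf "%s" "${MKLROOT:-}"')
    if [ -z "$value" ]; then
        echo "MKLROOT is not exported" >&2
        return 1
    fi
    if [ ! -d "$value/include" ]; then
        echo "no include directory under $value" >&2
        return 1
    fi
    echo "MKLROOT=$value"
}

check_mklroot || echo "fix MKLROOT before running cmake" >&2
```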

Create a setup file

% touch gromacs_gpgpu.sh

Put the following into gromacs_gpgpu.sh

CC=mpicc CXX=mpicxx cmake .. -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx -DGMX_MPI=on -DGMX_FFT_LIBRARY=mkl \
-DCMAKE_INSTALL_PREFIX=/usr/local/gromacs-2019.3_intel18_mkl_cuda10.1 -DREGRESSIONTEST_DOWNLOAD=ON \
-DCMAKE_C_FLAGS:STRING="-cc=icc -O3 -xHost -ip" \
-DCMAKE_CXX_FLAGS:STRING="-cxx=icpc -O3 -xHost -ip -I/usr/local/intel/2018u3/compilers_and_libraries_2018.3.222/linux/mpi/intel64/include/" \
-DGMX_GPU=on \
-DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-10.1 \
-DCMAKE_BUILD_TYPE=Release \
-DCUDA_HOST_COMPILER:FILEPATH=/usr/local/intel/2018u3/compilers_and_libraries_2018.3.222/linux/bin/intel64/icpc

Make the script executable and run it from a build directory inside the GROMACS source tree (the script calls cmake ..), then build and install

% chmod +x gromacs_gpgpu.sh
% ./gromacs_gpgpu.sh
% make
% make install
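Because the configure line uses cmake .., it has to run from a build directory one level below the top of the source tree. A sketch of a small wrapper that enforces this; run_in_build_dir is a hypothetical helper, and the source path and script location in the usage comment are assumptions, not taken from the steps above.

```shell
#!/bin/sh
# run_in_build_dir (hypothetical helper): create <source>/build if needed and
# run the given command from inside it.
run_in_build_dir() {
    src_dir=$1
    shift
    mkdir -p "$src_dir/build" || return 1
    # Subshell so the caller's working directory is left unchanged.
    ( cd "$src_dir/build" && "$@" )
}

# Usage (assumes the script was saved at the top of the source tree):
#   run_in_build_dir /path/to/gromacs-2019.3 ../gromacs_gpgpu.sh
#   run_in_build_dir /path/to/gromacs-2019.3 make
#   run_in_build_dir /path/to/gromacs-2019.3 make install
```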

Testing and Verification

% source /your/installation/prefix/here/bin/GMXRC
% ./gmxtest.pl all -np 2