I had a casual read of the book “Bash Idioms” by Carl Albing. I scribbled down the lessons that struck me the most. There is much more in the book, so please read the book itself.
Lesson 1: In “A && B”, B is executed only if A succeeds, and the whole expression is true only if both A and B are true.
Example 1: If the cd command succeeds, then execute the “rm -Rv *.tmp” command.
cd tmp && rm -Rv *.tmp
Lesson 2: In “A || B”, B is executed only if A fails; if A succeeds, B is skipped.
Example 2: Change directory; if it fails, print a message that the change of directory failed and exit.
cd /tmp || { echo "cd to /tmp failed" ; exit 1 ; }
Lesson 3: When do we use the [ ] versus [[ ]]?
I learned that the author advises Bash users to use [[ ]] wherever possible. The double bracket avoids confusing edge-case behaviours that the single bracket can exhibit. If, however, the main goal is portability across platforms and non-bash shells, the single bracket may be advisable.
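To illustrate one such edge case, here is a small sketch (the variable name is hypothetical) of how an unquoted, empty variable behaves differently in the two forms:

```shell
# A deliberately unset variable to show the edge case.
unset name

# With single brackets, the unquoted empty variable vanishes during
# expansion, leaving "[ = foo ]", which is a runtime error in test.
if [ $name = "foo" ] 2>/dev/null; then
    echo "single brackets: matched"
else
    echo "single brackets: error or false"
fi

# Double brackets are bash syntax, not an external command, so the
# unquoted empty variable is handled safely.
if [[ $name == "foo" ]]; then
    echo "double brackets: matched"
else
    echo "double brackets: safely false"
fi
```

Quoting the variable ("$name") makes the single-bracket form safe too, but the point of the idiom is that [[ ]] does not force you to remember that.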
LAMMPS is a classical molecular dynamics code with a focus on materials modelling. It is an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. For more information on the software, do take a look at https://www.lammps.org/
You may want to use the modulefiles that come with the HPC-X installation:
export HPCX_HOME=/usr/local/hpcx-v2.15-gcc-MLNX_OFED_LINUX-5-redhat8-cuda12-gdrcopy2-nccl2.17-x86_64
module use $HPCX_HOME/modulefiles
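Once the modulefiles directory is on the module path, the usual next steps are a sketch like the following (the module name `hpcx` is an assumption; it may differ in your HPC-X version, so check `module avail` first):

```shell
# List the modules now visible from the HPC-X modulefiles directory.
module avail

# Load the main HPC-X module (assumed name; verify with module avail).
module load hpcx
```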
Next, I used the following parameters to suit my HPC environment. The default installation is already double precision. I needed MPI, OpenMP and AVX-512.
# ./configure --prefix=/usr/local/fftw-3.3.10 --enable-threads --enable-openmp --enable-mpi --enable-avx512
# make && make install
ORCA is a general-purpose quantum chemistry package that is free of charge for academic users. The project and download website can be found at the ORCA Forum. The current version is 5.0.4.
The prerequisites I used were OpenMPI 4.1.1 and the system GNU compiler, version 8.5.
Unless I have missed something, ORCA 5.0.4 has been split into 3 different packages which you have to untar and combine together:
orca_5_0_4_linux_x86-64_openmpi411_part1
orca_5_0_4_linux_x86-64_openmpi411_part2
orca_5_0_4_linux_x86-64_openmpi411_part3
How do I untar the packages?
The first thing to do is to untar all the packages separately. Assuming you are untarring at /usr/local/:
$ tar -xf orca_5_0_4_linux_x86-64_openmpi411_part1.tar.xz
$ tar -xf orca_5_0_4_linux_x86-64_openmpi411_part2.tar.xz
$ tar -xf orca_5_0_4_linux_x86-64_openmpi411_part3.tar.xz
What do I do with all the untarred packages?
Copy all the untarred files into /usr/local/orca-5.0.4.
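The combining step can be sketched as follows, assuming each archive untarred into a directory named after itself (check the actual directory names on your system before copying):

```shell
cd /usr/local
mkdir -p orca-5.0.4

# Merge the three part directories into the final location.
# The directory names below are assumed from the archive names.
for part in orca_5_0_4_linux_x86-64_openmpi411_part1 \
            orca_5_0_4_linux_x86-64_openmpi411_part2 \
            orca_5_0_4_linux_x86-64_openmpi411_part3 ; do
    cp -a "$part"/. orca-5.0.4/
done
```

The `"$part"/.` form copies the contents of each directory (including hidden files) rather than the directory itself, so the three parts merge into one tree.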
If you are not using Environment Modules, you can consider installing it. For more information, do take a look at Installing Environment Modules on Rocky Linux 8.5. All you then need to do is load the additional modules, such as OpenMPI, as prerequisites. Alternatively, you can set the PATH and LD_LIBRARY_PATH of OpenMPI to something like this.
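A sketch of that manual alternative, assuming OpenMPI 4.1.1 is installed under /usr/local/openmpi-4.1.1 (the prefixes are assumptions; adjust them to your own paths):

```shell
# Assumed install prefixes; change them to match your system.
export MPI_HOME=/usr/local/openmpi-4.1.1
export PATH=$MPI_HOME/bin:/usr/local/orca-5.0.4:$PATH
export LD_LIBRARY_PATH=$MPI_HOME/lib:/usr/local/orca-5.0.4:$LD_LIBRARY_PATH
```

Put these lines in your shell profile (or a job script) so that both the OpenMPI launcher and the ORCA binaries and libraries are found at run time.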
You may encounter a slow nvidia-smi, where it takes a long time before the information is shown. For my 8 x A40 cards, it took about 26 seconds to initialise.
The slow initialisation is likely due to the driver persistence issue. For more background on the issue, do take a look at Nvidia Driver Persistence. According to the article,
The NVIDIA GPU driver has historically followed Unix design philosophies by only initializing software and hardware state when the user has configured the system to do so. Traditionally, this configuration was done via the X Server and the GPUs were only initialized when the X Server (on behalf of the user) requested that they be enabled. This is very important for the ability to reconfigure the GPUs without a reboot (for example, changing SLI mode or bus settings, especially in the AGP days).
More recently, this has proven to be a problem within compute-only environments, where X is not used and the GPUs are accessed via transient instantiations of the Cuda library. This results in the GPU state being initialized and deinitialized more often than the user truly wants and leads to long load times for each Cuda job, on the order of seconds.
NVIDIA previously provided Persistence Mode to solve this issue. This is a kernel-level solution that can be configured using nvidia-smi. This approach would prevent the kernel module from fully unloading software and hardware state when no user software was using the GPU. However, this approach creates subtle interaction problems with the rest of the system that have made maintenance difficult.
The purpose of the NVIDIA Persistence Daemon is to replace this kernel-level solution with a more robust user-space solution. This enables compute-only environments to more closely resemble the historically typical graphics environments that the NVIDIA GPU driver was designed around.
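Based on the article, a sketch of the two ways to keep the driver state initialised (both need root, and the systemd unit name is an assumption that may vary by distribution):

```shell
# Legacy approach: enable persistence mode on all GPUs via nvidia-smi.
sudo nvidia-smi -pm 1

# Preferred approach: run the user-space NVIDIA Persistence Daemon,
# typically managed by systemd (unit name may differ on your distro).
sudo systemctl enable --now nvidia-persistenced
```

With either in place, nvidia-smi should return almost immediately instead of re-initialising the GPUs on every invocation.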
..... Installing from scratch into /usr/local/software/cp2k/tools/toolchain/install/sirius-7.5.2
for (auto it : unit_cell_.spl_num_paw_atoms()) {
^
/usr/local/software/cp2k/tools/toolchain/build/SIRIUS-7.5.2/src/potential/potential.hpp:710:9: error: expected iteration declaration or initialization
for (auto it : unit_cell_.spl_num_paw_atoms()) {
^~~
/usr/local/software/cp2k/tools/toolchain/build/SIRIUS-7.5.2/src/potential/potential.hpp:717:5: warning: no return statement in function returning non-void [-Wreturn-type]
} .....
The reason is that the SIRIUS package is installed by default. The issue can be avoided if you pass the parameter “--with-sirius=no”.
% cd cp2k
% cd /usr/local/software/cp2k/tools/toolchain
% ./install_cp2k_toolchain.sh --no-check-certificate --with-openmpi --with-sirius=no