I had a casual read of the book “Bash Idioms” by Carl Albing. I scribbled down the lessons that struck me the most. There is much more in the book, so please read the book itself.
Lesson 1: In “A && B”, B is executed only if A succeeds, and the whole expression is true only if both A and B are true.
Example 1: If the cd command succeeds, then execute the “rm -Rv *.tmp” command.
cd tmp && rm -Rv *.tmp
Lesson 2: In “A || B”, B is executed only if A fails; if A succeeds, B is skipped.
Example 2: Change directory; if it fails, print a message that the change of directory failed and exit.
cd /tmp || { echo "cd to /tmp failed" ; exit 1 ; }
Lesson 3: When do we use the [ ] versus [[ ]]?
I learned that the author advises Bash users to use [[ ]] wherever possible. The double bracket avoids confusing edge-case behaviours that the single bracket can exhibit. If, however, the main goal is portability across platforms and non-bash shells, the single bracket may be advisable.
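To illustrate one such edge case, here is a small sketch (the variable name is hypothetical) of how an unquoted, empty variable behaves differently in the two forms:

```shell
# A deliberately unset variable to show the edge case.
unset name

# With single brackets, the unquoted empty variable vanishes during
# expansion, leaving "[ = foo ]", which is a runtime error in test.
if [ $name = "foo" ] 2>/dev/null; then
    echo "single brackets: matched"
else
    echo "single brackets: error or false"
fi

# Double brackets are bash syntax, not an external command, so the
# unquoted empty variable is handled safely.
if [[ $name == "foo" ]]; then
    echo "double brackets: matched"
else
    echo "double brackets: safely false"
fi
```

Quoting the variable ("$name") makes the single-bracket form safe too, but the point of the idiom is that [[ ]] does not force you to remember that.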
LAMMPS is a classical molecular dynamics code with a focus on materials modelling. It is an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. For more information on the software, do take a look at https://www.lammps.org/
You may want to use the modulefiles that come with the HPC-X installation:
export HPCX_HOME=/usr/local/hpcx-v2.15-gcc-MLNX_OFED_LINUX-5-redhat8-cuda12-gdrcopy2-nccl2.17-x86_64
module use $HPCX_HOME/modulefiles
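Once the modulefiles directory is on the module path, the usual next steps are a sketch like the following (the module name `hpcx` is an assumption; it may differ in your HPC-X version, so check `module avail` first):

```shell
# List the modules now visible from the HPC-X modulefiles directory.
module avail

# Load the main HPC-X module (assumed name; verify with module avail).
module load hpcx
```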
Next, I used the following parameters to suit my HPC environment. The default installation is already double precision. I needed MPI, OpenMP and AVX-512.
# ./configure --prefix=/usr/local/fftw-3.3.10 --enable-threads --enable-openmp --enable-mpi --enable-avx512
# make && make install
ORCA is a general-purpose quantum chemistry package that is free of charge for academic users. The project and download website can be found at the ORCA Forum. The current version is 5.0.4.
The prerequisites I used were OpenMPI 4.1.1 and the system GNU compiler, version 8.5.
Unless I have missed something, ORCA 5.0.4 has been split into 3 different packages which you have to untar and combine together:
orca_5_0_4_linux_x86-64_openmpi411_part1
orca_5_0_4_linux_x86-64_openmpi411_part2
orca_5_0_4_linux_x86-64_openmpi411_part3
How do I untar the packages?
The first thing to do is to untar all the packages separately. Assuming you are untarring at /usr/local/:
$ tar -xf orca_5_0_4_linux_x86-64_openmpi411_part1.tar.xz
$ tar -xf orca_5_0_4_linux_x86-64_openmpi411_part2.tar.xz
$ tar -xf orca_5_0_4_linux_x86-64_openmpi411_part3.tar.xz
What do I do with all the untarred packages?
Copy all the untarred files into /usr/local/orca-5.0.4.
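The combining step can be sketched as follows, assuming each archive untarred into a directory named after itself (check the actual directory names on your system before copying):

```shell
cd /usr/local
mkdir -p orca-5.0.4

# Merge the three part directories into the final location.
# The directory names below are assumed from the archive names.
for part in orca_5_0_4_linux_x86-64_openmpi411_part1 \
            orca_5_0_4_linux_x86-64_openmpi411_part2 \
            orca_5_0_4_linux_x86-64_openmpi411_part3 ; do
    cp -a "$part"/. orca-5.0.4/
done
```

The `"$part"/.` form copies the contents of each directory (including hidden files) rather than the directory itself, so the three parts merge into one tree.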
If you are not using Environment Modules, you can consider installing it. For more information, do take a look at Installing Environment Modules on Rocky Linux 8.5. All you then need to do is load the additional modules, such as OpenMPI, as prerequisites. Alternatively, you can set the PATH and LD_LIBRARY_PATH of OpenMPI to something like this.
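A sketch of that manual alternative, assuming OpenMPI 4.1.1 is installed under /usr/local/openmpi-4.1.1 (the prefixes are assumptions; adjust them to your own paths):

```shell
# Assumed install prefixes; change them to match your system.
export MPI_HOME=/usr/local/openmpi-4.1.1
export PATH=$MPI_HOME/bin:/usr/local/orca-5.0.4:$PATH
export LD_LIBRARY_PATH=$MPI_HOME/lib:/usr/local/orca-5.0.4:$LD_LIBRARY_PATH
```

Put these lines in your shell profile (or a job script) so that both the OpenMPI launcher and the ORCA binaries and libraries are found at run time.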
You may encounter a slow nvidia-smi, where it takes a long time before the information is shown. For my 8 x A40 cards, it took about 26 seconds to initialise.
The slow initialisation is likely due to the driver persistence issue. For more background on the issue, do take a look at Nvidia Driver Persistence. According to the article,
The NVIDIA GPU driver has historically followed Unix design philosophies by only initializing software and hardware state when the user has configured the system to do so. Traditionally, this configuration was done via the X Server and the GPUs were only initialized when the X Server (on behalf of the user) requested that they be enabled. This is very important for the ability to reconfigure the GPUs without a reboot (for example, changing SLI mode or bus settings, especially in the AGP days).
More recently, this has proven to be a problem within compute-only environments, where X is not used and the GPUs are accessed via transient instantiations of the Cuda library. This results in the GPU state being initialized and deinitialized more often than the user truly wants and leads to long load times for each Cuda job, on the order of seconds.
NVIDIA previously provided Persistence Mode to solve this issue. This is a kernel-level solution that can be configured using nvidia-smi. This approach would prevent the kernel module from fully unloading software and hardware state when no user software was using the GPU. However, this approach creates subtle interaction problems with the rest of the system that have made maintenance difficult.
The purpose of the NVIDIA Persistence Daemon is to replace this kernel-level solution with a more robust user-space solution. This enables compute-only environments to more closely resemble the historically typical graphics environments that the NVIDIA GPU driver was designed around.
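Based on the article, a sketch of the two ways to keep the driver state initialised (both need root, and the systemd unit name is an assumption that may vary by distribution):

```shell
# Legacy approach: enable persistence mode on all GPUs via nvidia-smi.
sudo nvidia-smi -pm 1

# Preferred approach: run the user-space NVIDIA Persistence Daemon,
# typically managed by systemd (unit name may differ on your distro).
sudo systemctl enable --now nvidia-persistenced
```

With either in place, nvidia-smi should return almost immediately instead of re-initialising the GPUs on every invocation.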
..... Installing from scratch into /usr/local/software/cp2k/tools/toolchain/install/sirius-7.5.2
for (auto it : unit_cell_.spl_num_paw_atoms()) {
^
/usr/local/software/cp2k/tools/toolchain/build/SIRIUS-7.5.2/src/potential/potential.hpp:710:9: error: expected iteration declaration or initialization
for (auto it : unit_cell_.spl_num_paw_atoms()) {
^~~
/usr/local/software/cp2k/tools/toolchain/build/SIRIUS-7.5.2/src/potential/potential.hpp:717:5: warning: no return statement in function returning non-void [-Wreturn-type]
} .....
The reason is that the SIRIUS package is installed by default. The issue can be avoided if you pass the parameter “--with-sirius=no”.
% cd cp2k
% cd /usr/local/software/cp2k/tools/toolchain
% ./install_cp2k_toolchain.sh --no-check-certificate --with-openmpi --with-sirius=no