Encountering shm_open permission denied issues with hpcx

If you are using NVIDIA HPC-X and encounter an error like the one below during your MPI run:

shm_open(file_name=/ucx_shm_posix_77de2cf3 flags=0xc2) failed: Permission denied

The error message indicates that the process is not permitted to create a shared memory object. In this case the permission of /dev/shm was found to be 755 rather than 777, so users other than root could not write to it. The issue is resolved once the permission is changed to 777. To make the change (as root) and verify it:

% chmod 777 /dev/shm
% ls -ld /dev/shm
drwxrwxrwx 2 root root 40 Jul  6 15:18 /dev/shm
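
Before changing anything, it can help to confirm that the directory mode really is the problem. The following is a minimal sketch, assuming GNU stat (as shipped with Rocky Linux); the function name check_shm_dir is made up for this example and is not part of HPC-X or UCX. Many distributions ship /dev/shm with mode 1777 (world-writable plus the sticky bit), so the check only looks at the world-write bit.

```shell
# check_shm_dir is a hypothetical helper, not part of HPC-X or UCX.
# It succeeds when DIR (default /dev/shm) is writable by "other" users.
check_shm_dir() {
    dir="${1:-/dev/shm}"
    # GNU stat: %a prints the octal mode, for example 755 or 1777
    mode=$(stat -c '%a' "$dir") || return 2
    # The last octal digit holds the permission bits for "other"
    other="${mode#"${mode%?}"}"
    case "$other" in
        2|3|6|7)
            echo "$dir has mode $mode: world-writable"
            return 0
            ;;
        *)
            echo "$dir has mode $mode: not world-writable" >&2
            return 1
            ;;
    esac
}
```

For example, calling check_shm_dir /dev/shm prints the current mode and returns non-zero when a chmod is still needed.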

Installing CP2K with Nvidia HPCX on Rocky Linux 8.5

What is HPCX?

NVIDIA® HPC-X® is a comprehensive software package that includes Message Passing Interface (MPI), Symmetrical Hierarchical Memory (SHMEM) and Partitioned Global Address Space (PGAS) communications libraries, and various acceleration packages. For more information, do take a look at https://developer.nvidia.com/networking/hpc-x

What is CP2K?

CP2K is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systems. CP2K provides a general framework for different modeling methods such as DFT using the mixed Gaussian and plane waves approaches GPW and GAPW. Supported theory levels include DFTB, LDA, GGA, MP2, RPA, semi-empirical methods (AM1, PM3, PM6, RM1, MNDO, …), and classical force fields (AMBER, CHARMM, …). CP2K can do simulations of molecular dynamics, metadynamics, Monte Carlo, Ehrenfest dynamics, vibrational analysis, core level spectroscopy, energy minimisation, and transition state optimization using NEB or dimer method. (Detailed overview of features.). For more information, do take a look at https://www.cp2k.org/

Getting CP2K

git clone --recursive https://github.com/cp2k/cp2k.git cp2k

Unpack HPC-X and its optimised OpenMPI libraries. For more information on installation, take a look at Installing and Loading HPC-X.

Extract hpcx.tbz into your current working directory.

% tar -xvf hpcx.tbz
% cd hpcx
% export HPCX_HOME=$PWD
% module use $HPCX_HOME/modulefiles
% module load hpcx
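
After loading the module, one quick way to confirm that the HPC-X build of OpenMPI is the one being picked up is to check where mpirun resolves. This is a minimal sketch; mpirun_is_under is a hypothetical helper name, and it assumes the module places the HPC-X mpirun somewhere below $HPCX_HOME.

```shell
# mpirun_is_under is a hypothetical helper: succeed when the first mpirun
# found on PATH lives under the given prefix.
mpirun_is_under() {
    prefix="$1"
    found=$(command -v mpirun) || {
        echo "mpirun not found on PATH" >&2
        return 1
    }
    case "$found" in
        "$prefix"/*)
            echo "mpirun: $found"
            return 0
            ;;
        *)
            echo "mpirun resolves to $found, not under $prefix" >&2
            return 1
            ;;
    esac
}

# Example use after "module load hpcx":
#   mpirun_is_under "$HPCX_HOME"
```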

Use the CP2K toolchain, which is the easiest way to compile

% cd cp2k
% cd /usr/local/software/cp2k/tools/toolchain
% ./install_cp2k_toolchain.sh --no-check-certificate --with-openmpi

Compiling CP2K

==================== generating arch files ====================
arch files can be found in the /usr/local/software/cp2k/tools/toolchain/install/arch subdirectory
Wrote /usr/local/software/cp2k/tools/toolchain/install/arch/local.ssmp
Wrote /usr/local/software/cp2k/tools/toolchain/install/arch/local_static.ssmp
Wrote /usr/local/software/cp2k/tools/toolchain/install/arch/local.sdbg
Wrote /usr/local/software/cp2k/tools/toolchain/install/arch/local_coverage.sdbg
Wrote /usr/local/software/cp2k/tools/toolchain/install/arch/local.psmp
Wrote /usr/local/software/cp2k/tools/toolchain/install/arch/local.pdbg
Wrote /usr/local/software/cp2k/tools/toolchain/install/arch/local_static.psmp
Wrote /usr/local/software/cp2k/tools/toolchain/install/arch/local_warn.psmp
Wrote /usr/local/software/cp2k/tools/toolchain/install/arch/local_coverage.pdbg
========================== usage =========================
Now copy:
  cp /usr/local/software/cp2k/tools/toolchain/install/arch/* to the cp2k/arch/ directory
To use the installed tools and libraries and cp2k version
compiled with it you will first need to execute at the prompt:
  source /usr/local/software/cp2k/tools/toolchain/install/setup
To build CP2K you should change directory:
  cd cp2k/
  make -j 80 ARCH=local VERSION="ssmp sdbg psmp pdbg"

Follow the instructions printed at the end of the toolchain output exactly:

% cp /usr/local/software/cp2k/tools/toolchain/install/arch/* /usr/local/software/cp2k/arch
% source /usr/local/software/cp2k/tools/toolchain/install/setup
% cd /usr/local/software/cp2k
% make -j 32 ARCH=local VERSION="ssmp sdbg psmp pdbg"
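
The toolchain output suggests -j 80 while the command above uses -j 32; the right value depends on the machine. As a sketch, the job count can be derived from the number of online CPUs. The helper name build_jobs is hypothetical, and nproc is assumed to be available (it is part of GNU coreutils).

```shell
# build_jobs is a hypothetical helper: print the number of online CPUs,
# capped at the optional first argument.
build_jobs() {
    cap="${1:-0}"
    # nproc is GNU coreutils; getconf is the fallback on other systems
    n=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
    if [ "$cap" -gt 0 ] && [ "$n" -gt "$cap" ]; then
        n="$cap"
    fi
    echo "$n"
}

# Example use, run from the CP2K source directory:
#   make -j "$(build_jobs 32)" ARCH=local VERSION="ssmp sdbg psmp pdbg"
```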

If you encounter a linker error like the one below during the build, install liblsan:

/usr/bin/ld: cannot find /usr/lib64/liblsan.so.0.0.0

% dnf install liblsan -y

If you encounter errors like the ones below for the FFTW libraries:

/usr/bin/ld: cannot find -lfftw3_mpi
collect2: error: ld returned 1 exit status

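Go to the FFTW library directory of the toolchain and create symbolic links so that the linker can resolve -lfftw3_mpi. Note that this only satisfies the linker's file lookup; it does not add the FFTW MPI routines themselves.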

% cd /usr/local/software/cp2k/tools/toolchain/install/fftw-3.3.10/lib
% ln -s libfftw3.a libfftw3_mpi.a
% ln -s libfftw3.la libfftw3_mpi.la

Then try the build again:

% cd /usr/local/software/cp2k
% make -j 32 ARCH=local VERSION="ssmp sdbg psmp pdbg"

If successful, you should see binaries at /usr/local/software/cp2k/exe/local
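
To check the result without reading the directory listing by eye, something like the following could be used. It is a sketch that assumes the binaries are named cp2k.<version> for each version passed to make; check_cp2k_exes is a hypothetical helper name.

```shell
# check_cp2k_exes is a hypothetical helper: report which cp2k.<version>
# binaries exist in DIR for the versions passed to make.
check_cp2k_exes() {
    dir="$1"
    missing=0
    for version in ssmp sdbg psmp pdbg; do
        if [ -x "$dir/cp2k.$version" ]; then
            echo "found   cp2k.$version"
        else
            echo "missing cp2k.$version"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every expected binary is present
    [ "$missing" -eq 0 ]
}

# Example use:
#   check_cp2k_exes /usr/local/software/cp2k/exe/local
```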

Efficient Heterogeneous Parallel Programming Using OpenMP

This article is taken from Intel's “Efficient Heterogeneous Parallel Programming Using OpenMP”, which shows how to do CPU+GPU asynchronous calculations using OpenMP.

In some cases, offloading computations to an accelerator like a GPU means that the host CPU sits idle until the offloaded computations are finished. However, using the CPU and GPU resources simultaneously can improve the performance of an application. In OpenMP® programs that take advantage of heterogeneous parallelism, the master construct can be used to exploit simultaneous CPU and GPU execution.

The Intel® oneAPI DPC++/C++ Compiler was used with the following command-line options:
-O3 -Ofast -xCORE-AVX512 -mprefer-vector-width=512 -ffast-math -qopt-multiple-gather-scatter-by-shuffles -fimf-precision=low
-fiopenmp -fopenmp-targets=spir64="-fp-model=precise"

OpenMP provides true asynchronous, heterogeneous execution on CPU+GPU systems. It’s clear from our timing results and VTune profiles that keeping the CPU and GPU busy in the OpenMP parallel region gives the best performance. We encourage you to try this approach.

Intel: Efficient Heterogeneous Parallel Programming Using OpenMP (Best Practices to Keep the CPU and GPU Working at the Same Time)

Compiling ORCA-4.2.1 with OpenMPI-3.1.4

ORCA is a general-purpose quantum chemistry package that is free of charge for academic users. The project and download website can be found at the ORCA Forum.

You have to register yourself before you can participate in the forum or download ORCA-4.2.1. The current latest version for ORCA is 5.0.3. The package you might want to consider is ORCA 4.2.1, Linux, x86-64, .tar.xz Archive

Prerequisites that I use.

Unpacking ORCA-4.2.1

% tar -xvf orca_4_2_1_linux_x86-64_openmpi314.tar.xz

Running ORCA. If your environment uses Environment Modules, load OpenMPI:

% module load openmpi/3.1.4/gcc-6.5.0

If not, you have to set PATH, LD_LIBRARY_PATH and MANPATH yourself.
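
A minimal sketch of the manual setup, assuming OpenMPI is installed under a single prefix with the usual bin, lib and share/man subdirectories. The helper name use_prefix and the prefix in the example are hypothetical; substitute the location of your own OpenMPI 3.1.4 installation.

```shell
# use_prefix is a hypothetical helper: put an installation prefix at the
# front of PATH, LD_LIBRARY_PATH and MANPATH.
use_prefix() {
    prefix="$1"
    export PATH="$prefix/bin${PATH:+:$PATH}"
    export LD_LIBRARY_PATH="$prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    # Assumption: with man-db, a trailing colon keeps the system default
    # manual search path when MANPATH was previously unset
    export MANPATH="$prefix/share/man:${MANPATH:-}"
}

# Example use (the prefix is an assumed location, not taken from this guide):
#   use_prefix /usr/local/openmpi-3.1.4
```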


Typical Input file

ORCA must be called with its full path:

% /usr/local/orca_4_2_1_linux_x86-64_openmpi314/orca $INPUT "--bind-to core --verbose" > $OUTPUT
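
To avoid retyping the full path for every job, the call can be wrapped in a small function. This is a sketch only: run_orca and the ORCA_BIN variable are hypothetical names, and the convention of naming the output after the input (.inp to .out) is an assumption, not an ORCA requirement.

```shell
# ORCA_BIN defaults to the install location used above; override it if
# ORCA lives elsewhere.
ORCA_BIN="${ORCA_BIN:-/usr/local/orca_4_2_1_linux_x86-64_openmpi314/orca}"

# run_orca is a hypothetical wrapper: run ORCA on INPUT and write the
# output next to it, replacing a trailing .inp with .out.
run_orca() {
    input="$1"
    output="${input%.inp}.out"
    # The quoted string is handed to ORCA as one argument, as in the
    # command above
    "$ORCA_BIN" "$input" "--bind-to core --verbose" > "$output"
}

# Example use:
#   run_orca fac_irppy3.inp
```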

For input file usage, take a look at the ORCA 4.2.1 manual included in the unpacked archive, or read it online at orca_manual_4_2_1.pdf (enea.it).

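For example:

! B3LYP def2-SVP SP
%tddft
  tda false
  nroots 50
  triplets true
end
%pal
  nprocs 32
end

# Geometry from fac_irppy3.xyz; only the first three atoms are listed here, and the block must end with a line containing *
* xyz 0 1
  Ir        0.00000        0.00000        0.03016
   N       -1.05797        1.55546       -1.09121
   N        1.87606        0.13850       -1.09121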

Compiling LAMMPS-15Jun20 with GNU 6 and OpenMPI 3



Download the latest tar.gz from https://lammps.sandia.gov/

Step 1: Untar LAMMPS

% tar -zxvf lammps-stable.tar.gz

Step 2: Go to $LAMMPS_HOME/src. Make Standard Packages

% cd src
% make yes-standard
% make no-gpu
% make no-mscg

Step 3: Compile message libraries

% cd lammps-15Jun20/lib/message/cslib/src
% make lib_parallel zmq=no

Copy and rename the produced cslib/src/libcsmpi.a or libcsnompi.a file to cslib/src/libmessage.a

% cp cslib/src/libcsmpi.a cslib/src/libmessage.a

Copy either lammps-15Jun20/lib/message/Makefile.lammps.zmq or Makefile.lammps.nozmq to lib/message/Makefile.lammps

% cp Makefile.lammps.nozmq Makefile.lammps

Step 4: Compile poems

% cd lammps-15Jun20/lib/poems
% make -f Makefile.g++

Step 5: Compile latte
Download the LATTE code into the lib/latte directory:

% git clone https://github.com/lanl/LATTE

Inside lammps-15Jun20/lib/latte/LATTE
Modify the makefile.CHOICES according to your system architecture and compilers

% cd lammps-15Jun20/lib/latte/LATTE
% cp makefile.CHOICES makefile.CHOICES.gfort
% make
% cd lammps-15Jun20/lib/latte
% ln -s ./LATTE/src includelink
% ln -s ./LATTE liblink
% ln -s ./LATTE/src/latte_c_bind.o filelink.o
% cp Makefile.lammps.gfortran Makefile.lammps

Step 6: Compile Voronoi

Download voro++-0.4.6.tar.gz from http://math.lbl.gov/voro++/download/
Untar the voro++-0.4.6.tar.gz inside lammps-15Jun20/lib/voronoi/

% tar -zxvf voro++-0.4.6.tar.gz
% cd lammps-15Jun20/lib/voronoi/voro++-0.4.6
% make

Step 7: Compile kim

Download kim from https://openkim.org/doc/usage/obtaining-models . The current version is kim-api-2.1.3.txz

Save it in lammps-15Jun20/lib/kim

% cd lammps-15Jun20/lib/kim
% tar Jxvf kim-api-2.1.3.txz
% cd kim-api-2.1.3
% mkdir build
% cd build
% cmake .. -DCMAKE_INSTALL_PREFIX=${PWD}/../../installed-kim-api-2.1.3
% make -j2
% make install
% cd lammps-15Jun20/lib/kim/installed-kim-api-2.1.3/
% source ${PWD}/bin/kim-api-activate
% kim-api-collections-management install system EAM_ErcolessiAdams_1994_Al__MO_324507536345_002

Step 8: Compile USER-COLVARS

% cd lammps-15Jun20/lib/colvars
% make -f Makefile.g++

Step 9: Check Packages Status

% make package-status
Installed YES: package ASPHERE
Installed YES: package BODY
Installed YES: package CLASS2
Installed YES: package COLLOID
Installed YES: package COMPRESS
Installed YES: package CORESHELL
Installed YES: package DIPOLE
Installed  NO: package GPU
Installed YES: package GRANULAR
Installed YES: package KIM
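
With a long package list it is easy to overlook the entries marked NO. As a sketch, the listing can be filtered with awk; missing_packages is a hypothetical helper name, and it assumes the output format shown above.

```shell
# missing_packages is a hypothetical helper: read "make package-status"
# output on stdin and print the names of packages marked "Installed NO".
missing_packages() {
    awk '$1 == "Installed" && $2 == "NO:" && $3 == "package" { print $4 }'
}

# Example use, run from the LAMMPS src directory:
#   make package-status | missing_packages
```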

Step 9a: To Activate Standard Package

% make yes-standard

Step 9b: To activate USER-COLVARS, USER-OMP

% make yes-user-colvars
% make yes-user-omp

Step 9c: To deactivate the GPU package

% make no-gpu

Step 10: Finally Compile LAMMPS

% cd lammps-15Jun20/src
% make g++_openmpi -j 16

You should now have a binary called lmp_g++_openmpi. Create a symbolic link to it:

% ln -s lmp_g++_openmpi lammps

Compiling OpenMPI-3.1.6 with GCC-6.5

We assume that you have installed GNU 6.5 and isl-0.15.

Download the latest OpenMPI 3.1.6 package from the OpenMPI site

% ./configure --prefix=/usr/local/gnu/openmpi-3.1.6 --enable-orterun-prefix-by-default --enable-mpi-cxx --enable-openib-rdmacm-ibaddr --enable-mca-no-build=btl-uct

--enable-orterun-prefix-by-default (makes mpirun behave as if --prefix had been given, so that you do not need to add the prefix option yourself)
--enable-openib-rdmacm-ibaddr (enables routing over IB)
--enable-mpi-cxx (C++ bindings are no longer built by default)
--enable-mca-no-build=btl-uct (recent OpenMPI versions contain a BTL component called ‘uct’, which can cause data corruption when enabled, due to a conflict on malloc hooks between OPAL and UCM)

% make all install | tee install.log
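
In the pipeline above the shell reports the exit status of tee, not of make, so a failed build can go unnoticed unless the log is read. A minimal sketch of an alternative that keeps the log and still returns the status of make; run_logged is a hypothetical helper name.

```shell
# run_logged is a hypothetical helper: run a command, save stdout and
# stderr to LOG, show the last lines, and return the command's own status.
run_logged() {
    log="$1"
    shift
    status=0
    "$@" > "$log" 2>&1 || status=$?
    tail -n 5 "$log"
    return "$status"
}

# Example use:
#   run_logged install.log make all install && echo "build succeeded"
```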


  1. Intel Community – Caught Signal 11 (Segmentation Fault: Does not mapped to object at)
  2. Open MPI + Scalasca: Can not run mpirun command with option --prefix?