Compiling VASP.6.3.0 with GPGPU Capability using Nvidia HPC-SDK on Rocky Linux 8.5

This post walks through compiling VASP with GPGPU capability using the NVIDIA HPC SDK. For more information, do take a look at VASP – Install VASP.6.X.X

VASP supports several compilers, but this blog focuses on the NVIDIA HPC SDK only.

To download and install the NVIDIA HPC SDK, do take a look at the HPC SDK Documentation

% tar -xpzf <tarfile>.tar.gz

If you are using Environment Modules, you may want to use the modulefiles provided with the HPC SDK

% module use /usr/local/nvidia/hpc_sdk/modulefiles

Running module avail should then show something like

------------------- /usr/local/nvidia/hpc_sdk/modulefiles ---------------
nvhpc-byo-compiler/22.5  nvhpc-nompi/22.5  nvhpc/22.5

Untar vasp.6.3.0 and potpaw_PBE.54

% tar -xvf vasp.6.3.0.tar
% tar -xvf potpaw_PBE.54.tar 

At the base of the vasp.6.3.0 installation directory, copy the NVIDIA HPC SDK template to makefile.include

% cp arch/makefile.include.nvhpc_ompi_mkl_omp_acc ./makefile.include

Load the NVIDIA HPC SDK module and compile. The makefile.include copied above links against Intel MKL, so the oneAPI modulefiles directory is added with module use to make the mkl module available. Installing oneAPI itself is not covered in this write-up.

% module use /usr/local/intel/oneapi-2022/modulefiles
% module load nvhpc/22.5
% module load mkl/latest
% make veryclean
% make DEPS=1 -j
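GNU make treats a bare -j as "no limit on the number of parallel jobs", which can oversubscribe a shared build node. The following is a minimal sketch, not part of the original recipe, that caps the job count at the number of available cores. The function name build_jobs and the cap of 16 are hypothetical choices for illustration.

```shell
# Print a bounded job count for make: the number of online cores, capped at $1.
# build_jobs and the default cap of 16 are illustrative, not part of VASP.
build_jobs() {
    bj_cap="${1:-16}"
    # nproc ships with GNU coreutils; fall back to getconf, then to 1.
    bj_cores="$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
    if [ "$bj_cores" -gt "$bj_cap" ]; then
        echo "$bj_cap"
    else
        echo "$bj_cores"
    fi
}

# Only print the resulting command so the sketch is safe to paste.
echo "make DEPS=1 -j$(build_jobs 16)"
```

Copy the printed command, or replace the echo with the command itself, once the job count looks sensible for your machine.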

If you encounter the following error during make

/usr/local/nvidia/hpc_sdk/Linux_x86_64/22.5/comm_libs/openmpi/openmpi-3.1.5/bin/.bin/mpif90: error while loading shared libraries: libatomic.so.1: cannot open shared object file: No such file or directory

You can install libatomic with dnf

% dnf install libatomic -y
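Before re-running make, it may help to confirm that the MPI wrapper now resolves all of its shared libraries. This is a small sketch using ldd. The helper name list_missing_libs is hypothetical, and the MPIF90 path is simply the one from the error message above, so adjust it to your install.

```shell
# Read `ldd` output on stdin and print the names of libraries that did not resolve.
# list_missing_libs is an illustrative helper name.
list_missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Path copied from the error message; it is an assumption for any other install.
MPIF90=/usr/local/nvidia/hpc_sdk/Linux_x86_64/22.5/comm_libs/openmpi/openmpi-3.1.5/bin/.bin/mpif90

# Skip quietly when the wrapper is not installed at that path.
if [ -x "$MPIF90" ]; then
    missing="$(ldd "$MPIF90" | list_missing_libs)"
    if [ -n "$missing" ]; then
        echo "still missing: $missing"
    else
        echo "all shared libraries resolved"
    fi
fi
```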

Try compiling again.
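A default VASP build produces three executables under bin/: vasp_std, vasp_gam and vasp_ncl. As a quick sanity check after make finishes, a sketch like the following reports which of them are present. The function name check_vasp_bins is hypothetical, and it assumes you run it from the vasp.6.3.0 directory.

```shell
# Report which of the three standard VASP executables exist under <root>/bin.
# Returns 0 when all are present and executable, 1 otherwise.
check_vasp_bins() {
    vasp_root="${1:-.}"
    vasp_rc=0
    for vasp_name in vasp_std vasp_gam vasp_ncl; do
        if [ -x "$vasp_root/bin/$vasp_name" ]; then
            echo "found   $vasp_name"
        else
            echo "missing $vasp_name"
            vasp_rc=1
        fi
    done
    return "$vasp_rc"
}

# Assumes the current directory is the vasp.6.3.0 base.
check_vasp_bins . || echo "build incomplete; check the make output"
```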

References:

  1. Installing VASP.6.X.X

Changes to SSH Server on DevCloud

When connecting to the DevCloud for oneAPI, you may see the following warning:

$ ssh devcloud
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ED25519 key sent by the remote host is
SHA256:/Dlip01tdMyRmhMDc870Z4Uk7AancwwoTnbb0EZajK0.
Please contact your system administrator.
Add correct host key in /home/<user_name>/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /home/<user_name>/.ssh/known_hosts:#
Host key for ssh.devcloud.intel.com has changed and you have requested strict checking.
Host key verification failed.
kex_exchange_identification: Connection closed by remote host
Connection closed by UNKNOWN port 65535

Cause:

The oneAPI DevCloud has just been migrated to a new SSH tunnel server, and the SSH server version has been upgraded to OpenSSH_8.2p1. For this reason, the old SSH fingerprint cannot be reused for the new server.

Remediation:

Step 1: Remove the offending fingerprint(s)

Method 1: rename your existing ~/.ssh/known_hosts file to something else, such as ~/.ssh/known_hosts.yymmdd

$ mv ~/.ssh/known_hosts ~/.ssh/known_hosts.220623 

Method 2: remove the offending host SSH fingerprint only:

$ ssh-keygen -R ssh.devcloud.intel.com
# Host ssh.devcloud.intel.com found: line 1
# Host ssh.devcloud.intel.com found: line 2
# Host ssh.devcloud.intel.com found: line 3
/home/<user_name>/.ssh/known_hosts updated.
Original contents retained as /home/<user_name>/.ssh/known_hosts.old

Step 2: reconnect to the DevCloud and accept the new key.

$ ssh devcloud
The authenticity of host 'ssh.devcloud.intel.com (12.229.61.118)' can't be established.
ED25519 key fingerprint is SHA256:/Dlip01tdMyRmhMDc870Z4Uk7AancwwoTnbb0EZajK0.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'ssh.devcloud.intel.com' (ED25519) to the list of known hosts.
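Since a changed host key can also indicate a man-in-the-middle attack, it is worth comparing the fingerprint the server presents against one obtained through a trusted channel before answering yes. The sketch below shows one way to do that comparison. The helper name fingerprint_matches is hypothetical, EXPECTED is simply the value shown in the session above and should be confirmed against the provider's own announcement, and the commented ssh-keyscan line assumes the host is reachable directly on port 22, which may not hold if your devcloud alias goes through a proxy.

```shell
# Succeed if any line of `ssh-keygen -l` output on stdin carries the expected
# fingerprint. Each such line looks like: "<bits> <fingerprint> <comment> (<type>)".
# fingerprint_matches is an illustrative helper name.
fingerprint_matches() {
    awk -v want="$1" '$2 == want { found = 1 } END { exit (found ? 0 : 1) }'
}

# Value copied from the session above; verify it through a trusted channel.
EXPECTED='SHA256:/Dlip01tdMyRmhMDc870Z4Uk7AancwwoTnbb0EZajK0'

# Uncomment to query the live host (assumes direct access on port 22):
# ssh-keyscan -t ed25519 ssh.devcloud.intel.com 2>/dev/null \
#     | ssh-keygen -lf - \
#     | fingerprint_matches "$EXPECTED" \
#     && echo "fingerprint matches" \
#     || echo "MISMATCH: do not accept the key"
```

Note that ssh-keyscan only reports what the network returns, so the check is only as trustworthy as the source of EXPECTED.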