Compiling airss-0.9.1 with GNU

Ab initio random structure searching (AIRSS) is a very simple, yet powerful and highly parallel, approach to structure prediction. For more information, Do take a look at https://airss-docs.github.io/

Installation Guide can be found at https://airss-docs.github.io/getting-started/installation/

 

Installation is quite a breeze. But there are a few additional steps if you have issues.

Step 1: Download airss package

% wget https://www.mtg.msm.cam.ac.uk/files/airss-0.9.1.tgz

Step 2: Load the GNU Compilers (Must be greater than version 5 and above)

Step 3: Compiling the application

% tar -xvf airss-0.9.1.tgz
% cd airss-0.9.1
% make

If you have error such as

% wget -c --timeout=1 --tries=2 -nv https://www.mtg.msm.cam.ac.uk/files/symmol.zip
Read error (Connection timed out) in headers
.....
.....

You may want to download manually and placed the symmol.zip at airss-0.9.1/external/symmol. Run the make again.

% make ; make install ; make neat
% make check
.....
.....
--------------------
Tests run in .check:
--------------------

Running example 1.1 (Crystals):

Al-44343-3411-1 -0.00 7.564 -6.654 8 Al Fm-3m 1
Al-44343-3411-2 -0.00 8.465 0.797 8 Al C2/m 1

Running example 1.2 (Clusters):

Al-44735-1896-1 0.00 615.385 -3.410 13 Al Ih 1
Al-44735-1896-2 0.00 615.385 0.362 13 Al C1 1
.....
.....

Fixing zlib Dependencies Issues for kaldi

What is Kaldi?

Kaldi is a toolkit for speech recognition, intended for use by speech recognition researchers and professionals. Find the code repository at http://github.com/kaldi-asr/kaldi.

Downloading Kaldi

% git clone https://github.com/kaldi-asr/kaldi.git

Checking Dependencies

Before you can compile, you may want to check the required dependencies

% cd kaldi/tools
% vim INSTALL

Running the Dependencies Test

% cd kaldi/tools/extras

Load Intel Compilers and MLK

export MKLROOT=/usr/local/intel_2018/mkl/lib

Run the Check_Dependencies Test

% ./check_dependencies.sh
./check_dependencies.sh: zlib is not installed.
./check_dependencies.sh: Some prerequisites are missing; install them using the command:
sudo yum install zlib-devel

(This error is strange because becuase zlib and zlib-devel has already been installed)

I used instead a compiled zlib. For more information how to compile zlib, see Compile zlib-1.2.8 with Intel-15.0.6

Next CXXFLAGS on your .bashrc.

CXXFLAGS="-I/usr/local/zlib-1.2.11/include"

Source ~/.bashrc once again.

% source ~/.bashrc
[user1@node1 extras]$ ./check_dependencies.sh
./check_dependencies.sh: all OK.

Tools to Show your System Configuration

Tool 1: Display Information about CPU Architecture

[user1@node1 ~]$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz
Stepping: 4
CPU MHz: 3200.000
BogoMIPS: 6400.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 25344K
NUMA node0 CPU(s): 0-7,16-23
NUMA node1 CPU(s): 8-15,24-31
.....
.....

Tool 2: List all PCI devices

[user1@node1 ~]# lspci -t -vv
-+-[0000:d7]-+-00.0-[d8]--
| +-01.0-[d9]--
| +-02.0-[da]--
| +-03.0-[db]--
| +-05.0 Intel Corporation Device 2034
| +-05.2 Intel Corporation Sky Lake-E RAS Configuration Registers
| +-05.4 Intel Corporation Device 2036
| +-0e.0 Intel Corporation Device 2058
| +-0e.1 Intel Corporation Device 2059
| +-0f.0 Intel Corporation Device 2058
| +-0f.1 Intel Corporation Device 2059
| +-10.0 Intel Corporation Device 2058
| +-10.1 Intel Corporation Device 2059
| +-12.0 Intel Corporation Sky Lake-E M3KTI Registers
| +-12.1 Intel Corporation Sky Lake-E M3KTI Registers
| +-12.2 Intel Corporation Sky Lake-E M3KTI Registers
| +-12.4 Intel Corporation Sky Lake-E M3KTI Registers
| +-12.5 Intel Corporation Sky Lake-E M3KTI Registers
| +-15.0 Intel Corporation Sky Lake-E M2PCI Registers
| +-16.0 Intel Corporation Sky Lake-E M2PCI Registers
| +-16.4 Intel Corporation Sky Lake-E M2PCI Registers
| \-17.0 Intel Corporation Sky Lake-E M2PCI Registers
.....
.....

Tool 3: List all PCI devices

[user@node1 ~]# lsblk
sda 8:0 0 1.1T 0 disk
├─sda1 8:1 0 200M 0 part /boot/efi
├─sda2 8:2 0 1G 0 part /boot
└─sda3 8:3 0 1.1T 0 part
├─centos-root 253:0 0 50G 0 lvm /
├─centos-swap 253:1 0 4G 0 lvm [SWAP]
└─centos-home 253:2 0 1T 0 lvm

Tool 4: See flags kernel booted with

[user@node1 ~]
BOOT_IMAGE=/vmlinuz-3.10.0-693.el7.x86_64 root=/dev/mapper/centos-root ro crashkernel=auto rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet LANG=en_US.UTF-8

Tool 5: Display available network interfaces

[root@hpc-gekko1 ~]# ifconfig -a
eno1: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether XX:XX:XX:XX:XX:XX txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
.....
.....
.....

Tool 6 Using dmidecode to find hardware information
See Using dmidecode to find hardware information

Benchmarking Tools for Memory Bandwidth

What is Bandwidth

Bandwidth, is an artificial benchmark primarily for measuring memory bandwidth on x86 and x86_64 based computers, useful for identifying weaknesses in a computer’s memory subsystem, in the bus architecture, in the cache architecture and in the processor itself.

bandwidth also tests some libc functions and, under GNU/Linux, it attempts to test framebuffer memory access speed if the framebuffer device is available.

Prerequisites:

NASM, GNU Compiler Suite

Compiling NASM

Bandwidth-1.94 requires the latest version of NASM.

% tar -xvf nasm-2.15.05.tar.gz
% cd nasm-2.15.05
% ./configure
% make
% make install

You should have nasm binary. Make sure you update $PATH to reflect the path of the nasm binary

Compiling Bandwidth-1.94

% tar -zxvf bandwidth-1.9.4.tar.gz
% cd bandwidth-1.9.4
% make bandwidth64

You should have bandwidth64 binary

Run the Test

% ./bandwidth64
Sequential read (64-bit), size = 256 B, loops = 1132462080, 55292.9 MB/s
Sequential read (64-bit), size = 384 B, loops = 765632322, 56075.0 MB/s
Sequential read (64-bit), size = 512 B, loops = 573833216, 56028.0 MB/s
Sequential read (64-bit), size = 640 B, loops = 457595948, 55857.6 MB/s
Sequential read (64-bit), size = 768 B, loops = 382990923, 56092.5 MB/s
Sequential read (64-bit), size = 896 B, loops = 326929770, 55865.7 MB/s
Sequential read (64-bit), size = 1024 B, loops = 285671424, 55789.1 MB/s
Sequential read (64-bit), size = 1280 B, loops = 229320072, 55973.6 MB/s
Sequential read (64-bit), size = 2 kB, loops = 143425536, 56016.5 MB/s
Sequential read (64-bit), size = 3 kB, loops = 95550030, 55977.6 MB/s
Sequential read (64-bit), size = 4 kB, loops = 71729152, 56036.7 MB/s
Sequential read (64-bit), size = 6 kB, loops = 47510700, 55667.7 MB/s
Sequential read (64-bit), size = 8 kB, loops = 35856384, 56020.1 MB/s
Sequential read (64-bit), size = 12 kB, loops = 23738967, 55631.2 MB/s
Sequential read (64-bit), size = 16 kB, loops = 17666048, 55199.2 MB/s
Sequential read (64-bit), size = 20 kB, loops = 14139216, 55228.2 MB/s
Sequential read (64-bit), size = 24 kB, loops = 11771760, 55178.0 MB/s
Sequential read (64-bit), size = 28 kB, loops = 10097100, 55212.2 MB/s
Sequential read (64-bit), size = 32 kB, loops = 8679424, 54246.3 MB/s
Sequential read (64-bit), size = 34 kB, loops = 7160732, 47543.7 MB/s
Sequential read (64-bit), size = 36 kB, loops = 6404580, 45029.4 MB/s
Sequential read (64-bit), size = 40 kB, loops = 5729724, 44762.0 MB/s
Sequential read (64-bit), size = 48 kB, loops = 4782960, 44837.4 MB/s
Sequential read (64-bit), size = 64 kB, loops = 3603456, 45042.9 MB/s
Sequential read (64-bit), size = 128 kB, loops = 1806848, 45168.2 MB/s
Sequential read (64-bit), size = 192 kB, loops = 1204753, 45175.8 MB/s
Sequential read (64-bit), size = 256 kB, loops = 897792, 44882.4 MB/s
Sequential read (64-bit), size = 320 kB, loops = 711144, 44435.3 MB/s
Sequential read (64-bit), size = 384 kB, loops = 590070, 44254.7 MB/s
Sequential read (64-bit), size = 512 kB, loops = 440064, 43995.8 MB/s
Sequential read (64-bit), size = 768 kB, loops = 285005, 42741.0 MB/s
Sequential read (64-bit), size = 1024 kB, loops = 170048, 34006.4 MB/s
Sequential read (64-bit), size = 1280 kB, loops = 120615, 30152.0 MB/s
Sequential read (64-bit), size = 1536 kB, loops = 91434, 27427.4 MB/s
Sequential read (64-bit), size = 1792 kB, loops = 77688, 27180.4 MB/s
Sequential read (64-bit), size = 2048 kB, loops = 64320, 25722.9 MB/s
Sequential read (64-bit), size = 2304 kB, loops = 56252, 25313.3 MB/s
Sequential read (64-bit), size = 2560 kB, loops = 49550, 24772.9 MB/s
Sequential read (64-bit), size = 2816 kB, loops = 47334, 26023.8 MB/s
Sequential read (64-bit), size = 3072 kB, loops = 41916, 25142.8 MB/s
Sequential read (64-bit), size = 3328 kB, loops = 37525, 24388.1 MB/s
Sequential read (64-bit), size = 3584 kB, loops = 35982, 25184.6 MB/s
Sequential read (64-bit), size = 4096 kB, loops = 31824, 25457.4 MB/s
Sequential read (64-bit), size = 5120 kB, loops = 25128, 25116.7 MB/s
Sequential read (64-bit), size = 6144 kB, loops = 22460, 26948.8 MB/s
Sequential read (64-bit), size = 7168 kB, loops = 18081, 25309.1 MB/s
Sequential read (64-bit), size = 8192 kB, loops = 14952, 23921.5 MB/s
Sequential read (64-bit), size = 9216 kB, loops = 13692, 24642.6 MB/s
Sequential read (64-bit), size = 10240 kB, loops = 12144, 24280.2 MB/s
Sequential read (64-bit), size = 12288 kB, loops = 9465, 22713.4 MB/s
Sequential read (64-bit), size = 14336 kB, loops = 7628, 21357.8 MB/s
Sequential read (64-bit), size = 15360 kB, loops = 6580, 19735.0 MB/s
Sequential read (64-bit), size = 16384 kB, loops = 6068, 19413.2 MB/s
Sequential read (64-bit), size = 20480 kB, loops = 3636, 14541.5 MB/s
Sequential read (64-bit), size = 21504 kB, loops = 3741, 15711.6 MB/s
Sequential read (64-bit), size = 32768 kB, loops = 1266, 8102.1 MB/s
Sequential read (64-bit), size = 49152 kB, loops = 900, 8640.0 MB/s
Sequential read (64-bit), size = 65536 kB, loops = 566, 7238.3 MB/s
Sequential read (64-bit), size = 73728 kB, loops = 609, 8765.8 MB/s
Sequential read (64-bit), size = 98304 kB, loops = 455, 8726.8 MB/s
Sequential read (64-bit), size = 131072 kB, loops = 331, 8461.2 MB/s

There is an interesting collection of commentaries at https://zsmith.co/bandwidth.php

Installing Julia Programming Language on CentOS 7

Julia is a high-level, high-performance dynamic language for technical computing. The main homepage for Julia can be found at julialang.org.

Step 1: The latest Julia Programming language binary can be found at downloaded at

% wget https://julialang-s3.julialang.org/bin/linux/x64/1.5/julia-1.5.3-linux-x86_64.tar.gz

Step 2: Extract the Julia Tar GZ

% tar -zxvf julia-1.5.3-linux-x86_64.tar.gz

Step 3: Move the unpacked folder to /usr/local

% mv julia-1.5.3 /usr/local

Try Julia

Compiling Quantum ESPRESSO-6.7.0 with BEEF and Intel MPI 2018 on CentOS 7

Step 1: Download Quantum ESPRESSO 6.5.0 from Quantum ESPRESSO v. 6.7

% tar -zxvf q-e-qe-6.7.0.tar.gz

Step 2: Remember to source the Intel Compilers and indicate MKLROOT in your .bashrc

source /usr/local/intel/2018u3/mkl/bin/mklvars.sh intel64
source /usr/local/intel/2018u3/parallel_studio_xe_2018/bin/psxevars.sh intel64
source /usr/local/intel/2018u3/compilers_and_libraries/linux/bin/compilervars.sh intel64
source /usr/local/intel/2018u3/impi/2018.3.222/bin64/mpivars.sh intel64

Step 3a: Install libbeef

  1. Installing libbeef with Intel Compilers

Step 3b: Make a file call setup.sh and copy the contents inside

export F90=mpiifort
export F77=mpiifort
export MPIF90=mpiifort
export CC=mpiicc
export CPP="icc -E"
export CFLAGS=$FCFLAGS
export AR=xiar
#export LDFLAGS="-L/usr/local/intel/2018u3/mkl/lib/intel64"
#export CPPLAGS="-I/usr/local/intel/2018u3/mkl/include/fftw"
export BEEF_LIBS="-L/usr/local/libbeef-0.1.3/lib -lbeef"
export BLAS_LIBS="-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core"
export LAPACK_LIBS="-lmkl_blacs_intelmpi_lp64"
export SCALAPACK_LIBS="-lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64"
#export FFT_LIBS="-lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_scalapack_lp64"
./configure --enable-parallel --enable-openmp --enable-shared --with-scalapack=intel --prefix=/usr/local/espresso-6.7.0 | tee Configure.out
% ./setup.sh
% make all
% make install

(I prefer make all without the “make all -j 16” so I can see all the errors and trapped seriously)

If successful, you will see in your bin folder

Using qperf to measure network bandwidth and latency

qperf is a network bandwidth and latency measurement tool which works over many transports including TCP/IP, RDMA, UDP and SCTP

Step 1: Installing qperf on both Server and Client

Server> # yum install qperf
Client> # yum install qperf

Step 2: Open Up the Firewall on the Server

# firewall-cmd --permanent --add-port=19765-19766/tcp
# firewall-cmd --reload

The Server listens at TCP Port 19765 by default. Do note that once qperf makes a connection, it will create a control port and data port , the default data port is 19765 but we also need to enable a data port to run test which can be 19766.

Step 3a: Have the Server listen (as qperf server)

Server> $ qperf

Step 4: Connect to qperf Server with qperf Client and measure bandwidth

Client > $ qperf -ip 19766 -t 60 qperf_server_ip_address tcp_bw
tcp_bw:
    bw  =  2.52 GB/sec

Step 5: Connect to qperf Server with qperf Client and measure latency

qperf -vvs qperf_server_ip_address tcp_lat
tcp_lat:
latency = 20.7 us
msg_rate = 48.1 K/sec
loc_send_bytes = 48.2 KB
loc_recv_bytes = 48.2 KB
loc_send_msgs = 48,196
loc_recv_msgs = 48,196
rem_send_bytes = 48.2 KB
rem_recv_bytes = 48.2 KB
rem_send_msgs = 48,197
rem_recv_msgs = 48,197

References and material Taken from:

  1. How to use qperf to measure network bandwidth and latency performance?