Webinar: High Performance GPU Acceleration – Part 1: Code Design

  • Online Registration Here
  • Date: 13th October 2021 9am PDT

Heterogeneous computing comes with the challenge of designing code that can work in multi-processor/accelerator environments. Developers need to be equipped with the right set of metrics to make informed design and optimization decisions that take advantage of target hardware.

In Part 1 of this 2-part webinar series, Technical Consulting Engineer Cory Levels focuses on designing software for efficient offload from CPUs to GPUs—even before final hardware is available—using Intel® Advisor. Using a walkthrough of an ISO 3DFD example (3D isotropic Finite Difference), you will learn how to:

  • Optimize your CPU application for memory and compute
  • Identify efficient GPU offload opportunities and quantify the potential performance speedup
  • See performance headroom of your GPU offloaded code against hardware limitations, and get insights for an effective optimization roadmap

For more information, do take a look at the Intel site here.

Compiling plumed-2.6.4 with Intel 2019

PLUMED is a plugin that works with a large number of molecular dynamics codes (Codes interfaced with PLUMED). It can be used to analyze features of the dynamics on the fly or to perform a wide variety of free-energy methods. PLUMED can also work as a command line tool to perform analysis on trajectories saved in most of the existing formats.

The installation guide can be found at Plumed Installation.

Step 1: Source the Intel compiler environments. At a minimum, the MKL, compiler and MPI environments should be sourced.

% source /usr/local/intel/2019u5/mkl/bin/mklvars.sh intel64
% source /usr/local/intel/2019u5/compilers_and_libraries/linux/bin/compilervars.sh intel64
% source /usr/local/intel/2019u5/impi/2019.5.281/intel64/bin/mpivars.sh intel64

Step 2: Download and untar the PLUMED code. For Plumed-2.6.4, you can download from https://github.com/plumed/plumed2/releases/tag/v2.6.4

Step 3: Compile the Codes.

% ./configure --prefix=/usr/local/plumed2-2.6.4_i2019 CC=mpiicc CXX=mpiicpc CXXFLAGS=-O3 --enable-mpi --disable-xdrfile LDFLAGS=-L/usr/local/intel/2019u5/mkl/lib/intel64  CPPFLAGS=-I/usr/local/intel/2019u5/mkl/include
% make -j 4
% make install
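After make install, the binaries and libraries are not yet on the standard search paths. A minimal sketch of the environment setup, assuming the install prefix used above (PLUMED_KERNEL is the variable that MD codes patched with PLUMED consult to locate the kernel library at runtime):

```shell
# Paths assume the --prefix used in the configure step; adjust to your install.
PLUMED_PREFIX=/usr/local/plumed2-2.6.4_i2019
export PATH="$PLUMED_PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$PLUMED_PREFIX/lib:$LD_LIBRARY_PATH"
# Runtime-loaded MD codes look for the kernel library via PLUMED_KERNEL.
export PLUMED_KERNEL="$PLUMED_PREFIX/lib/libplumedKernel.so"
```

With these set, `plumed --help` should run, and patched MD codes can find the kernel at job time.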

Compiling ANTs with GNU-6.5

What is Advanced Normalization Tools?

ANTs is a tool for computational neuroanatomy based on medical images. ANTs reads any image type that can be read by ITK (www.itk.org), that is, jpg, tiff, hdr, nii, nii.gz, mha/mhd and more image types as well. For the most part, ANTs will output float images, which you can convert to other types with the ANTs ConvertImagePixelType tool. ImageMath has a bunch of basic utilities such as multiplication and inversion, and many more advanced tools such as computation of the Lipschitz norm of a deformation field. ANTs programs may be called from the command line on almost any platform.

The ANTs project site can be found at GitHub – ANTsX/ANTs: Advanced Normalization Tools (ANTs). Compilation information can be found at Compiling ANTs on Linux and Mac OS · ANTsX/ANTs Wiki · GitHub.

Prerequisites

  • gnu-6.5
  • m4-1.4.18
  • gmp-6.1.0
  • mpfr-3.1.4
  • mpc-1.0.3
  • isl-0.18
  • gsl-2.1
  • cmake-3.21.3

Compiling ANTs is not too difficult if you use their installation script found here

% mkdir /usr/local/ANTs
% cd /usr/local/ANTs
% git clone https://github.com/cookpa/antsInstallExample.git
% cd antsInstallExample
% ./installANTs.sh

Once done, you should see the following in the clone directory:

ANTs  build  install  installANTs.sh

Inside, you can see the install directory, where the bin and lib directories lie.
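Once built, something along these lines makes the tools usable from the shell. ANTSPATH is the variable the ANTs helper scripts conventionally expect; the exact path is an assumption based on the install/ layout above, so adjust it to wherever installANTs.sh placed the tree:

```shell
# Hypothetical path, assuming installANTs.sh produced the install/ tree above.
export ANTSPATH=/usr/local/ANTs/install/bin   # ANTs scripts conventionally read ANTSPATH
export PATH="$ANTSPATH:$PATH"
```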

Intel Unveils Second-Generation Neuromorphic Chip

Various processors and pieces of code are often compared to brains, but neuromorphic chips work to much more directly mimic neurological systems through the use of computational “neurons” that communicate with one another. Intel’s first-generation Loihi chip, introduced in 2017, has around 128,000 of those digital neurons. Over the ensuing four years, Loihi has been packed into increasingly large systems, learned to touch and even been taught to smell.

Now, it’s getting a new family member: Loihi 2. In its press release, Intel said that years of testing with the first-generation Loihi chip helped them to design a second generation with up to ten times the processing speed; up to 15 times greater resource density; and up to a million computational neurons per chip – more than seven times those in the first generation. Intel reports that early tests have shown that Loihi 2 required more than 60 times fewer ops per inference when running deep neural networks as compared to Loihi 1 (without a loss in accuracy).

Intel Unveils Loihi 2, Its Second-Generation Neuromorphic Chip, HPCWire

The C++ compiler does not support C++11 during bootstrap for cmake

What is CMAKE?

CMake is an open-source, cross-platform family of tools designed to build, test and package software. CMake is used to control the software compilation process using simple platform and compiler independent configuration files, and generate native makefiles and workspaces that can be used in the compiler environment of your choice. 

You can download the latest cmake from https://cmake.org/download/

Prerequisites that I use

  • gnu-6.5
  • m4-1.4.18
  • gmp-6.1.0
  • mpfr-3.1.4
  • mpc-1.0.3
  • isl-0.18
  • gsl-2.1

Step 1: You can use bootstrap, which will install CMake to the default location, i.e. /usr/local. If you are using bootstrap:

# tar -zxvf cmake-3.21.3.tar.gz
# cd cmake-3.21.3 
# ./bootstrap 
# make 
# make install

Errors encountered

CMake 3.21.3, Copyright 2000-2021 Kitware, Inc. and Contributors
Found GNU toolchain
C compiler on this system is: gcc
C++ compiler on this system is: g++  -std=gnu++1y
Makefile processor on this system is: gmake
g++ has setenv
g++ has unsetenv
g++ does not have environ in stdlib.h
g++ has stl wstring
g++ has <ext/stdio_filebuf.h>
---------------------------------------------
gmake: Warning: File `Makefile' has modification time 0.15 s in the future
gmake: `cmake' is up to date.
gmake: warning:  Clock skew detected.  Your build may be incomplete.
loading initial cache file /myhome/melvin/Downloads/cmake-3.21.3/Bootstrap.cmk/InitialCacheFlags.cmake
CMake Error at CMakeLists.txt:107 (message):
  The C++ compiler does not support C++11 (e.g.  std::unique_ptr).


-- Configuring incomplete, errors occurred!
See also "/myhome/melvin/Downloads/cmake-3.21.3/CMakeFiles/CMakeOutput.log".
See also "/myhome/melvin/Downloads/cmake-3.21.3/CMakeFiles/CMakeError.log".

Resolutions:

Step 1: You may want to export this before compiling

export CXXFLAGS="-O3"

Step 2: The clock-skew warnings above suggest the build directory's timestamps are out of sync (common on NFS-mounted home directories), so you might want to move the source to a local directory such as /root and try compiling again with root access.

Alternatively, instead of using ./bootstrap, you can use the traditional configure command:

# ./configure --prefix=/usr/local/cmake-3.21.3
# make
# make install
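Either way, once the install succeeds you may want the new cmake on your PATH; the prefix below assumes the configure step above:

```shell
# Put the freshly installed cmake first on PATH (prefix from the configure step).
export PATH="/usr/local/cmake-3.21.3/bin:$PATH"
# cmake --version   # should now report: cmake version 3.21.3
```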

References:

  1. The C++ compiler does not support C++11 (e.g. std::unique_ptr). building OpenWRT
  2. c++11 std::unique_ptr error cmake 3.11.3 bootstrap

Displaying Intel-MPI Debug Information

Detailed information can be found at Displaying MPI Debug Information.

The I_MPI_DEBUG environment variable provides a convenient way to get detailed information about an MPI application at runtime. You can set the variable value from 0 (the default value) to 1000. The higher the value, the more debug information you get.

High values of I_MPI_DEBUG can output a lot of information and significantly reduce performance of your application. A value of I_MPI_DEBUG=5 is generally a good starting point, which provides sufficient information to find common errors.

Displaying MPI Debug Information

To redirect the debug information output from stdout to stderr or a text file, use the I_MPI_DEBUG_OUTPUT environment variable:

$ mpirun -genv I_MPI_DEBUG=5 -genv I_MPI_DEBUG_OUTPUT=debug_output.txt -n 32 ./mpi_program

I_MPI_DEBUG Arguments

<level>: indicates the level of debug information provided.

  • 0: Output no debugging information. This is the default value.
  • 1: Output libfabric* version and provider.
  • 2: Output information about the tuning file used.
  • 3: Output effective MPI rank, pid and node mapping table.
  • 4: Output process pinning information.
  • 5: Output environment variables specific to the Intel® MPI Library.
  • > 5: Add extra levels of debug information.

<flags>: comma-separated list of debug flags.

  • pid: Show process id for each debug message.
  • tid: Show thread id for each debug message for multithreaded library.
  • time: Show time for each debug message.
  • datetime: Show time and date for each debug message.
  • host: Show host name for each debug message.
  • level: Show level for each debug message.
  • scope: Show scope for each debug message.
  • line: Show source line number for each debug message.
  • file: Show source file name for each debug message.
  • nofunc: Do not show routine name.
  • norank: Do not show rank.
  • nousrwarn: Suppress warnings for improper use case (for example, incompatible combination of controls).
  • flock: Synchronize debug output from different process or threads.
  • nobuf: Do not use buffered I/O for debug output.
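The level and flags are combined in a single I_MPI_DEBUG value. For example, to get pinning information with each message tagged by pid and host name, split into per-rank output files (%r in I_MPI_DEBUG_OUTPUT expands to the rank number):

```shell
# Level 4 (process pinning info) plus pid and host tags on every message.
export I_MPI_DEBUG=4,pid,host
# Per-rank output files: %r expands to the rank number.
export I_MPI_DEBUG_OUTPUT=debug_%r.txt
# mpirun -n 32 ./mpi_program   # debug goes to debug_0.txt, debug_1.txt, ...
```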

References:

  1. Displaying MPI Debug Information
  2. Developer Reference: I_MPI_DEBUG

Installing Intel® oneAPI AI Analytics Toolkit

What is included in the Intel oneAPI AI Analytics Toolkit? For more information, do take a look at Intel oneAPI AI Analytics Toolkit

  • Intel® Distribution for Python*
  • Intel® Distribution of Modin* (via Anaconda distribution of the toolkit using the Conda package manager)
  • Intel® Low Precision Optimization Tool
  • Intel® Optimization for PyTorch*
  • Intel® Optimization for TensorFlow*
  • Model Zoo for Intel® Architecture
  • Download size: 2.18 GB
  • Date: August 2, 2021
  • Version: 2021.3

Command Line Installation

wget https://registrationcenter-download.intel.com/akdlm/irc_nas/18040/l_AIKit_p_2021.3.0.1370_offline.sh

sudo bash l_AIKit_p_2021.3.0.1370_offline.sh

Installation Instructions

Step 1: From the console, locate the downloaded install file.
Step 2: Use $ sudo sh ./<installer>.sh to launch the GUI Installer as root.
Optionally, use $ sh ./<installer>.sh to launch the GUI Installer as the current user.
Step 3: Follow the instructions in the installer.
Step 4: Explore the Get Started Guide.
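For scripted deployments, the interactive steps above can be condensed into one unattended command. This is a hedged sketch: the -a --silent --eula accept flags follow Intel's offline-installer documentation, but verify them against your installer version with sh ./<installer>.sh --help before relying on them:

```shell
# Build the unattended install command for the installer downloaded above.
installer=l_AIKit_p_2021.3.0.1370_offline.sh
cmd="sudo bash $installer -a --silent --eula accept"
# eval "$cmd"   # uncomment to actually run the installer
echo "$cmd"
```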

References:

  1. Intel oneAPI AI Analytics Toolkit

Installing Intel OneAPI HPC Toolkit for Linux

What is included in the OneAPI Installer? For more information, do take a look at Get the Intel® oneAPI HPC Toolkit

  • Intel® oneAPI DPC++/C++ Compiler
  • Intel® oneAPI Fortran Compiler
  • Intel® C++ Compiler Classic
  • Intel® Cluster Checker
  • Intel® Inspector
  • Intel® MPI Library
  • Intel® Trace Analyzer and Collector
  • Download size: 1.25 GB
  • Version: 2021.3
  • Date: June 21, 2021

wget https://registrationcenter-download.intel.com/akdlm/irc_nas/17912/l_HPCKit_p_2021.3.0.3230_offline.sh

sudo bash l_HPCKit_p_2021.3.0.3230_offline.sh

Installation Instructions:

  • Step 1: From the console, locate the downloaded install file.
  • Step 2: Use $ sudo sh ./<installer>.sh to launch the GUI Installer as root.
    Optionally, use $ sh ./<installer>.sh to launch the GUI Installer as current user.
  • Step 3: Follow the instructions in the installer.
  • Step 4: Explore the Get Started Guide.
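After installation, none of the compilers or MPI tools are on PATH until the oneAPI environment script is sourced. A minimal sketch, assuming the default system-wide install location /opt/intel/oneapi (a user-level install typically lands in ~/intel/oneapi instead):

```shell
# Default root for a sudo (system-wide) install; adjust for a user install.
ONEAPI_ROOT="${ONEAPI_ROOT:-/opt/intel/oneapi}"
# source "$ONEAPI_ROOT/setvars.sh"   # loads compilers, MPI and tools into the environment
```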

Error: Too many elements extracted from the MEAM Library on LAMMPS

If you encounter an error similar to:

ERROR: Too many elements extracted from MEAM library (current limit:5 ). Increase 'maxelt' in meam.h and recompile. 
Last command: pair_coeff     * * library.alloy2.meam .............................

Go to /usr/local/lammps-29Oct20/src/USER-MEAMC/meam.h and /usr/local/lammps-29Oct20/src/meam.h and edit line 22 in each. The default value is #define maxelt 5; change it to

#define maxelt 6

Recompile LAMMPS. Go to /usr/local/lammps-29Oct20/src:

% make clean-all
% make g++_openmpi -j 16
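If you prefer not to hand-edit, the same maxelt change can be made with sed. The snippet below demonstrates on a temporary copy of the file; to patch the real tree, run the sed line against the two meam.h paths mentioned above instead:

```shell
# Demonstrate the edit on a temp copy; point at the real meam.h files to patch the tree.
SRC=$(mktemp -d)
printf '#define maxelt 5\n' > "$SRC/meam.h"
# The actual edit: bump the element limit from 5 to 6.
sed -i 's/#define maxelt 5/#define maxelt 6/' "$SRC/meam.h"
grep maxelt "$SRC/meam.h"   # → #define maxelt 6
```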

References

  1. Compiling LAMMPS-15Jun20 with GNU 6 and OpenMPI 3

One Hundred Year Study on Artificial Intelligence, or AI100

A newly published report on the state of artificial intelligence says the field has reached a turning point where attention must be paid to the everyday applications and even abuses of AI technology.

“In the past five years, AI has made the leap from something that mostly happens in research labs or other highly controlled settings to something that’s out in society affecting people’s lives,” Brown University computer scientist Michael Littman, who chaired the report panel, said in a news release.

“That’s really exciting, because this technology is doing some amazing things that we could only dream about five or ten years ago,” Littman added. “But at the same time the field is coming to grips with the societal impact of this technology, and I think the next frontier is thinking about ways we can get the benefits from AI while minimizing the risks.”

Those risks include deep-fake images and videos that are used to spread misinformation or harm people’s reputations; online bots that are used to manipulate public opinion; algorithmic bias that infects AI with all-too-human prejudices; and pattern recognition systems that can invade personal privacy by piecing together data from multiple sources.

The report says computer scientists must work more closely with experts in the social sciences, the legal system and law enforcement to reduce those risks.
