The SPEChpc 2021 Benchmark suite

The full writeup can be found at REAL-WORLD HPC GETS THE BENCHMARK IT DESERVES

While nothing can beat the notoriety of the long-standing LINPACK benchmark, the metric by which supercomputer performance is gauged, there is ample room for a more practical measure. It might not garner the same mainstream headlines as the Top 500 list of the world’s largest systems, but a new benchmark may fill in the gap between real-world and theoretical peak compute performance.

This new high performance computing (HPC) benchmark can come out of the gate with immediate legitimacy because it is from the Standard Performance Evaluation Corporation (SPEC), which has been delivering system benchmark suites since the late 1980s. And it is big news today because the time is right for a more functional, real-world measure, especially one that can adequately address the range of architectures and changes in HPC (from various accelerators to new steps toward mixed precision, for example).

…..

The SPEChpc 2021 suite includes a broad swath of science and engineering codes that are representative (and portable) across much of what we see in HPC.

– A tested set of benchmarks with performance measurement and validation built into the test harness.
– Benchmarks include full and mini applications covering a wide range of scientific domains and Fortran/C/C++ programming languages.
– Comprehensive support for multiple programming models, including MPI, MPI+OpenACC, MPI+OpenMP, and MPI+OpenMP with target offload.
– Support for most major compilers, MPI libraries, and different flavors of Linux operating systems.
– Four suites, Tiny, Small, Medium, and Large, with increasing workload sizes, allow for appropriate evaluation of different-sized HPC systems, ranging from a single node to many thousands of nodes.

REAL-WORLD HPC GETS THE BENCHMARK IT DESERVES at The Next Platform

For more information, see https://www.spec.org/hpc2021/

RELION – Performance Benchmark and Profiling

What is RELION?

RELION (REgularized LIkelihood OptimizatioN) is an open-source program for the refinement of macromolecular structures by single-particle analysis of electron cryomicroscopy (cryo-EM) data.

RELION implements an empirical Bayesian approach for the analysis of cryo-EM data.

RELION provides methods for refining single or multiple 3D reconstructions as well as 2D class averages.

RELION is an important tool in the study of living cells.

HPC-AI Advisory Council

Performance Analysis Summary

(From the article RELION – Performance Benchmark and Profiling)

RELION performance testing

  • Pool sizes 4, 8, 16 gave the best performance on 16, 24, 32 nodes
  • SHARP In-Network Computing reduces MPI time by 13% and increases overall application performance by 5%
  • The performance advantage increases with system size; up to 32 nodes were tested

RELION Profile

  • Rank #0 does not perform computation
  • Mostly MPI_Barrier (70%)
  • Ring communication matrix

Benchmarking Tools for Memory Bandwidth

What is Bandwidth?

Bandwidth is an artificial benchmark primarily for measuring memory bandwidth on x86 and x86_64 based computers. It is useful for identifying weaknesses in a computer’s memory subsystem, in the bus architecture, in the cache architecture, and in the processor itself.

bandwidth also tests some libc functions and, under GNU/Linux, it attempts to test framebuffer memory access speed if the framebuffer device is available.

Prerequisites:

NASM, GNU Compiler Collection (GCC)

Compiling NASM

Bandwidth 1.9.4 requires a recent version of NASM; version 2.15.05 is used here.

% tar -xvf nasm-2.15.05.tar.gz
% cd nasm-2.15.05
% ./configure
% make
% make install

You should now have the nasm binary. Make sure $PATH includes the directory that contains it.

Compiling Bandwidth 1.9.4

% tar -zxvf bandwidth-1.9.4.tar.gz
% cd bandwidth-1.9.4
% make bandwidth64

You should now have the bandwidth64 binary.

Run the Test

% ./bandwidth64
Sequential read (64-bit), size = 256 B, loops = 1132462080, 55292.9 MB/s
Sequential read (64-bit), size = 384 B, loops = 765632322, 56075.0 MB/s
Sequential read (64-bit), size = 512 B, loops = 573833216, 56028.0 MB/s
Sequential read (64-bit), size = 640 B, loops = 457595948, 55857.6 MB/s
Sequential read (64-bit), size = 768 B, loops = 382990923, 56092.5 MB/s
Sequential read (64-bit), size = 896 B, loops = 326929770, 55865.7 MB/s
Sequential read (64-bit), size = 1024 B, loops = 285671424, 55789.1 MB/s
Sequential read (64-bit), size = 1280 B, loops = 229320072, 55973.6 MB/s
Sequential read (64-bit), size = 2 kB, loops = 143425536, 56016.5 MB/s
Sequential read (64-bit), size = 3 kB, loops = 95550030, 55977.6 MB/s
Sequential read (64-bit), size = 4 kB, loops = 71729152, 56036.7 MB/s
Sequential read (64-bit), size = 6 kB, loops = 47510700, 55667.7 MB/s
Sequential read (64-bit), size = 8 kB, loops = 35856384, 56020.1 MB/s
Sequential read (64-bit), size = 12 kB, loops = 23738967, 55631.2 MB/s
Sequential read (64-bit), size = 16 kB, loops = 17666048, 55199.2 MB/s
Sequential read (64-bit), size = 20 kB, loops = 14139216, 55228.2 MB/s
Sequential read (64-bit), size = 24 kB, loops = 11771760, 55178.0 MB/s
Sequential read (64-bit), size = 28 kB, loops = 10097100, 55212.2 MB/s
Sequential read (64-bit), size = 32 kB, loops = 8679424, 54246.3 MB/s
Sequential read (64-bit), size = 34 kB, loops = 7160732, 47543.7 MB/s
Sequential read (64-bit), size = 36 kB, loops = 6404580, 45029.4 MB/s
Sequential read (64-bit), size = 40 kB, loops = 5729724, 44762.0 MB/s
Sequential read (64-bit), size = 48 kB, loops = 4782960, 44837.4 MB/s
Sequential read (64-bit), size = 64 kB, loops = 3603456, 45042.9 MB/s
Sequential read (64-bit), size = 128 kB, loops = 1806848, 45168.2 MB/s
Sequential read (64-bit), size = 192 kB, loops = 1204753, 45175.8 MB/s
Sequential read (64-bit), size = 256 kB, loops = 897792, 44882.4 MB/s
Sequential read (64-bit), size = 320 kB, loops = 711144, 44435.3 MB/s
Sequential read (64-bit), size = 384 kB, loops = 590070, 44254.7 MB/s
Sequential read (64-bit), size = 512 kB, loops = 440064, 43995.8 MB/s
Sequential read (64-bit), size = 768 kB, loops = 285005, 42741.0 MB/s
Sequential read (64-bit), size = 1024 kB, loops = 170048, 34006.4 MB/s
Sequential read (64-bit), size = 1280 kB, loops = 120615, 30152.0 MB/s
Sequential read (64-bit), size = 1536 kB, loops = 91434, 27427.4 MB/s
Sequential read (64-bit), size = 1792 kB, loops = 77688, 27180.4 MB/s
Sequential read (64-bit), size = 2048 kB, loops = 64320, 25722.9 MB/s
Sequential read (64-bit), size = 2304 kB, loops = 56252, 25313.3 MB/s
Sequential read (64-bit), size = 2560 kB, loops = 49550, 24772.9 MB/s
Sequential read (64-bit), size = 2816 kB, loops = 47334, 26023.8 MB/s
Sequential read (64-bit), size = 3072 kB, loops = 41916, 25142.8 MB/s
Sequential read (64-bit), size = 3328 kB, loops = 37525, 24388.1 MB/s
Sequential read (64-bit), size = 3584 kB, loops = 35982, 25184.6 MB/s
Sequential read (64-bit), size = 4096 kB, loops = 31824, 25457.4 MB/s
Sequential read (64-bit), size = 5120 kB, loops = 25128, 25116.7 MB/s
Sequential read (64-bit), size = 6144 kB, loops = 22460, 26948.8 MB/s
Sequential read (64-bit), size = 7168 kB, loops = 18081, 25309.1 MB/s
Sequential read (64-bit), size = 8192 kB, loops = 14952, 23921.5 MB/s
Sequential read (64-bit), size = 9216 kB, loops = 13692, 24642.6 MB/s
Sequential read (64-bit), size = 10240 kB, loops = 12144, 24280.2 MB/s
Sequential read (64-bit), size = 12288 kB, loops = 9465, 22713.4 MB/s
Sequential read (64-bit), size = 14336 kB, loops = 7628, 21357.8 MB/s
Sequential read (64-bit), size = 15360 kB, loops = 6580, 19735.0 MB/s
Sequential read (64-bit), size = 16384 kB, loops = 6068, 19413.2 MB/s
Sequential read (64-bit), size = 20480 kB, loops = 3636, 14541.5 MB/s
Sequential read (64-bit), size = 21504 kB, loops = 3741, 15711.6 MB/s
Sequential read (64-bit), size = 32768 kB, loops = 1266, 8102.1 MB/s
Sequential read (64-bit), size = 49152 kB, loops = 900, 8640.0 MB/s
Sequential read (64-bit), size = 65536 kB, loops = 566, 7238.3 MB/s
Sequential read (64-bit), size = 73728 kB, loops = 609, 8765.8 MB/s
Sequential read (64-bit), size = 98304 kB, loops = 455, 8726.8 MB/s
Sequential read (64-bit), size = 131072 kB, loops = 331, 8461.2 MB/s
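
The throughput drop-offs in the run above are easier to spot programmatically. The following is a minimal Python sketch and is not part of bandwidth itself: the names parse_bandwidth and find_dropoffs and the 10% threshold are hypothetical choices for illustration, and it assumes that "kB" in the output means 1024 bytes (consistent with the 1280 B row being followed by 2 kB). The sample lines are copied from the output above. Reading each drop as a cache-level boundary is an interpretation, since the CPU used for this run is not stated here.

```python
import re

# Matches rows such as:
#   Sequential read (64-bit), size = 32 kB, loops = 8679424, 54246.3 MB/s
LINE_RE = re.compile(r"size = (\d+) (B|kB), loops = \d+, ([\d.]+) MB/s")


def parse_bandwidth(text):
    """Return a list of (size_in_bytes, mb_per_s) tuples, in output order."""
    rows = []
    for match in LINE_RE.finditer(text):
        size, unit, rate = match.groups()
        # Assumption: bandwidth reports kB as 1024 bytes.
        size_bytes = int(size) * (1024 if unit == "kB" else 1)
        rows.append((size_bytes, float(rate)))
    return rows


def find_dropoffs(rows, threshold=0.10):
    """Return (prev_size, size, prev_rate, rate) for each row whose rate is
    more than `threshold` (as a fraction) below the previous row's rate."""
    drops = []
    for (prev_size, prev_rate), (size, rate) in zip(rows, rows[1:]):
        if rate < prev_rate * (1.0 - threshold):
            drops.append((prev_size, size, prev_rate, rate))
    return drops


# A few rows copied from the run above (not the full output).
SAMPLE = """\
Sequential read (64-bit), size = 28 kB, loops = 10097100, 55212.2 MB/s
Sequential read (64-bit), size = 32 kB, loops = 8679424, 54246.3 MB/s
Sequential read (64-bit), size = 34 kB, loops = 7160732, 47543.7 MB/s
Sequential read (64-bit), size = 36 kB, loops = 6404580, 45029.4 MB/s
Sequential read (64-bit), size = 768 kB, loops = 285005, 42741.0 MB/s
Sequential read (64-bit), size = 1024 kB, loops = 170048, 34006.4 MB/s
Sequential read (64-bit), size = 1280 kB, loops = 120615, 30152.0 MB/s
Sequential read (64-bit), size = 1536 kB, loops = 91434, 27427.4 MB/s
"""

if __name__ == "__main__":
    for prev_size, size, prev_rate, rate in find_dropoffs(parse_bandwidth(SAMPLE)):
        print(f"{prev_size // 1024} kB -> {size // 1024} kB: "
              f"{prev_rate:.1f} -> {rate:.1f} MB/s")
```

To analyze a complete run, pass the saved text of the full output to parse_bandwidth in place of SAMPLE. The threshold is a tuning knob: a smaller value flags gentler slopes, a larger one keeps only the sharpest steps.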

There is an interesting collection of commentaries at https://zsmith.co/bandwidth.php