According to the HPL website:
HPL is a software package that solves a (random) dense linear system in double precision (64 bits) arithmetic on distributed-memory computers. It can thus be regarded as a portable as well as freely available implementation of the High Performance Computing Linpack Benchmark.
The algorithm used by HPL can be summarized by the following keywords: Two-dimensional block-cyclic data distribution – Right-looking variant of the LU factorization with row partial pivoting featuring multiple look-ahead depths – Recursive panel factorization with pivot search and column broadcast combined – Various virtual panel broadcast topologies – bandwidth reducing swap-broadcast algorithm – backward substitution with look-ahead of depth 1.
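To make the first keyword concrete, here is a minimal Python sketch of a two-dimensional block-cyclic mapping. It assumes the usual convention that block (i, j) of the matrix goes to process (i mod P, j mod Q), with the first block on process (0, 0). The function names are illustrative and are not part of HPL. The values N=35, NB=4, P=4, Q=1 are the ones that appear in the sample output at the end of this post.

```python
def owner(row, col, nb, p, q):
    """Return the (process_row, process_col) owning global entry (row, col).

    Assumes NB x NB blocks dealt out cyclically over a P x Q grid,
    starting at process (0, 0).
    """
    return (row // nb) % p, (col // nb) % q


def local_rows(n, nb, p, my_row):
    """Count how many of the n global rows land on process row my_row."""
    return sum(1 for r in range(n) if (r // nb) % p == my_row)


if __name__ == "__main__":
    n, nb, p, q = 35, 4, 4, 1  # values taken from the sample output
    for r in (0, 3, 4, 15, 16, 34):
        print("row", r, "-> process", owner(r, 0, nb, p, q))
    print("rows per process row:", [local_rows(n, nb, p, pr) for pr in range(p)])
```

With these values, process row 0 holds 11 rows and the other three hold 8 each, which shows how the cyclic dealing keeps the load roughly even.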
1. Requirements: an MPI implementation and a BLAS library (this walkthrough uses OpenMPI and ATLAS).
2. To install BLAS, LAPACK and OpenMPI, see:
- Building BLAS Library using Intel and GNU Compiler
- Building LAPACK 3.4 with Intel and GNU Compiler
- Building OpenMPI with Intel Compilers
- Compiling ATLAS on CentOS 5
3. Download the latest HPL (hpl-2.1.tar.gz) from http://www.netlib.org and unpack it in your home directory, so that the source tree is at $(HOME)/hpl-2.1
4. Copy the Make.Linux_PII_CBLAS file from $(HOME)/hpl-2.1/setup/ to $(HOME)/hpl-2.1/
5. Edit the Make.Linux_PII_CBLAS file
# vim ~/hpl-2.1/Make.Linux_PII_CBLAS
# ----------------------------------------------------------------------
# - shell --------------------------------------------------------------
# ----------------------------------------------------------------------
#
SHELL        = /bin/sh
#
CD           = cd
CP           = cp
LN_S         = ln -s
MKDIR        = mkdir
RM           = /bin/rm -f
TOUCH        = touch
#
# ----------------------------------------------------------------------
# - Platform identifier ------------------------------------------------
# ----------------------------------------------------------------------
#
ARCH         = Linux_PII_CBLAS
#
# ----------------------------------------------------------------------
# - HPL Directory Structure / HPL library ------------------------------
# ----------------------------------------------------------------------
#
TOPdir       = $(HOME)/hpl-2.1
INCdir       = $(TOPdir)/include
BINdir       = $(TOPdir)/bin/$(ARCH)
LIBdir       = $(TOPdir)/lib/$(ARCH)
#
HPLlib       = $(LIBdir)/libhpl.a
# ----------------------------------------------------------------------
# - Message Passing library (MPI) --------------------------------------
# ----------------------------------------------------------------------
# MPinc tells the C compiler where to find the Message Passing library
# header files, MPlib is defined to be the name of the library to be
# used. The variable MPdir is only used for defining MPinc and MPlib.
#
MPdir        = /usr/local/mpi/intel
MPinc        = -I$(MPdir)/include
MPlib        = $(MPdir)/lib/libmpi.so
#
# ----------------------------------------------------------------------
# - Linear Algebra library (BLAS or VSIPL) -----------------------------
# ----------------------------------------------------------------------
# LAinc tells the C compiler where to find the Linear Algebra library
# header files, LAlib is defined to be the name of the library to be
# used. The variable LAdir is only used for defining LAinc and LAlib.
#
LAdir        = /usr/local/atlas/lib
LAinc        =
LAlib        = $(LAdir)/libcblas.a $(LAdir)/libatlas.a
#
.....
.....
.....
# ----------------------------------------------------------------------
# - Compilers / linkers - Optimization flags ---------------------------
# ----------------------------------------------------------------------
#
CC           = /usr/local/mpi/intel/bin/mpicc
CCNOOPT      = $(HPL_DEFS)
CCFLAGS      = $(HPL_DEFS) -fomit-frame-pointer -O3 -funroll-loops
#
# On some platforms, it is necessary to use the Fortran linker to find
# the Fortran internals used in the BLAS library.
#
LINKER       = /usr/local/mpi/intel/bin/mpicc
LINKFLAGS    = $(CCFLAGS)
#
ARCHIVER     = ar
ARFLAGS      = r
RANLIB       = echo
#
# ----------------------------------------------------------------------
6. Compile HPL (run this from the $(HOME)/hpl-2.1 directory)
# make arch=Linux_PII_CBLAS
Running LINPACK on multiple nodes
$ cd ~/hpl-2.1/bin/Linux_PII_CBLAS
$ mpirun -np 16 --host node1,node2 ./xhpl
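One thing worth checking before launching: the process grid P x Q is set in HPL.dat, and HPL needs at least P x Q MPI processes. The sample output in the next step shows P=4 and Q=1, so as far as I can tell only 4 of the 16 processes requested here take part in that particular test. The HPL tuning notes suggest a grid that is close to square with P no larger than Q. A small Python sketch that lists the grids using every process (the helper names are hypothetical, not part of HPL):

```python
import math


def grid_candidates(nprocs):
    """List all (P, Q) pairs with P * Q == nprocs and P <= Q."""
    return [
        (p, nprocs // p)
        for p in range(1, math.isqrt(nprocs) + 1)
        if nprocs % p == 0
    ]


def squarest_grid(nprocs):
    """Pick the candidate with the smallest Q - P (the most square grid)."""
    return min(grid_candidates(nprocs), key=lambda pq: pq[1] - pq[0])


if __name__ == "__main__":
    nprocs = 16  # matches "mpirun -np 16"
    print("candidates:", grid_candidates(nprocs))
    print("suggested P x Q:", squarest_grid(nprocs))
```

For 16 processes this suggests a 4 x 4 grid. Treat it as a starting point only; the best grid depends on the interconnect and is worth measuring.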
7. The output:
.....
.....
.....
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR00R2R4          35     4     4     1               0.00              4.019e-02
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0108762 ...... PASSED
================================================================================
Finished    864 tests with the following results:
            864 tests completed and passed residual checks,
              0 tests completed and failed residual checks,
              0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------
End of Tests.
================================================================================
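With 864 tests in the output, it is easier to pull out the numbers with a script than to read them by eye. Below is a minimal Python sketch that extracts each result row (T/V, N, NB, P, Q, Time, Gflops) and counts the PASSED and FAILED residual checks. The regular expression is an assumption based on the column layout shown above, and the function names are my own.

```python
import re

# One result row: T/V label, four integers, time in seconds, Gflops.
RESULT_RE = re.compile(
    r"^(W\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+([\d.]+)\s+([\d.eE+-]+)$"
)


def parse_results(text):
    """Return one dict per result row found in xhpl output."""
    rows = []
    for line in text.splitlines():
        m = RESULT_RE.match(line.strip())
        if m:
            tv, n, nb, p, q, secs, gflops = m.groups()
            rows.append({
                "tv": tv, "N": int(n), "NB": int(nb),
                "P": int(p), "Q": int(q),
                "time": float(secs), "gflops": float(gflops),
            })
    return rows


def count_status(text):
    """Return (passed, failed) counts of the residual-check lines."""
    lines = [l for l in text.splitlines() if l.startswith("||Ax-b||")]
    passed = sum(1 for l in lines if l.rstrip().endswith("PASSED"))
    failed = sum(1 for l in lines if l.rstrip().endswith("FAILED"))
    return passed, failed


SAMPLE = """\
T/V                N    NB     P     Q               Time                 Gflops
WR00R2R4          35     4     4     1               0.00              4.019e-02
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0108762 ...... PASSED
"""

if __name__ == "__main__":
    best = max(parse_results(SAMPLE), key=lambda r: r["gflops"])
    print("best run:", best)
    print("passed, failed:", count_status(SAMPLE))
```

To use it on a real run, redirect the xhpl output to a file and pass the file contents to parse_results instead of SAMPLE.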
Is this to be done only on the frontend node?