This is a continuation of the article Testing the Infiniband Interconnect Performance with Intel MPI Benchmark (Part I)
B. Running IMB
After “make” completes, the IMB-MPI1 executable is left in the source directory. Run IMB-MPI1 pingpong from the management node or head node, and make sure IMB-MPI1 is present in that directory. Note that pingpong itself only exercises two processes; any additional ranks simply wait in MPI_Barrier, as the output below shows.
# cd /home/hpc/imb/src
# mpirun -np 16 -host node1,node2 /home/hpc/imb/src/IMB-MPI1 pingpong
# mpirun -np 16 -host node1,node2 /home/hpc/imb/src/IMB-MPI1 sendrecv
# mpirun -np 16 -host node1,node2 /home/hpc/imb/src/IMB-MPI1 exchange
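If you have more than a couple of nodes, it can be more convenient to keep the node list in a host file instead of typing it on the command line. A minimal sketch, assuming your mpirun accepts -machinefile (most MPICH-derived and Intel MPI launchers do; the file name hosts.txt is just an example with one node name per line):

# cat /home/hpc/imb/hosts.txt
node1
node2
# mpirun -np 16 -machinefile /home/hpc/imb/hosts.txt /home/hpc/imb/src/IMB-MPI1 pingpong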
Example output from “pingpong”:
benchmarks to run pingpong
#---------------------------------------------------
#    Intel (R) MPI Benchmark Suite V3.2.2, MPI-1 part
#---------------------------------------------------
# Date                  : Mon Feb 7 10:42:48 2011
# Machine               : x86_64
# System                : Linux
# Release               : 2.6.18-164.el5
# Version               : #1 SMP Thu Sep 3 03:28:30 EDT 2009
# MPI Version           : 2.1
# MPI Thread Environment: MPI_THREAD_SINGLE

# New default behavior from Version 3.2 on:
# the number of iterations per message size is cut down
# dynamically when a certain run time (per message size sample)
# is expected to be exceeded. Time limit is defined by variable
# "SECS_PER_SAMPLE" (=> IMB_settings.h)
# or through the flag => -time

# Calling sequence was:
# /home/shared-rpm/imb/src/IMB-MPI1 pingpong

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM
#
#

# List of Benchmarks to run:

# PingPong

#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
# ( 46 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000         8.74         0.00
            1         1000         8.82         0.11
            2         1000         8.83         0.22
            4         1000         8.89         0.43
            8         1000         8.90         0.86
           16         1000         8.99         1.70
           32         1000         9.00         3.39
           64         1000        10.32         5.91
          128         1000        10.52        11.60
          256         1000        11.24        21.72
          512         1000        12.12        40.30
         1024         1000        13.76        70.98
         2048         1000        15.55       125.59
         4096         1000        17.81       219.35
         8192         1000        22.47       347.67
        16384         1000        45.24       345.41
        32768         1000        59.83       522.29
        65536          640        87.68       712.85
       131072          320       154.80       807.47
       262144          160       312.87       799.05
       524288           80       556.20       898.96
      1048576           40      1078.94       926.84
      2097152           20      2151.90       929.41
      4194304           10      4256.70       939.69

# All processes entering MPI_Finalize
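To keep a record of the results, you can redirect the benchmark output to a file and pull out the bandwidth column with standard tools. A minimal sketch using awk; the log file name run-pingpong.log is just an example, and the second command simply prints the bandwidth reported for the largest (4194304-byte) message, which is usually close to the sustained peak of the link:

# mpirun -np 16 -host node1,node2 /home/hpc/imb/src/IMB-MPI1 pingpong > /home/hpc/imb/run-pingpong.log 2>&1
# awk '$1 == 4194304 {print "Largest-message bandwidth: " $4 " MB/s"}' /home/hpc/imb/run-pingpong.log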
If you wish to use Torque to run IMB, read the IBM article “Setting up an HPC cluster with Red Hat Enterprise Linux”.
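For reference, a Torque submission for the same pingpong run might look like the sketch below. The job name, node count, walltime, and paths are assumptions for illustration; $PBS_NODEFILE is the node list Torque generates for the job, $PBS_O_WORKDIR is the directory the job was submitted from, and the script is submitted with qsub.

# cat imb-pingpong.pbs
#!/bin/bash
#PBS -N imb-pingpong
#PBS -l nodes=2:ppn=8
#PBS -l walltime=00:30:00
cd $PBS_O_WORKDIR
mpirun -np 16 -machinefile $PBS_NODEFILE /home/hpc/imb/src/IMB-MPI1 pingpong
# qsub imb-pingpong.pbs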