Traditional methods for performing data reductions are very costly in terms of latency and CPU cycles. The NVIDIA Quantum InfiniBand switch with NVIDIA SHARP technology addresses complex operations such as data reduction in a simplified, efficient way. By reducing data within the switch network, NVIDIA Quantum switches perform the reduction in a fraction of the time of traditional methods.
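As a rough illustration of how applications typically take advantage of this, SHARP-based in-network reductions are usually enabled through the HCOLL collectives library that ships with HPC-X. The variable name and value below, as well as the placeholder application ./my_allreduce_app, are assumptions to verify against the SHARP deployment guide for your release:
% mpirun -np 128 -x HCOLL_ENABLE_SHARP=3 ./my_allreduce_app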
Basic Commands for Mellanox Network Switches for Break-Out Ports
More information can be found in the Command Line Interface (CLI) documentation.
Step 1: Configure the break-out ports
> enable
# configure terminal
# interface ethernet ?
R2-R8-LEAF01 [standalone: master] (config) # interface ethernet ?
<Device/Port>[-<Device/Port>]
1/1/1
1/1/2
1/1/3
1/1/4
1/3/1
1/3/2
1/3/3
1/3/4
1/5/1
1/5/2
1/5/3
1/5/4
1/7/1
1/7/2
1/7/3
1/7/4
1/9/1
1/9/2
1/9/3
1/9/4
.....
.....
1/25
1/26
1/27
1/28
1/29
1/30
1/31
1/32
# interface ethernet 1/25 shutdown
# interface ethernet 1/26 shutdown
# interface ethernet 1/25
(config interface ethernet 1/25) # module-type qsfp-split-4 force
The resulting interfaces will be:
Ethernet 1/25/1
Ethernet 1/25/2
Ethernet 1/25/3
Ethernet 1/25/4
The speed can then be configured on each split interface:
# interface ethernet 1/25/1
# speed 25G
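To confirm that the split interfaces came up at the expected speed and to keep the change across reboots, something like the following can be used. This is only a sketch; the exact show and save commands may vary between MLNX-OS/Onyx releases:
# exit
# show interfaces ethernet 1/25/1
# configuration write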
EOL notice for Mellanox ConnectX-5 VPI host channel adapters and Switch-IB 2 based EDR InfiniBand Switches
NVIDIA Corporation has announced EOL Notice #LCR-000906 – MELLANOX.
PCN INFORMATION:
PCN Number: LCR-000906 – MELLANOX
PCN Description: EOL notice for Mellanox ConnectX-5 VPI host channel adapters and Switch-IB 2 based EDR InfiniBand Switches
Publish Date: May 8, 2022
Type: FYI
Top 500 Interconnect Trends
Published twice a year and publicly available at www.top500.org, the TOP500 supercomputing list ranks the world’s most powerful computer systems according to the Linpack benchmark rating system.

Summary of findings for NVIDIA networking:
- NVIDIA GPU or Network (InfiniBand, Ethernet) accelerate 342 systems or 68% of overall TOP500 systems
- InfiniBand accelerates seven of the top ten supercomputers in the world
- NVIDIA BlueField DPU and HDR InfiniBand Networking accelerate the world’s 1st academic cloud-native supercomputer at Cambridge University
- NVIDIA InfiniBand and Ethernet networking solutions connect 318 systems or 64% of overall TOP500 platforms
- InfiniBand accelerates 170 systems, 21% growth compared to June 2020 TOP500 list
- InfiniBand accelerates #1, #2 supercomputers in the US, #1 in China, #1, #2 and #3 in Europe
- NVIDIA 25 gigabit and faster Ethernet solutions connect 62% of total Ethernet systems
What is the difference between a DPU, a CPU, and a GPU?
An interesting blog explains the difference between a DPU, a CPU, and a GPU.
So What Makes a DPU Different?
A DPU is a new class of programmable processor: a system on a chip (SoC) that combines three key elements:
- An industry-standard, high-performance, software-programmable, multi-core CPU, typically based on the widely used Arm architecture, tightly coupled to the other SoC components
- A high-performance network interface capable of parsing, processing, and efficiently transferring data at line rate, or the speed of the rest of the network, to GPUs and CPUs
- A rich set of flexible and programmable acceleration engines that offload and improve application performance for AI and machine learning, security, telecommunications, and storage, among others
For more information, do take a look at What’s a DPU? …And what’s the difference between a DPU, a CPU, and a GPU?
Best Practices to Secure the Edge Cloud Environment
In this webinar you will learn:
- Challenges in securing edge data centers
- How to secure the edge cloud without compromising on application performance
- The role of NVIDIA Mellanox DPU in securing cloud to edge
Date: Aug 4, 2020
Time: 2:00pm SGT | 11:30am IST | 4:00pm AEST
To register: https://www.mellanox.com/webinar/best-practices-secure-edge-cloud-environment
Installing and using Mellanox HPC-X Software Toolkit
Overview
Taken from Mellanox HPC-X Software Toolkit User Manual 2.3
Mellanox HPC-X is a comprehensive software package that includes MPI and SHMEM communication libraries. HPC-X includes various acceleration packages to improve both the performance and scalability of applications running on top of these libraries, including UCX (Unified Communication X) and MXM (Mellanox Messaging), which accelerate the underlying send/receive (or put/get) messages. It also includes FCA (Fabric Collectives Accelerations), which accelerates the underlying collective operations used by the MPI/PGAS languages.
Download
https://www.mellanox.com/products/hpc-x-toolkit
Installation
% tar -xvf hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-5.0-1.0.0.0-redhat7.6-x86_64.tbz
% cd hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-5.0-1.0.0.0-redhat7.6-x86_64
% export HPCX_HOME=/usr/local/hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-5.0-1.0.0.0-redhat7.6-x86_64
Loading HPC-X Environment from BASH
HPC-X includes Open MPI v4.0.x. Each Open MPI version has its own module file which can be used to load the desired version.
% source $HPCX_HOME/hpcx-init.sh
% hpcx_load
% env | grep HPCX
% mpicc $HPCX_MPI_TESTS_DIR/examples/hello_c.c -o $HPCX_MPI_TESTS_DIR/examples/hello_c
% mpirun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_c
% oshcc $HPCX_MPI_TESTS_DIR/examples/hello_oshmem_c.c -o $HPCX_MPI_TESTS_DIR/examples/hello_oshmem_c
% oshrun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_oshmem_c
% hpcx_unload
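As a follow-on sketch, the UCX layer mentioned in the overview can also be selected explicitly at run time. The --mca pml ucx flag is a standard Open MPI option, while mlx5_0:1 is an assumed device name that should be replaced with the local HCA:
% mpirun -np 2 --mca pml ucx -x UCX_NET_DEVICES=mlx5_0:1 $HPCX_MPI_TESTS_DIR/examples/hello_c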
Loading HPC-X Environment from Modules
You can use the already built module files in hpcx.
% module use $HPCX_HOME/modulefiles
% module load hpcx
% mpicc $HPCX_MPI_TESTS_DIR/examples/hello_c.c -o $HPCX_MPI_TESTS_DIR/examples/hello_c
% mpirun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_c
% oshcc $HPCX_MPI_TESTS_DIR/examples/hello_oshmem_c.c -o $HPCX_MPI_TESTS_DIR/examples/hello_oshmem_c
% oshrun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_oshmem_c
% module unload hpcx
Building HPC-X with the Intel Compiler Suite
Do take a look at the Mellanox HPC-X® ScalableHPC Software Toolkit
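A minimal sketch of what such a rebuild usually looks like, assuming the bundled Open MPI sources live under $HPCX_HOME/sources and that UCX and HCOLL are taken from the HPC-X installation itself; the paths, archive name, and configure options differ between HPC-X releases, so treat this as an outline only:
% cd $HPCX_HOME/sources
% tar -xzf openmpi-gitclone.tar.gz
% cd openmpi-gitclone
% ./configure CC=icc CXX=icpc FC=ifort --with-ucx=$HPCX_HOME/ucx --with-hcoll=$HPCX_HOME/hcoll --prefix=$HPCX_HOME/ompi-icc
% make -j install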
Fabric Debug Initiation using ibdiagnet (Part 1)
These steps are taken from the Mellanox Academy Online Training.
Step 1: Clear all counters and begin the test execution
ibdiagnet -pc
Wait for a while, typically 30 to 60 minutes, so that errors can accumulate under load.
Step 2: Check for errors that exceed the allowed threshold
ibdiagnet -ls 25 -lw 4x -P all=1 --pm_pause_time 30
-ls <2.5|5|10|14|25|50> - Specify the link speed
-lw <1x|4x|8x|12x> - Specify the link width
-P all=1 - Check the information provided by all counters and display each one crossing a threshold of 1
--pm_pause_time - The time between the two samples is set by the --pm_pause_time option
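When the run completes, ibdiagnet writes its report files under /var/tmp/ibdiagnet2 by default. Assuming that default output directory, a quick way to scan the log for problems is:
grep -i -E "error|warn" /var/tmp/ibdiagnet2/ibdiagnet2.log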
Webinar – Build the Most Powerful Data Center with GPU Computing Technology and High-speed Interconnect
Date: Thursday, June 11, 2020
Time: 11:00am-12:30pm Singapore Time
Please join NVIDIA as we discuss how to design a well-balanced system that maximizes the performance and scalability of various workloads using NVIDIA GPUs and interconnects.
Speakers will provide an overview of the state-of-the-art NVIDIA GPU-accelerated compute architecture and In-Network Computing fabric and how they come together with one goal: to deliver a solution that democratizes supercomputing power, making it readily accessible, installable, and manageable in a modern business setting. To learn more about this webinar, click here.