Installing RoCE using Mellanox (Nvidia) OFED package

Prerequisites:

Do read the post Basic Understanding RoCE and Infiniband first.

Step 1: Install Mellanox Package

First and foremost, you have to install the Mellanox OFED package, which you can download at https://developer.nvidia.com/networking/ethernet-software. You may want to install it using either the traditional method or the Ansible method (Installing Mellanox OFED (mlnx_ofed) packages using Ansible).

Step 2: Load the Drivers

Load the two kernel modules that are needed for RDMA and RoCE exchanges by running:

# modprobe -a rdma_cm ib_umad
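
These modules will not survive a reboot on their own. A minimal sketch to load them automatically at boot, assuming a systemd-based distribution that reads /etc/modules-load.d (the roce.conf file name is just an example):

# echo -e "rdma_cm\nib_umad" > /etc/modules-load.d/roce.conf
# cat /etc/modules-load.d/roce.conf
rdma_cm
ib_umad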

Step 3: Verify the drivers are loaded

# ibv_devinfo
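
If the command lists your adapters, the drivers are loaded. As an extra sanity check, a small sketch (the mlx5_0 device name is taken from the later examples) that confirms the modules are resident and shows the port state and link layer; for a RoCE port, link_layer should read Ethernet:

# lsmod | grep -E 'rdma_cm|ib_umad'
# ibv_devinfo -d mlx5_0 | grep -E 'state|link_layer'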

Step 4: Set the RoCE to version 2

Set the version of the RoCE protocol to v2 by issuing the command below.

  • -d is the device
  • -p is the port
  • -m is the RoCE version:
[root@node1]# cma_roce_mode -d mlx5_0 -p 1 -m 2
RoCE v2
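
To confirm the change, you can query the mode again; running cma_roce_mode without -m should just report the active setting, and the show_gids helper shipped with MLNX_OFED (if present) lists which GID indexes are exposed as RoCE v1 versus v2. Treat the output below as illustrative:

[root@node1]# cma_roce_mode -d mlx5_0 -p 1
RoCE v2
[root@node1]# show_gids | grep mlx5_0

Keep in mind that this setting applies to connections established through RDMA CM; applications that pick a GID index explicitly choose the RoCE version themselves.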

Step 5: Check which RoCE devices are enabled on the Ethernet

[root@node-1]# ibdev2netdev
mlx5_0 port 1 ==> ens1f0 (Up)
mlx5_1 port 1 ==> ens1f1 (Down)
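
With mlx5_0 mapped to an Ethernet interface that is Up, a quick end-to-end RoCE sanity test can be run with the perftest tools. This is a minimal sketch assuming perftest is installed on two hosts (the node1/node2 names and the 192.168.1.2 address are placeholders for your own setup); the -R flag tells ib_write_bw to establish the connection through rdma_cm, so it exercises the RoCE mode set in Step 4:

On the server side (node2):
# ib_write_bw -d mlx5_0 -R

On the client side (node1), pointing at node2's ens1f0 address:
# ib_write_bw -d mlx5_0 -R 192.168.1.2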

References:

  1. Setting up a RoCE cluster

In-Network Computing with NVIDIA SHARP

Traditional methods for performing data reductions are very costly in terms of latency and CPU cycles. The NVIDIA Quantum InfiniBand switch with NVIDIA SHARP technology addresses complex operations such as data reduction in a simplified, efficient way. By reducing data within the switch network, NVIDIA Quantum switches perform the reduction in a fraction of the time of traditional methods.

Basic Commands for Mellanox Network Switches for Break-out-Ports

More information can be found at Command Line Interface (CLI)

Point 1: To configure Break-Out

> enable
# configure terminal
R2-R8-LEAF01 [standalone: master] (config) # interface ethernet ?
<Device/Port>[-<Device/Port>]
1/1/1
1/1/2
1/1/3
1/1/4
1/3/1
1/3/2
1/3/3
1/3/4
1/5/1
1/5/2
1/5/3
1/5/4
1/7/1
1/7/2
1/7/3
1/7/4
1/9/1
1/9/2
1/9/3
1/9/4
.....
.....
1/25
1/26
1/27
1/28
1/29
1/30
1/31
1/32
# interface ethernet 1/25 shutdown
# interface ethernet 1/26 shutdown
# interface ethernet 1/25
# (config interface ethernet 1/25) # module-type qsfp-split-4 force

The resulting interfaces will be:

Ethernet 1/25/1
Ethernet 1/25/2
Ethernet 1/25/3
Ethernet 1/25/4

The speed can then be configured on each split interface:

# interface ethernet 1/25/1
# speed 25G
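
After the split, you will typically want to bring the new sub-ports back up and persist the change. A minimal sketch, assuming an MLNX-OS/Onyx switch where configuration write saves the running configuration:

# interface ethernet 1/25/1
# (config interface ethernet 1/25/1) # no shutdown
# (config interface ethernet 1/25/1) # exit
# show interfaces ethernet 1/25/1
# configuration write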

EOL notice for Mellanox ConnectX-5 VPI host channel adapters and Switch-IB 2 based EDR InfiniBand Switches

Nvidia Corporation has announced the EOL Notice #LCR-000906 – MELLANOX

PCN INFORMATION:
PCN Number: LCR-000906 – MELLANOX
PCN Description: EOL notice for Mellanox ConnectX-5 VPI host channel adapters and Switch-IB 2 based EDR InfiniBand Switches
Publish Date: May 8, 2022
Type: FYI

Top 500 Interconnect Trends

Published twice a year and publicly available at www.top500.org, the TOP500 supercomputing list ranks the world’s most powerful computer systems according to the Linpack benchmark rating system.

Taken from Nvidia Networking

Summary of Findings for Nvidia Networking.

  • NVIDIA GPUs or networking (InfiniBand, Ethernet) accelerate 342 systems, or 68% of the overall TOP500 systems
  • InfiniBand accelerates seven of the top ten supercomputers in the world
  • NVIDIA BlueField DPUs and HDR InfiniBand networking accelerate the world’s first academic cloud-native supercomputer at Cambridge University
  • NVIDIA InfiniBand and Ethernet networking solutions connect 318 systems, or 64% of the overall TOP500 platforms
  • InfiniBand accelerates 170 systems, 21% growth compared to the June 2020 TOP500 list
  • InfiniBand accelerates the #1 and #2 supercomputers in the US, #1 in China, and #1, #2, and #3 in Europe
  • NVIDIA 25 gigabit and faster Ethernet solutions connect 62% of the total Ethernet systems

What is the difference between a DPU, a CPU, and a GPU?

An interesting blog explains the difference between a DPU, a CPU, and a GPU.

 

So What Makes a DPU Different?

A DPU is a new class of programmable processor that combines three key elements. A DPU is a system on a chip (SoC) that combines:

  • An industry-standard, high-performance, software-programmable, multi-core CPU, typically based on the widely used Arm architecture, tightly coupled to the other SoC components
  • A high-performance network interface capable of parsing, processing, and efficiently transferring data at line rate (the speed of the rest of the network) to GPUs and CPUs
  • A rich set of flexible and programmable acceleration engines that offload and improve application performance for AI and machine learning, security, telecommunications, and storage, among others

For more information, do take a look at What’s a DPU? …And what’s the difference between a DPU, a CPU, and a GPU?

Best Practices to Secure the Edge Cloud Environment

In this webinar you will learn:

  • Challenges in securing edge data centers
  • How to secure the edge cloud without compromising on application performance
  • The role of NVIDIA Mellanox DPU in securing cloud to edge

Date: Aug 4, 2020
Time: 2:00pm SGT | 11:30am IST | 4:00pm AEST

To register: https://www.mellanox.com/webinar/best-practices-secure-edge-cloud-environment

 

Installing and using Mellanox HPC-X Software Toolkit

Overview

Taken from Mellanox HPC-X Software Toolkit User Manual 2.3

Mellanox HPC-X is a comprehensive software package that includes MPI and SHMEM communication libraries. HPC-X includes various acceleration packages to improve both the performance and scalability of applications running on top of these libraries, including UCX (Unified Communication X) and MXM (Mellanox Messaging), which accelerate the underlying send/receive (or put/get) messages. It also includes FCA (Fabric Collectives Accelerations), which accelerates the underlying collective operations used by the MPI/PGAS languages.

Download

https://www.mellanox.com/products/hpc-x-toolkit

Installation

% tar -xvf hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-5.0-1.0.0.0-redhat7.6-x86_64.tbz
% cd hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-5.0-1.0.0.0-redhat7.6-x86_64
% export HPCX_HOME=/usr/local/hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-5.0-1.0.0.0-redhat7.6-x86_64
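
Note that the export above points to /usr/local while the tarball was extracted in the current directory; a minimal sketch to move the tree into place and keep HPCX_HOME set across shells (adjust the paths if you install elsewhere):

% cd ..
% sudo mv hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-5.0-1.0.0.0-redhat7.6-x86_64 /usr/local/
% echo 'export HPCX_HOME=/usr/local/hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-5.0-1.0.0.0-redhat7.6-x86_64' >> ~/.bashrc
% source ~/.bashrc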

Loading HPC-X Environment from BASH

HPC-X includes Open MPI v4.0.x. Each Open MPI version has its own module file, which can be used to load the desired version.

% source $HPCX_HOME/hpcx-init.sh
% hpcx_load
% env | grep HPCX
% mpicc $HPCX_MPI_TESTS_DIR/examples/hello_c.c -o $HPCX_MPI_TESTS_DIR/examples/hello_c
% mpirun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_c
% oshcc $HPCX_MPI_TESTS_DIR/examples/hello_oshmem_c.c -o $HPCX_MPI_TESTS_DIR/examples/hello_oshmem_c
% oshrun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_oshmem_c
% hpcx_unload
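
HPC-X's Open MPI uses UCX as its point-to-point layer by default; if you want to request it explicitly and pin the HCA and port to use (handy on multi-rail nodes), a hedged example reusing the mlx5_0 device from the RoCE section:

% mpirun -np 2 -mca pml ucx -x UCX_NET_DEVICES=mlx5_0:1 $HPCX_MPI_TESTS_DIR/examples/hello_c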

Loading HPC-X Environment from Modules

You can use the prebuilt module files shipped under $HPCX_HOME/modulefiles.

% module use $HPCX_HOME/modulefiles
% module load hpcx
% mpicc $HPCX_MPI_TESTS_DIR/examples/hello_c.c -o $HPCX_MPI_TESTS_DIR/examples/hello_c
% mpirun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_c
% oshcc $HPCX_MPI_TESTS_DIR/examples/hello_oshmem_c.c -o $HPCX_MPI_TESTS_DIR/examples/hello_oshmem_c
% oshrun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_oshmem_c
% module unload hpcx

Building HPC-X with the Intel Compiler Suite

Do take a look at the Mellanox HPC-X® ScalableHPC Software Toolkit

References:

  1. Mellanox HPC-X Software Toolkit User Manual 2.3
  2. Mellanox HPC-X® ScalableHPC Software Toolkit

Fabric Debug Initiation using ibdiagnet (Part 1)

Learn some of these steps from Mellanox Academy Online Training

Step 1: Clear all counters and begin the test execution

ibdiagnet -pc

Wait for a while; this usually takes 30 to 60 minutes.

Step 2: Check for errors that exceed the allowed threshold

ibdiagnet -ls 25 -lw 4x -P all=1 --pm_pause_time 30
  • Specify the link speed
    -ls <2.5|5|10|14|25|50>
  • Specify the link width
    -lw <1x|4x|8x|12x>
  • Check the information provided by all counters and display every counter that crosses a threshold of 1
    -P all=1
  • The time between the two samples is set by the --pm_pause_time option (a combined run is sketched below)
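
Putting the pieces together, a minimal sketch of the full check as a script; the /var/tmp/ibdiagnet2 path is the tool's usual default output directory, so adjust it if your installation writes elsewhere:

# Clear all port counters, then let production traffic run for a while
ibdiagnet -pc
sleep 1800
# Re-check link speed/width and report every counter that crossed the threshold of 1
ibdiagnet -ls 25 -lw 4x -P all=1 --pm_pause_time 30
# Scan the generated report for warnings and errors
grep -i -E 'warn|error' /var/tmp/ibdiagnet2/ibdiagnet2.log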

Webinar – Build the Most Powerful Data Center with GPU Computing Technology and High-speed Interconnect

Build the Most Powerful Data Center with GPU Computing Technology and High-speed Interconnect

Date: Thursday, June 11, 2020
Time: 11:00am-12:30pm Singapore Time

Register here 

Please join NVIDIA as we discuss how to design a well-balanced system that maximizes the performance and scalability of various workloads using NVIDIA GPUs and interconnects.

Speakers will provide an overview of the state-of-the-art NVIDIA GPU-accelerated compute architecture and In-Network Computing fabric, and how they come together with one goal: to deliver a solution that democratizes supercomputing power, making it readily accessible, installable, and manageable in a modern business setting. To learn more about this webinar, click here.