Installing Nvidia DOCA OFED on Rocky Linux

Taken from Installing Nvidia DOCA OFED. Do read the documentation for more information. Other relevant documentation includes:

Quick Reference

Installation Profiles

DOCA-Host Profile | Description
doca-ofed         | Installs the same drivers and tools as MLNX_OFED using the DOCA-Host package, but without other DOCA functionality.
doca-network      | Intended for users who want only the networking functionality of the DOCA-Host package.
doca-all          | Intended for users who want the full extent of DOCA drivers and libraries, i.e. the full DOCA-Host installation.

# Remove the installed DOCA OFED software from the host.
for f in $(rpm -qa | grep -i doca); do sudo yum -y remove "$f"; done

# Remove the installed MLNX_OFED software.
sudo /usr/sbin/ofed_uninstall.sh --force

sudo dnf autoremove -y
sudo dnf clean all -y
sudo dnf makecache -y
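
The loop above removes one package per transaction. As a minimal sketch of an alternative, assuming the same case-insensitive "doca" match is what you want, the package names can be collected first, reviewed, and then removed in a single dnf transaction. The helper name list_doca_packages is hypothetical and not part of any Nvidia tooling.

```shell
#!/bin/sh
# Hypothetical helper: keep only package names containing "doca"
# (case-insensitive). Reads one package name per line from stdin.
list_doca_packages() {
    grep -i 'doca' | sort
}

# Only query the system when rpm and dnf are actually available.
if command -v rpm >/dev/null 2>&1 && command -v dnf >/dev/null 2>&1; then
    pkgs=$(rpm -qa | list_doca_packages)
    if [ -n "$pkgs" ]; then
        echo "Packages that would be removed:"
        echo "$pkgs"
        # After reviewing the list, uncomment to remove them in one transaction:
        # sudo dnf -y remove $pkgs
    fi
fi
```

Printing the list before removing anything makes it easier to spot an unrelated package that happens to match the pattern.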

Download and Install the Nvidia RPM GPG Key

sudo wget http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox-SHA256
sudo rpm --import RPM-GPG-KEY-Mellanox-SHA256

DOCA-OFED

Create the repository file in /etc/yum.repos.d/:

touch /etc/yum.repos.d/doca.repo

Edit /etc/yum.repos.d/doca.repo and add the following:

[doca]
name=DOCA Online Repo
baseurl=https://linux.mellanox.com/public/repo/doca/3.2.1/rhel8/x86_64/
enabled=1
gpgcheck=0

Save and Exit
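
If you prefer not to edit the file by hand, the same repository definition can be written non-interactively. This is a minimal sketch: the helper name write_doca_repo is hypothetical, and the DOCA version (3.2.1) and the rhel8/x86_64 path are simply copied from the text above, so adjust them to your release and OS.

```shell
#!/bin/sh
# Hypothetical helper: write the DOCA repo definition to the given path.
# The baseurl is taken verbatim from the text above; change the version
# and distribution path to match your environment.
write_doca_repo() {
    repo_file=$1
    cat > "$repo_file" <<'EOF'
[doca]
name=DOCA Online Repo
baseurl=https://linux.mellanox.com/public/repo/doca/3.2.1/rhel8/x86_64/
enabled=1
gpgcheck=0
EOF
}

# Example (run as root):
#   write_doca_repo /etc/yum.repos.d/doca.repo
#   dnf makecache
```

The quoted heredoc delimiter ('EOF') keeps the shell from expanding anything inside the file body.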

Install DOCA-OFED

dnf install -y doca-ofed

Validating that OFED and RoCEv2 are working

One of the quickest checks is to run ibstat:

CA 'mlx5_0'
	CA type: MT4127
	Number of ports: 1
	Firmware version: 26.43.2026
	Hardware version: 0
	Node GUID: 0x5000e6030073b514
	System image GUID: 0x5000e6030073b514
	Port 1:
		State: Down
		Physical state: Disabled
		Rate: 40
		Base lid: 0
		LMC: 0
		SM lid: 0
		Capability mask: 0x.....
		Port GUID: 0x......
		Link layer: Ethernet
CA 'mlx5_1'
	CA type: MT4127
	Number of ports: 1
	Firmware version: 26.43.2026
	Hardware version: 0
	Node GUID: 0x.....
	System image GUID: 0x.....
	Port 1:
		State: Active
		Physical state: LinkUp
		Rate: 25
		Base lid: 0
		LMC: 0
		SM lid: 0
		Capability mask: 0x.......
		Port GUID: 0x.....
		Link layer: Ethernet
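
On hosts with several adapters the full ibstat output gets long. The following is a minimal sketch that condenses it to one line per port (device, port number, state, link layer). It assumes the output layout shown above; the helper name summarize_ibstat is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: condense ibstat output (read from stdin) to
# "<device> port <n> <state> <link layer>", one line per port.
summarize_ibstat() {
    awk '
        # Device header, e.g.: CA \047mlx5_0\047  (\047 is a single quote)
        /^CA \047/               { gsub(/\047/, "", $2); ca = $2 }
        # Port header, e.g.: Port 1:   ("Port GUID:" does not match)
        /^[ \t]*Port [0-9]+:/    { port = $2; sub(/:/, "", port) }
        # Logical state; "Physical state:" starts differently and is skipped
        /^[ \t]*State:/          { state = $2 }
        # Link layer is the last field of interest for a port, so print here
        /^[ \t]*Link layer:/     { print ca, "port", port, state, $3 }
    '
}

# Run against the live system only when ibstat is installed.
if command -v ibstat >/dev/null 2>&1; then
    ibstat | summarize_ibstat
fi
```

For the output above, a port that is usable for RoCE should show a state of Active and a link layer of Ethernet.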

For further checks, see Installing RoCE using Mellanox (Nvidia) OFED package.

Checking Which Logical Name Is Assigned to Which Hardware

Method 1: Using ethtool and lspci

[root@hpc-node1 ~]# ethtool -i ens3f1np1
driver: mlx5_core
version: 25.10-1.7.1
firmware-version: 26.43.2026 (MT_0000000575)
expansion-rom-version: 
bus-info: 0000:5d:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
[root@hpc-node1 ~]# lspci -s 0000:5d:00.1
0000:5d:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
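
The two commands above can be chained so the PCI address does not have to be copied by hand. This is a minimal sketch; the helper names bus_info_of and describe_iface are hypothetical, and the interface name is whatever your system reports.

```shell
#!/bin/sh
# Hypothetical helper: extract the PCI address from `ethtool -i` output
# read on stdin (the value on the "bus-info:" line).
bus_info_of() {
    awk '$1 == "bus-info:" { print $2 }'
}

# Hypothetical helper: print the lspci description for a network interface.
# Usage: describe_iface ens3f1np1
describe_iface() {
    bus=$(ethtool -i "$1" | bus_info_of)
    if [ -n "$bus" ]; then
        lspci -s "$bus"
    else
        echo "no PCI bus-info found for $1" >&2
        return 1
    fi
}
```

Interfaces without a PCI device behind them (for example bridges or VLANs) report an empty or non-PCI bus-info, which is why the empty case is handled explicitly.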

Method 2: Using lshw

[root@hpc-wfly-i022 ~]# lshw -C network
.....
.....
*-network:1
       description: Ethernet interface
       product: MT2894 Family [ConnectX-6 Lx]
       vendor: Mellanox Technologies
       physical id: 0.1
       bus info: pci@0000:5d:00.1
       logical name: ens3f1
       version: 00
       serial: 50:00:e6:73:b5:15
       capacity: 10Gbit/s
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress vpd msix pm bus_master cap_list rom ethernet physical fibre 1000bt-fd 10000bt-fd autonegotiation
       configuration: autonegotiation=on broadcast=yes driver=mlx5_core driverversion=4.18.0-553.54.1.el8_10.x86_64 firmware=26.43.2026 (MT_0000000575) latency=0 link=no multicast=yes port=fibre
       resources: iomemory:1f3f0-1f3ef irq:17 memory:1f3ffa000000-1f3ffbffffff memory:b5f00000-b5ffffff
.....
.....
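
Since lshw prints a long block per device, a short filter helps when only the mapping from logical name to product is needed. The sketch below assumes the field layout shown above (product listed before logical name within each block); the helper name map_lshw_names is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `lshw -C network` output on stdin and print
# "<logical name><TAB><product>" for each device block.
map_lshw_names() {
    awk '
        # A new device block starts with "*-network..."; forget the old product
        /^[ \t]*\*-/             { product = "" }
        /^[ \t]*product:/        { sub(/^[ \t]*product:[ \t]*/, ""); product = $0 }
        /^[ \t]*logical name:/   { sub(/^[ \t]*logical name:[ \t]*/, ""); print $0 "\t" product }
    '
}

# Example (lshw usually needs root to show all fields):
#   lshw -C network | map_lshw_names
```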

Installing RoCE using Mellanox (Nvidia) OFED package

Prerequisites:

Do read Basic Understanding RoCE and Infiniband

Step 1: Install Mellanox Package

First, install the Mellanox package, which you can download from https://developer.nvidia.com/networking/ethernet-software. You can install it either the traditional way or with Ansible (see Installing Mellanox OFED (mlnx_ofed) packages using Ansible).

Step 2: Load the Drivers

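Load the two kernel modules needed for RDMA and RoCE exchanges. The -a option tells modprobe to treat every argument as a module name; without it, the second name would be passed as a parameter to the first module.

# modprobe -a rdma_cm ib_umad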
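
After loading, it is worth confirming that both modules are really present. The following is a minimal sketch that compares a list of wanted modules against a /proc/modules-style listing; the helper name missing_modules is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print each wanted module that is NOT in the listing.
# Arguments: module names to look for.
# Stdin: a /proc/modules-style listing (module name in the first column).
missing_modules() {
    awk -v want="$*" '
        { loaded[$1] = 1 }
        END {
            n = split(want, names, " ")
            for (i = 1; i <= n; i++)
                if (!(names[i] in loaded)) print names[i]
        }
    '
}

# Check the running kernel when /proc/modules is readable.
if [ -r /proc/modules ]; then
    missing=$(missing_modules rdma_cm ib_umad < /proc/modules)
    if [ -n "$missing" ]; then
        echo "Not loaded:" $missing
    else
        echo "rdma_cm and ib_umad are loaded"
    fi
fi
```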

Step 3: Verify the drivers are loaded

# ibv_devinfo

Step 4: Set the RoCE to version 2

Set the version of the RoCE protocol to v2 by issuing the command below.

  • -d is the device
  • -p is the port
  • -m is the RoCE version (1 or 2)

[root@node1]# cma_roce_mode -d mlx5_0 -p 1 -m 2
RoCE v2
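
cma_roce_mode sets the default used by RDMA-CM; the port's GID table still lists an entry type per GID. As a related check, the sketch below lists which GID indexes on a port are of type RoCE v2, which is useful for tools that ask for a GID index. It assumes the usual Linux sysfs layout (/sys/class/infiniband/<device>/ports/<port>/gid_attrs/types/<index>); the helper name rocev2_gid_indexes is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the GID indexes whose type is "RoCE v2".
# Argument: a gid_attrs/types directory, e.g.
#   /sys/class/infiniband/mlx5_0/ports/1/gid_attrs/types
rocev2_gid_indexes() {
    types_dir=$1
    for f in "$types_dir"/*; do
        [ -f "$f" ] || continue
        # Reading an unpopulated GID slot fails, so skip it quietly.
        t=$(cat "$f" 2>/dev/null) || continue
        if [ "$t" = "RoCE v2" ]; then
            basename "$f"
        fi
    done | sort -n
}

# Example:
#   rocev2_gid_indexes /sys/class/infiniband/mlx5_0/ports/1/gid_attrs/types
```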

Step 5: Check which RoCE devices are enabled on the Ethernet

[root@node-1]# ibdev2netdev
mlx5_0 port 1 ==> ens1f0 (Up)
mlx5_1 port 1 ==> ens1f1 (Down)
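
When scripting against many nodes, only the ports that are up usually matter. This minimal sketch filters ibdev2netdev output down to those, assuming the line format shown above; the helper name up_roce_ports is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read ibdev2netdev output on stdin and print
# "<device>:<port> <netdev>" for every port reported as (Up).
up_roce_ports() {
    awk '$4 == "==>" && $6 == "(Up)" { print $1 ":" $3 " " $5 }'
}

# Run against the live system only when ibdev2netdev is installed.
if command -v ibdev2netdev >/dev/null 2>&1; then
    ibdev2netdev | up_roce_ports
fi
```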

References:

  1. Setting up a RoCE cluster

Basic Understanding RoCE and Infiniband

Prerequisites:

  1. RoCE requires a compliant Ethernet adapter. I am currently using Mellanox ConnectX-6 cards.
  2. RoCE requires a compliant switch. I used a Mellanox 100G switch.

The difference between traditional Ethernet communication and RoCE is explained very clearly in the diagram taken from Huawei’s Basic Knowledge and Differences of RoCE, IB, and TCP Networks.

Some key points on the difference between TCP/IP and RDMA:

  1. Traditional TCP/IP network communication sends messages through the kernel, which incurs high data-movement and data-copy overhead.
  2. RDMA bypasses the kernel and accesses memory directly, which enables low-latency network communication.

The three types of RDMA network technologies (InfiniBand, RoCE, and iWARP) are neatly presented in Basic Knowledge and Differences of RoCE, IB, and TCP Networks.

References:

  1. Basic Knowledge and Differences of RoCE, IB, and TCP Networks