Installing Mellanox OFED (mlnx_ofed) packages using Ansible

If you plan to use Ansible to install the mlnx_ofed packages on compute nodes that have an InfiniBand (IB) or RoCE Ethernet card, the steps below may help. The comprehensive documentation can be found at Installing Mellanox OFED.

Step 1: Download Mellanox OFED Drivers

Download the .tgz archive (for example, MLNX_OFED_LINUX-23.04-1.1.3.0-rhel8.7-x86_64.tgz) from the NVIDIA Networking Ethernet download site.

Step 2: Untar the mlnx_ofed packages on the Shared drive.

Suppose /usr/local/ is shared across all nodes in the cluster.

# mkdir /usr/local/mlnx_ofed
# cp MLNX_OFED_LINUX-23.04-1.1.3.0-rhel8.7-x86_64.tgz /usr/local/mlnx_ofed
# cd /usr/local/mlnx_ofed
# tar -zxvf MLNX_OFED_LINUX-23.04-1.1.3.0-rhel8.7-x86_64.tgz
# cd MLNX_OFED_LINUX-23.04-1.1.3.0-rhel8.7-x86_64

Step 3: Create a Template mlnx_ofed.repo.j2 and update the content

[mlnx_ofed]
name=MLNX_OFED Repository
baseurl=file:///usr/local/mlnx_ofed/MLNX_OFED_LINUX-23.04-1.1.3.0-rhel8.7-x86_64/RPMS
enabled=1
gpgkey=file:///usr/local/mlnx_ofed/MLNX_OFED_LINUX-23.04-1.1.3.0-rhel8.7-x86_64/RPM-GPG-KEY-Mellanox
gpgcheck=1
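
As an alternative to maintaining a template, the same repository definition could probably be expressed with the ansible.builtin.yum_repository module. This is a minimal sketch, assuming the shared path from Step 2; the task name is illustrative, and the module writes /etc/yum.repos.d/mlnx_ofed.repo because the file name defaults to the repository name.

```yaml
# Sketch only: defines the same [mlnx_ofed] repository as the template above.
- name: Define the MLNX_OFED repository (hypothetical alternative to the template)
  ansible.builtin.yum_repository:
    name: mlnx_ofed
    description: MLNX_OFED Repository
    baseurl: file:///usr/local/mlnx_ofed/MLNX_OFED_LINUX-23.04-1.1.3.0-rhel8.7-x86_64/RPMS
    enabled: true
    gpgcheck: true
    gpgkey: file:///usr/local/mlnx_ofed/MLNX_OFED_LINUX-23.04-1.1.3.0-rhel8.7-x86_64/RPM-GPG-KEY-Mellanox
  become: true
```

If you go this route, the template task in Step 4 is not needed.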

Step 4: Create a Playbook for updating the drivers

# Write the repository definition from the template in Step 3
- name: Generate /etc/yum.repos.d/mlnx_ofed.repo
  template:
    src: ../templates/mlnx_ofed.repo.j2
    dest: /etc/yum.repos.d/mlnx_ofed.repo
    owner: root
    group: root
    mode: "0644"
  become: true
  when:
    - ansible_os_family == "RedHat"
    - ansible_distribution_major_version == "8"
    - ansible_distribution_version == "8.7"

# Install the full driver stack; the result is registered for the reboot step
- name: Install mlnx-ofed-all
  dnf:
    name:
      - mlnx-ofed-all
    state: latest
  become: true
  when:
    - ansible_os_family == "RedHat"
    - ansible_distribution_major_version == "8"
    - ansible_distribution_version == "8.7"
  register: install_mlnx

Step 5: Reboot if there are changes to MLNX-OFED

- name: Reboot if there are changes to MLNX-OFED
  ansible.builtin.reboot:
  become: true
  when:
    - install_mlnx.changed
    - ansible_os_family == "RedHat"
    - ansible_distribution_major_version == "8"
    - ansible_distribution_version == "8.7"

# -a loads every listed module; without it, modprobe treats ib_umad
# as a parameter of rdma_cm instead of a second module.
- name: Modprobe rdma_cm ib_umad
  ansible.builtin.command: modprobe -a rdma_cm ib_umad
  become: true
  when: install_mlnx.changed
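
To confirm the result after the reboot, one option is to query the installed version with ofed_info, which ships with MLNX_OFED. This is a minimal sketch; the task names and the ofed_version variable are illustrative and not part of the original playbook.

```yaml
# Sketch only: read-only check, so it never reports a change.
- name: Query the installed MLNX_OFED version
  ansible.builtin.command: ofed_info -s
  register: ofed_version
  changed_when: false

- name: Show the installed MLNX_OFED version per node
  ansible.builtin.debug:
    msg: "{{ inventory_hostname }}: {{ ofed_version.stdout }}"
```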

References:

  1. Installing Mellanox OFED

Intel Enters the Quantum Computing Horse Race With 12-Qubit Chip

Taken from Intel Enters the Quantum Computing Horse Race With 12-Qubit Chip

Intel has built a quantum processor called Tunnel Falls that it will offer to research labs hoping to make the revolutionary computing technology practical.

The Tunnel Falls processor, announced Thursday, houses 12 of the fundamental data processing elements called qubits. It’s a major step in the chipmaker’s attempt to develop quantum computing hardware it hopes will eventually surpass rivals.

Gromacs Error – log: Protocol “https” not supported or disabled in libcurl

Downloading: https://ftp.gromacs.org/regressiontests/regressiontests-2020.6.tar.gz
CMake Error at tests/CMakeLists.txt:58 (message):
  error: downloading
  'https://ftp.gromacs.org/regressiontests/regressiontests-2020.6.tar.gz'
  failed

  status_code: 1

  status_string: "Unsupported protocol"

  log: Protocol "https" not supported or disabled in libcurl

  Closing connection -1

If you are compiling Gromacs-2020.6 with Plumed2-2.7.2, follow Regression Test Errors during Gromacs Compilation. The key is to turn off the automatic download and point CMake at a local copy of the regression tests:

-DREGRESSIONTEST_DOWNLOAD=OFF -DREGRESSIONTEST_PATH=../regressiontests-2020.2

If you download regressiontests-2020.6.tar.gz manually and set REGRESSIONTEST_PATH to the matching extracted directory, it should work without issues.

Displaying the Number of Cores and Current Load average for All Nodes

If you wish to use Ansible to display the number of cores and the current load average for all your nodes, you may want to consider the tasks below. Note that ansible_processor_cores reports cores per socket, not the node total.

- name: Display number of cores
  debug:
    var: ansible_processor_cores

- name: Get Load Average
  ansible.builtin.command: cat /proc/loadavg
  register: load_avg_output
  changed_when: false

- name: Print Load Average for all Nodes
  debug:
    msg: "Load Average: {{ load_avg_output.stdout }}"
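
Because ansible_processor_cores is a per-socket figure, a node total can be derived by multiplying it by ansible_processor_count (the number of sockets). The sketch below also prints only the first field of /proc/loadavg, which is the 1-minute average. It assumes facts have been gathered and that load_avg_output was registered by the task above; the task name is illustrative.

```yaml
# Sketch only: one summary line per node.
- name: Print total physical cores and 1-minute load average
  ansible.builtin.debug:
    msg: >-
      {{ inventory_hostname }}:
      {{ (ansible_processor_count | int) * (ansible_processor_cores | int) }} cores,
      1-min load {{ load_avg_output.stdout.split()[0] }}
```

Comparing the two numbers gives a quick view of which nodes are oversubscribed.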

Updating /etc/resolv.conf using Ansible for Rocky Linux 8

You may want to check whether /etc/resolv.conf exists and, if it does not, create the file before updating the DNS settings.

- name: Check if resolv.conf file exists
  stat:
      path: /etc/resolv.conf
  register: file_info

- name: Create /etc/resolv.conf if it does not exist
  file:
     path: /etc/resolv.conf
     state: touch
  when: not file_info.stat.exists

- name: Set DNS nameservers in /etc/resolv.conf
  blockinfile:
      path: /etc/resolv.conf
      block: |
            search example.com
            nameserver x.x.x.x
            nameserver w.w.w.w
  when: ansible_distribution == "Rocky"
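
The stat and touch tasks can likely be folded into the last task, since blockinfile accepts a create option that creates the file when it is missing. This is a minimal sketch with the same placeholder addresses as above; the task name is illustrative.

```yaml
# Sketch only: create: true replaces the separate stat + touch tasks.
- name: Set DNS nameservers in /etc/resolv.conf, creating the file if needed
  ansible.builtin.blockinfile:
    path: /etc/resolv.conf
    create: true
    block: |
      search example.com
      nameserver x.x.x.x
      nameserver w.w.w.w
  when: ansible_distribution == "Rocky"
```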

Enable PowerTools Repository Using Ansible

If you wish to use Ansible to apply the fix from Unable to Install hdf5, hdf5-devel and hdf5-static on Rocky Linux 8.7 (install dnf-plugins-core and epel-release, then enable the PowerTools repository), take a look at the tasks below.

- name: Install dnf-plugins-core and epel-release for Rocky
  dnf:
    name:
      - dnf-plugins-core
      - epel-release
    state: latest
  when: ansible_distribution == "Rocky"

- name: Enable powertools repository
  command: dnf config-manager --set-enabled powertools
  when: ansible_distribution == "Rocky"
  changed_when: false
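
Once PowerTools is enabled, the HDF5 packages themselves can be installed in the same play. This is a minimal sketch that follows on from the two tasks above; the task name is illustrative, and state: present is an assumption (use latest if you prefer to upgrade on every run).

```yaml
# Sketch only: installs the packages that previously failed to resolve.
- name: Install HDF5 packages
  ansible.builtin.dnf:
    name:
      - hdf5
      - hdf5-devel
      - hdf5-static
    state: present
  when: ansible_distribution == "Rocky"
```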

Installing and Configuring Chrony with Ansible on Rocky 8

If you are using Ansible to configure chrony, a versatile implementation of the Network Time Protocol (NTP), you may want to take a look at the simple playbook below.

- hosts: all
  tasks:

  - name: Install Chrony package
    dnf:
        name: chrony
        state: present
    when: ansible_distribution == "Rocky"

  - name: Configure Chrony servers
    lineinfile:
        path: /etc/chrony.conf
        line: "server sg.pool.ntp.org iburst"
        insertafter: '^#.*server 3.centos.pool.ntp.org iburst'
        state: present
    when: ansible_distribution == "Rocky"

  - name: Enable Chrony service
    service:
        name: chronyd
        state: started
        enabled: yes
    when: ansible_distribution == "Rocky"

You may want to group the tasks in a block to improve the code, so that the shared when condition is written only once.
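
A minimal sketch of that refactoring, assuming the same three tasks as above: the when condition moves to the block and applies to every task inside it. The block name is illustrative, and the fully qualified module names are equivalent to the short names used earlier.

```yaml
- hosts: all
  tasks:

    - name: Install and configure Chrony on Rocky
      when: ansible_distribution == "Rocky"   # evaluated for each task in the block
      block:

        - name: Install Chrony package
          ansible.builtin.dnf:
            name: chrony
            state: present

        - name: Configure Chrony servers
          ansible.builtin.lineinfile:
            path: /etc/chrony.conf
            line: "server sg.pool.ntp.org iburst"
            insertafter: '^#.*server 3.centos.pool.ntp.org iburst'
            state: present

        - name: Enable Chrony service
          ansible.builtin.service:
            name: chronyd
            state: started
            enabled: true
```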

Further Reading:

  1. Grouping Tasks with Block in Ansible

Unable to Install hdf5, hdf5-devel and hdf5-static on Rocky Linux 8.7

If you are doing a dnf install on hdf5 packages, you will notice errors like the one below

nothing provides libsz.so.2()(64bit) needed by hdf5-1.10.5-4.el8.x86_64
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)

To resolve the issue, you will need to install and enable PowerTools

Step 1: Install DNF plugins package

dnf install dnf-plugins-core

Step 2: Install EPEL

EPEL is needed because some software built from source requires dependencies that are only available there.

dnf install epel-release

Step 3: Enable PowerTools repository on Rocky Linux 8

dnf config-manager --set-enabled powertools

Step 4: Now try installing HDF5

dnf install hdf5 hdf5-devel hdf5-static