Issues when Installing Docker on Rocky Linux 8.10

I was installing Docker on Rocky Linux 8.10. These were my steps:

dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
dnf install docker-ce docker-ce-cli containerd.io

I immediately got this error:

Error: 
 Problem 1: problem with installed package podman-4:4.9.4-1.module+el8.10.0+1815+5fe7415e.x86_64
  - package podman-4:4.9.4-1.module+el8.10.0+1815+5fe7415e.x86_64 from @System requires runc >= 1.0.0-57, but none of the providers can be installed
  - package podman-4:4.9.4-1.module+el8.10.0+1815+5fe7415e.x86_64 from appstream requires runc >= 1.0.0-57, but none of the providers can be installed
  - package podman-4:4.9.4-1.module+el8.10.0+1825+623b0c20.x86_64 from appstream requires runc >= 1.0.0-57, but none of the providers can be installed
  - package podman-4:4.9.4-12.module+el8.10.0+1843+6892ab28.x86_64 from appstream requires runc >= 1.0.0-57, but none of the providers can be installed
  - package podman-4:4.9.4-13.module+el8.10.0+1871+e6fa1069.x86_64 from appstream requires runc >= 1.0.0-57, but none of the providers can be installed
  - package podman-4:4.9.4-13.module+el8.10.0+1874+ce489889.x86_64 from appstream requires runc >= 1.0.0-57, but none of the providers can be installed

To resolve the issue, add the --allowerasing flag, which allows dnf to remove the conflicting podman packages:

dnf install docker-ce docker-ce-cli containerd.io --allowerasing
================================================================================
 Package                   Arch   Version                Repository        Size
================================================================================
Installing:
 containerd.io             x86_64 1.6.32-3.1.el8         docker-ce-stable  35 M
     replacing  runc.x86_64 1:1.1.12-1.module+el8.10.0+1815+5fe7415e
 docker-ce                 x86_64 3:26.1.3-1.el8         docker-ce-stable  27 M
 docker-ce-cli             x86_64 1:26.1.3-1.el8         docker-ce-stable 7.8 M
Installing dependencies:
 libcgroup                 x86_64 0.41-19.el8            baseos            69 k
Installing weak dependencies:
 docker-buildx-plugin      x86_64 0.14.0-1.el8           docker-ce-stable  14 M
 docker-ce-rootless-extras x86_64 26.1.3-1.el8           docker-ce-stable 5.0 M
 docker-compose-plugin     x86_64 2.27.0-1.el8           docker-ce-stable  13 M
Removing dependent packages:
 buildah                   x86_64 1:1.34.0-1.module+el8.10.0+1815+5fe7415e
                                                         @AppStream        31 M
 cockpit-podman            noarch 84.1-1.module+el8.10.0+1815+5fe7415e
                                                         @AppStream       682 k
 containers-common         x86_64 2:1-81.module+el8.10.0+1815+5fe7415e
                                                         @AppStream       580 k
 podman                    x86_64 4:4.9.4-1.module+el8.10.0+1815+5fe7415e
                                                         @AppStream        52 M
 podman-catatonit          x86_64 4:4.9.4-1.module+el8.10.0+1815+5fe7415e
                                                         @AppStream       794 k

Transaction Summary
================================================================================
Install  7 Packages
Remove   5 Packages

Total download size: 102 M
Is this ok [y/N]: y
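
After the transaction completes, you would typically enable and start the Docker service and run a quick test container to confirm the installation works:

systemctl enable --now docker
docker run --rm hello-world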


Unable to run hydra_bstrap_proxy when using mpiexec

If you are facing an issue similar to the error below, where the reasons provided are:

  1. Host is unavailable. Please check that all hosts are available.
  2. Cannot launch hydra_bstrap_proxy or it crashed on one of the hosts. Make sure hydra_bstrap_proxy is available on all hosts and it has right permissions.
  3. Firewall refused connection. Check that enough ports are allowed in the firewall and specify them with the I_MPI_PORT_RANGE variable.
  4. pbs bootstrap cannot launch processes on remote host. You may try using -bootstrap option to select alternative launcher.
[mpiexec@hpc-node1] check_exit_codes (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:117): unable to run bstrap_proxy on hpc-npriv-g001 (pid 2778558, exit code 256)
[mpiexec@hpc-node1] poll_for_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:159): check exit codes error
[mpiexec@hpc-node1] HYD_dmx_poll_wait_for_proxy_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:212): poll for event error
[mpiexec@hpc-node1] HYD_bstrap_setup (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:1065): error waiting for event
[mpiexec@hpc-node1] HYD_print_bstrap_setup_error_message (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:1027): error setting up the bootstrap proxies
[mpiexec@hpc-node1] Possible reasons:
[mpiexec@hpc-node1] 1. Host is unavailable. Please check that all hosts are available.
[mpiexec@hpc-node1] 2. Cannot launch hydra_bstrap_proxy or it crashed on one of the hosts. Make sure hydra_bstrap_proxy is available on all hosts and it has right permissions.
[mpiexec@hpc-node1] 3. Firewall refused connection. Check that enough ports are allowed in the firewall and specify them with the I_MPI_PORT_RANGE variable.
[mpiexec@hpc-node1] 4. pbs bootstrap cannot launch processes on remote host. You may try using -bootstrap option to select alternative launcher.

The solution is to modify your mpiexec command to use the ssh bootstrap launcher:

$ mpiexec -bootstrap ssh ......

For example

$ mpiexec -bootstrap ssh python3 python.text

Alternatively, you can put this line in your .bashrc or PBS script:

export I_MPI_HYDRA_BOOTSTRAP=ssh
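
For example, a minimal PBS job script applying this setting might look like the sketch below; the resource requests and the script name myscript.py are placeholders to adapt to your own job:

#!/bin/bash
#PBS -l select=2:ncpus=16
#PBS -l walltime=01:00:00
# placeholder resource request above; adjust to your cluster
cd $PBS_O_WORKDIR
export I_MPI_HYDRA_BOOTSTRAP=ssh
mpiexec -n 32 python3 myscript.py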


Issues installing CUDA using the RHEL8 repo

The forum thread found here helped me with the RHEL8 repo. I was using Rocky Linux 8.10.

dnf install cuda
Last metadata expiration check: 2:34:35 ago on Tue 14 Jan 2025 08:28:15 AM +08.
Error: 
 Problem: package cuda-12.6.3-1.x86_64 from cuda-rhel8-x86_64 requires nvidia-open >= 560.35.05, but none of the providers can be installed
  - cannot install the best candidate for the job
  - package nvidia-open-3:560.28.03-1.noarch from cuda-rhel8-x86_64 is filtered out by modular filtering
  - package nvidia-open-3:560.35.03-1.noarch from cuda-rhel8-x86_64 is filtered out by modular filtering
  - package nvidia-open-3:560.35.05-1.el8.noarch from cuda-rhel8-x86_64 is filtered out by modular filtering
  - package nvidia-open-3:565.57.01-1.el8.noarch from cuda-rhel8-x86_64 is filtered out by modular filtering
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)

In the forum, there was a workaround solution.

Go to the NVIDIA download page, select the V100 driver, and download the driver rpm. Then install it:

dnf install ./nvidia..........x86_64.rpm

Remove the old CUDA packages if you have them installed, and reset the nvidia-driver module stream.

dnf remove cuda-toolkit nvidia-driver-cuda
dnf module reset nvidia-driver

Install the dkms driver module stream and CUDA:

dnf module install nvidia-driver:latest-dkms 
dnf install cuda-toolkit nvidia-driver-cuda

Alternatively, this method works, especially if you have not manually installed or uninstalled the drivers already:

dnf install cuda-toolkit nvidia-driver-cuda
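
Once the driver and toolkit are installed (and after a reboot so the new kernel module is loaded), a quick sanity check is to query the driver and the compiler version. The /usr/local/cuda path is an assumption; adjust it if your toolkit landed elsewhere.

nvidia-smi
/usr/local/cuda/bin/nvcc --version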

Using grubby to configure bootloader menu for Rocky Linux 8

grubby is a command-line tool for configuring bootloader menu entries on Linux. Here are some commands you may find useful.

List Kernels

# grubby --info=ALL | grep ^kernel
kernel="/boot/vmlinuz-4.18.0-553.16.1.el8_10.x86_64"
kernel="/boot/vmlinuz-4.18.0-513.18.1.el8_9.x86_64"
kernel="/boot/vmlinuz-4.18.0-425.3.1.el8.x86_64"
kernel="/boot/vmlinuz-0-rescue-1fd272f10209466d81c89276e275d210"

Check Defaults Loading

# grubby --default-kernel
/boot/vmlinuz-4.18.0-553.16.1.el8_10.x86_64
# grubby --default-index
0

Change Default Loading

# grubby --set-default="/boot/vmlinuz-4.18.0-513.18.1.el8_9.x86_64"

Verify Default Loading

# grubby --default-kernel
/boot/vmlinuz-4.18.0-513.18.1.el8_9.x86_64
# grubby --default-index
1
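
You can also set the default entry by index (as reported by --default-index) instead of by kernel path:

# grubby --set-default-index=1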

Installing Octopus-15.0.0 with OpenMPI on Rocky Linux 8

This is an update to the blog entry Basic Configuration of Octopus 5.0.0 with OpenMPI on CentOS 6

Prerequisites:

  • GNU Compilers – 12.3
  • OpenMPI – 4.1.5
  • FFTW – 3.3.10
  • LAPACK/BLAS – (Comes with Rocky Linux 8)
  • GSL – 2.7.1

To install Octopus using autoconf, you will need to dnf install the autoconf, automake, and autogen packages:

dnf install autoconf automake autogen

Preparing the Configure file using Autoreconf tools

After downloading from https://octopus-code.org/documentation/15/releases/ and unpacking the tarball, you must prepare the environment to generate the configure file. Do take a look at the INSTALL and README files.

autoreconf --install

Prepare the PATH and LD_LIBRARY_PATH Environment

If you are using an Environment Modules setup, it will be much easier (see the sketch after the exports below); if not, you have to configure $PATH and $LD_LIBRARY_PATH yourself:

export PATH=$PATH:/usr/local/openmpi-4.1.5/bin:...........
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/openmpi-4.1.5/lib...................

export FC=mpif90
export CC=mpicc
export FCFLAGS="-O3"
export CFLAGS="-O3"
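
If your site provides environment modules for these libraries, the equivalent setup might look like the line below; the module names are hypothetical and depend on how your site names them:

# module names are hypothetical; check what is available with 'module avail'
module load gcc/12.3 openmpi/4.1.5 fftw/3.3.10 gsl/2.7.1 libxc/6.2.2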

Prepare the Octopus Setup Environment

./configure \
--prefix=/usr/local/octopus-15.0.0 \
--with-libxc-prefix=/usr/local/libxc-6.2.2 \
--with-libxc-include=/usr/local/libxc-6.2.2/include \
--with-gsl-prefix=/usr/local/gsl-2.7.1 \
--with-blas=/usr/lib64/libblas.a \
--with-arpack=/usr/lib64/libarpack.so.2 \
--with-fft-lib="-L/usr/local/fftw-3.3.10/lib" \
--disable-zdotc-test \
--enable-single \
--enable-mpi
make -j 16
make install
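
After make install, a quick way to confirm the build is to add the install prefix to your PATH and check that the octopus binaries are in place:

export PATH=/usr/local/octopus-15.0.0/bin:$PATH
ls /usr/local/octopus-15.0.0/bin
which octopus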

Adding cgroups control to GPGPU Servers for PBS Professional

After adding a GPGPU node to PBS Professional, you first have to make sure it is in the right queue:

qmgr -c "set node gpu-node resources_available.Qlist = gpu_v100"

Locate the cgroups.json2 configuration file in the directory where you placed it. Check with the following:

ll cgroups.json2
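
If the file is not there, one way to obtain a copy is to export the current hook configuration; this assumes the hook is named cgroups, as in the import command further down:

qmgr -c "export hook cgroups application/x-config default" > cgroups.json2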

Once the file is there, edit it.

vim cgroups.json2

Find the "run_only_on_hosts" entry and add the node:

"run_only_on_hosts" : [ "gpu-node1", "gpu-node2", "gpu-node3", "gpu-node4],
        "cgroup":
......
......
......

Use qmgr to import the file:

qmgr -c "import hook cgroups application/x-config default cgroups.json2"

Check that PBS has detected the node correctly:

pbsnodes -aSj | grep gpu-node1

Disabling IPv6 on Rocky Linux 8 with Ansible

If you wish to disable IPv6 on Rocky Linux 8, there is a wonderful writeup on the script found at https://github.com/juju4/ansible-ipv6/blob/main/tasks/ipv6-disable.yml which you may find useful. If you just need to disable it temporarily without disruption (assuming you have not been using IPv6 at all), a sysctl task like the following will do:

- name: Disable IPv6 with sysctl
  ansible.posix.sysctl:
    name: "{{ item }}"
    value: "1"
    state: "present"
    reload: "yes"
  with_items:
    - net.ipv6.conf.all.disable_ipv6
    - net.ipv6.conf.default.disable_ipv6
    - net.ipv6.conf.lo.disable_ipv6

If you can tolerate a bit of disruption, you may want to put the setting in the network configuration and restart the network services:

- name: RedHat | disable ipv6 in sysconfig/network
  ansible.builtin.lineinfile:
    dest: /etc/sysconfig/network
    regexp: "^{{ item.regexp }}"
    line: "{{ item.line }}"
    mode: '0644'
    backup: true
    create: true
  with_items:
    - { regexp: 'NETWORKING_IPV6=.*', line: 'NETWORKING_IPV6=NO' }
    - { regexp: 'IPV6INIT=.*', line: 'IPV6INIT=no' }
  notify:
    - Restart network
    - Restart NetworkManager
  when: ansible_os_family == 'RedHat'
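
The notify entries above assume that matching handlers exist elsewhere in your playbook or role. A minimal sketch of what they might look like (note that the network service only exists if the legacy network-scripts package is installed on Rocky Linux 8):

- name: Restart network
  ansible.builtin.service:
    name: network
    state: restarted

- name: Restart NetworkManager
  ansible.builtin.service:
    name: NetworkManager
    state: restarted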