By default, Ansible runs each task on multiple hosts in parallel, which speeds up automation in large inventories. But this is not always ideal in a load-balanced environment, where upgrading all the servers simultaneously may cause a loss of service. How do we use Ansible to run the updates at different times? I use the “serial” keyword in the play, ahead of the roles it executes.
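As a rough illustration of the “serial” keyword, here is a minimal sketch of a play that works through the hosts one at a time. The inventory group webservers and the role name patch_os are hypothetical placeholders, not names from my own playbooks.

```yaml
---
- name: Patch load-balanced servers one host at a time
  hosts: webservers          # hypothetical inventory group
  become: true
  serial: 1                  # finish the whole play on one host before starting the next
  roles:
    - patch_os               # hypothetical role that performs the update
```

“serial” also accepts a larger number or a percentage such as "25%", if one host at a time is too slow for your inventory.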
Ansible’s parallel processes are known as forks, and the default number of forks is five. In other words, Ansible attempts to run automation jobs on 5 hosts simultaneously. The more forks you set, the more resources are used on the Ansible control node.
How do you implement it? Just edit the ansible.cfg file and look for the “forks” parameter. You can use the command “ansible-config view” to display the contents of the current ansible.cfg.
Rescue blocks specify tasks to run when an earlier task in a block fails. This approach is similar to exception handling in many programming languages. Ansible only runs rescue blocks after a task returns a ‘failed’ state. Bad task definitions and unreachable hosts will not trigger the rescue block.
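To make the behaviour concrete, here is a minimal sketch of a block with a rescue section. The failing command is deliberate and exists only for demonstration; the play runs against localhost so it can be tried safely.

```yaml
---
- name: Demonstrate block and rescue
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Attempt a task and recover if it fails
      block:
        - name: Fail on purpose (demonstration only)
          ansible.builtin.command: /bin/false

        - name: Skipped, because the previous task failed
          ansible.builtin.debug:
            msg: "This message is never shown"
      rescue:
        - name: Runs only after a task in the block returns 'failed'
          ansible.builtin.debug:
            msg: "Recovered from failed task: {{ ansible_failed_task.name }}"
```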
If you have read the blog entry Using Ansible to automate Security Patch on Rocky Linux 8, you may want to consider capturing the logs and sending a notification to MS Teams, if that is your communication channel. This is a follow-up to that blog entry.
Notification (Option 1: Ansible Command used if just checking)
You can post a short notification to MS Teams to let the engineers know that the logs have been written to /var/log/ansible_logs
- name: Send a notification to MS-Teams that Test Run (No Patching) is completed
  run_once: true
  uri:
    url: "https://xxxxxxx.webhook.office.com/webhookb2/xxxxxxxxxxxxxxxxxxxxxxxxx"
    method: POST
    body_format: json
    body:
      title: "Test Patch Run on {{ansible_date_time.date}}"
      text: "Test Run only. System has not been Patched Yet. Logs saved at: /var/log/ansible_logs/patch-list_{{ansible_date_time.date}}.log"
  when:
    - register_update_success is defined
    - ext_permit_flag == "no"
Writing to MS Teams to capture the success or failure of the update (Option 2: Ansible Command used when ready for Patching)
- name: Send a notification to MS-Teams Channel if Upgrade failed
  run_once: true
  uri:
    url: "https://xxxxx.webhook.office.com/webhookb2/xxxxxx"
    method: POST
    body_format: json
    body:
      title: "Patch Run on {{ansible_date_time.date}}"
      text: "Patch Update has Failed"
  when:
    - register_update_success is not defined
    - ext_permit_flag == "yes"
- name: Send a notification to MS-Teams Channel if Upgrade succeeded
  run_once: true
  uri:
    url: "https://entuedu.webhook.office.com/webhookb2/xxxxxx"
    method: POST
    body_format: json
    body:
      title: "Patch Run on {{ansible_date_time.date}}"
      text: "Patch Update is Successful. Logs saved at: /var/log/ansible_logs/patch-list_{{ansible_date_time.date}}.log"
  when:
    - register_update_success is defined
    - ext_permit_flag == "yes"
If you intend to use Ansible to patch servers, you may need an external variable to decide whether you only want to look at the list of available updates or actually patch the OS. The setup consists of 3 parts.
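The idea of switching between "list only" and "patch" with an external variable could be sketched roughly as follows. The variable name ext_permit_flag matches the one used in the notification tasks; the play layout, the default value and the registered names available_updates and patch_result are assumptions made for illustration only, not the actual 3 parts.

```yaml
---
- name: List or apply updates depending on an external variable
  hosts: all
  become: true
  vars:
    ext_permit_flag: "no"            # assumed default; override on the command line
  tasks:
    - name: Only list the available updates
      ansible.builtin.dnf:
        list: updates
      register: available_updates    # hypothetical variable name
      when: ext_permit_flag == "no"

    - name: Actually patch the OS
      ansible.builtin.dnf:
        name: "*"
        state: latest
      register: patch_result         # hypothetical variable name
      when: ext_permit_flag == "yes"
```

The flag would then be passed at run time with something like `ansible-playbook patch.yml -e ext_permit_flag=yes`, where patch.yml is a placeholder filename.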
Ansible is great for configuring host-based firewalls like firewalld. One thing you will note is that we use the with_items parameter a lot. It is very useful in this case, since each item carries a number of parameters.
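As a minimal sketch of that pattern, the task below loops over items that each carry several parameters. It assumes the ansible.posix collection is installed; the zones and ports are example values, not my actual rules.

```yaml
---
- name: Configure firewalld ports with with_items
  hosts: all
  become: true
  tasks:
    - name: Open the listed ports in their zones
      ansible.posix.firewalld:
        zone: "{{ item.zone }}"
        port: "{{ item.port }}"
        permanent: true
        immediate: true
        state: enabled
      with_items:
        - { zone: "public", port: "80/tcp" }     # example values only
        - { zone: "public", port: "443/tcp" }
```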
If you just want to automate the Linux portion, here is something you may wish to consider.
Update the sshd_config template. The most important point is that “PasswordAuthentication no” and “ChallengeResponseAuthentication yes” are present. The whole sshd_config template is too large for me to put into the blog.
.....
.....
# To disable tunneled clear text passwords, change to no here!
#PasswordAuthentication yes
#PermitEmptyPasswords no
PasswordAuthentication no
# Change to no to disable s/key passwords
#ChallengeResponseAuthentication yes
ChallengeResponseAuthentication yes
.....
.....
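A template like this could be pushed out with a play along the following lines. This is only a sketch: the template filename sshd_config.j2 and the handler name are assumptions, and the validate step asks sshd to check the rendered file before it replaces the live configuration.

```yaml
---
- name: Deploy the sshd_config template
  hosts: all
  become: true
  tasks:
    - name: Render sshd_config from the template
      ansible.builtin.template:
        src: sshd_config.j2                   # assumed template filename
        dest: /etc/ssh/sshd_config
        owner: root
        group: root
        mode: "0600"
        validate: /usr/sbin/sshd -t -f %s     # refuse to install a broken config
      notify: Restart sshd

  handlers:
    - name: Restart sshd
      ansible.builtin.service:
        name: sshd
        state: restarted
```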
Step 3: Install the Kernel-Headers and Kernel-Devel
The CUDA Driver requires that the kernel headers and development packages for the running version of the kernel be installed at the time of the driver installation, as well as whenever the driver is rebuilt.
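One possible way to express this requirement in Ansible is to pin the packages to the running kernel through the ansible_kernel fact. This is a sketch under that assumption, not necessarily the exact task I use, and it needs fact gathering to be enabled.

```yaml
---
- name: Install kernel packages matching the running kernel
  hosts: all
  become: true
  tasks:
    - name: Install kernel-headers and kernel-devel for the running kernel
      ansible.builtin.dnf:
        name:
          - "kernel-headers-{{ ansible_kernel }}"   # ansible_kernel is the output of uname -r
          - "kernel-devel-{{ ansible_kernel }}"
        state: present
      when:
        - ansible_os_family == "RedHat"
        - ansible_distribution_major_version == "8"
```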
To install the Display Driver, the Nouveau drivers must first be disabled. I use a template to disable them. I created a template called blacklist-nouveau-conf.j2. Here is the content:
blacklist nouveau
options nouveau modeset=0
The Ansible script for disabling Nouveau using a template
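A task of this kind might look roughly like the sketch below. The destination path /etc/modprobe.d/blacklist-nouveau.conf, the registered name nouveau_blacklist and the dracut step are assumptions for illustration; only the template name blacklist-nouveau-conf.j2 comes from the text above.

```yaml
---
- name: Disable the Nouveau driver
  hosts: all
  become: true
  tasks:
    - name: Install the Nouveau blacklist from the template
      ansible.builtin.template:
        src: blacklist-nouveau-conf.j2
        dest: /etc/modprobe.d/blacklist-nouveau.conf   # assumed destination path
        owner: root
        group: root
        mode: "0644"
      register: nouveau_blacklist                      # hypothetical variable name

    - name: Regenerate the initramfs so the blacklist applies at boot (assumed step)
      ansible.builtin.command: dracut --force
      when: nouveau_blacklist.changed
```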
Step 6: Reboot if there are changes to Drivers and CUDA
- name: Reboot if there are changes to Drivers or CUDA
  ansible.builtin.reboot:
  when:
    - install_driver.changed or install_cuda.changed
    - ansible_os_family == "RedHat"
    - ansible_distribution_major_version == "8"
Aftermath
After the reboot, try running the “nvidia-smi” command. Hopefully, you should see