Pre-check before restarting the NSD Nodes

Before restarting NSD nodes, quorum manager nodes, or other critical nodes, check the following to confirm the file system is in good order.

1. Make sure all three quorum nodes are active.

# mmgetstate -N quorumnodes

If any node is not active, do not proceed.

2. Make sure the file system is mounted on the nodes

# mmlsmount gpfs0

If the file system is not mounted somewhere, we should try to resolve it first.
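The check in step 1 can be scripted so that a restart is blocked automatically. The sketch below is only an illustration: the function name `all_nodes_active` is hypothetical, and it assumes each node row of the mmgetstate output has the form "number, node name, state".

```shell
# Reads mmgetstate-style output on stdin.
# Succeeds only if at least one node row is present and every row ends in "active".
# Assumption: node rows start with a numeric node number; header lines do not.
all_nodes_active() {
    awk '
        $1 ~ /^[0-9]+$/ { rows++; if ($NF != "active") bad++ }
        END { if (rows > 0 && bad == 0) exit 0; exit 1 }
    '
}
```

Possible usage: `mmgetstate -N quorumnodes | all_nodes_active || echo "Quorum nodes not all active, do not proceed"`.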

Spectrum Scale User Group @ London (April)

There were good and varied topics discussed at the Spectrum Scale

Basic Tuning of RDMA Parameters for Spectrum Scale

If your cluster shows symptoms of overload and GPFS keeps reporting “overloaded” in its logs, as in the lines below, you may see long waiters and sometimes deadlocks.

Wed Apr 11 15:53:44.232 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 15:55:24.488 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 15:57:04.743 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 15:58:44.998 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 16:00:25.253 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 16:28:45.601 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 16:33:56.817 2018: [N] sdrServ: Received deadlock notification from
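To get a quick sense of how often a node reports this, the matching log lines can be counted. This is a minimal sketch; the function name `count_overloaded` is hypothetical and the log file path is passed in by the caller.

```shell
# Prints how many lines in the given log file contain the 'overloaded' broadcast.
# -c counts matching lines, -F treats the pattern as a fixed string.
count_overloaded() {
    grep -cF "Sending 'overloaded' status" "$1"
}
```

Possible usage, assuming the GPFS log is at /var/adm/ras/mmfs.log.latest on your nodes: `count_overloaded /var/adm/ras/mmfs.log.latest`.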

Increase scatterBufferSize to a value that matches the IB fabric
One of the first parameters to tune is scatterBufferSize. According to the wiki, it can be set to 131072 for FDR10 and 262144 for FDR14.

The default value of 32768 may perform OK. If the CPU utilization on the NSD IO servers is observed to be high and client IO performance is lower than expected, increasing the value of scatterBufferSize on the clients may improve performance.

# mmchconfig scatterBufferSize=131072

Other parameters can also be tuned, but changing scatterBufferSize worked immediately for me.
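The fabric-to-value mapping quoted from the wiki can be kept in a small helper so the right number is used per cluster. This is only a sketch: the function name `scatter_buffer_size_for` and the fabric labels passed to it are hypothetical, and any fabric not listed falls back to the default of 32768.

```shell
# Prints the scatterBufferSize suggested above for a given IB fabric type.
scatter_buffer_size_for() {
    case "$1" in
        FDR10) echo 131072 ;;
        FDR14) echo 262144 ;;
        *)     echo 32768 ;;   # default value mentioned above
    esac
}
```

Possible usage: `mmchconfig scatterBufferSize="$(scatter_buffer_size_for FDR10)"`.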

Disable verbsRdmaSend

# mmchconfig verbsRdmaSend=no -N nsd1,nsd2

Verify the setting has taken effect

# mmfsadm dump config | grep verbsRdmasPerNode

Increase verbsRdmasPerNode to 514 for NSD Nodes

# mmchconfig verbsRdmasPerNode=514 -N nsd1,nsd2


  1. Best Practices RDMA Tuning

Cannot initialize RDMA protocol on Cluster with Platform LSF

If you encounter the errors below during an application run and your scheduler is Platform LSF, there is a simple solution.


explicit_dp: Rank 0:13: MPI_Init_thread: didn't find active interface/port
explicit_dp: Rank 0:13: MPI_Init_thread: Can't initialize RDMA device
explicit_dp: Rank 0:13: MPI_Init_thread: Internal Error: Cannot initialize RDMA protocol
MPI Application rank 13 exited before MPI_Init() with status 1
mpirun: Broken pipe

In this case, the locked-memory limit was set to unlimited in /etc/security/limits.conf, but this was not sufficient.
The MPI jobs were started under LSF, but the LSF daemons had been started with a very small locked-memory limit.

Set the locked-memory limit to unlimited in /etc/init.d/lsf by adding the ‘ulimit -l unlimited’ command before the LSF profile is sourced.

ulimit -l unlimited
. /opt/lsf/conf/profile.lsf
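To confirm that jobs actually inherit the new limit, the value reported by `ulimit -l` can be checked from inside a job. A minimal sketch follows; the function name `memlock_status` is hypothetical.

```shell
# Takes the value printed by `ulimit -l` and reports whether it is unlimited.
memlock_status() {
    case "$1" in
        unlimited) echo "ok" ;;
        *)         echo "too low: $1" ;;
    esac
}
```

Possible usage inside a job script: `memlock_status "$(ulimit -l)"`. If the result is not "ok", the LSF daemons on that host were probably started before the change and need to be restarted.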


  1. HP HPC Linux Value Pack 3.1 – Platform MPI job failed

Disable SELinux in CentOS 7

1. Check the SELinux Status on CentOS 7

# sestatus
SELinux status: enabled
SELinuxfs mount: /sys/fs/selinux
SELinux root directory: /etc/selinux
Loaded policy name: targeted
Current mode: enforcing
Mode from config file: enforcing
Policy MLS status: enabled
Policy deny_unknown status: allowed
Max kernel policy version: 28

2. Disable SELinux Temporarily

# setenforce 0

2a. Check Status

# sestatus
SELinux status: enabled
SELinuxfs mount: /sys/fs/selinux
SELinux root directory: /etc/selinux
Loaded policy name: targeted
Current mode: permissive
Mode from config file: permissive
Policy MLS status: enabled
Policy deny_unknown status: allowed
Max kernel policy version: 28

3. Disable SELinux Permanently

Edit /etc/selinux/config, set SELINUX=disabled, and reboot.

# vim /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
# SELINUXTYPE= can take one of three two values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.

3a. Check Status

# sestatus
SELinux status: disabled
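The permanent change in step 3 can also be made without opening an editor. The sketch below is an illustration only: the function name `set_selinux_disabled` is hypothetical, it defaults to /etc/selinux/config, and it keeps a backup copy with a .bak suffix.

```shell
# Rewrites the SELINUX= line to "disabled" in the given config file.
# The pattern requires "=" directly after SELINUX, so SELINUXTYPE= is not touched.
set_selinux_disabled() {
    conf="${1:-/etc/selinux/config}"
    sed -i.bak 's/^SELINUX=.*/SELINUX=disabled/' "$conf"
}
```

Possible usage as root: `set_selinux_disabled`, followed by a reboot so that the new mode is picked up.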

Setting up NTP in CentOS 7

Prerequisites Step 1: Ensure you are in the correct time zone

# timedatectl
      Local time: Wed 2018-09-12 13:48:31 +08
  Universal time: Wed 2018-09-12 05:48:31 UTC
        RTC time: Wed 2018-09-12 05:48:31
       Time zone: Asia/Singapore (+08, +0800)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: n/a

Prerequisites Step 2: List Time Zone

# timedatectl list-timezones

Prerequisites Step 3: Set Time Zone

# timedatectl set-timezone Asia/Singapore
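The three prerequisite steps can be combined so the time zone is only changed when it differs from the one you want. This is a minimal sketch; the function name `current_timezone` is hypothetical, and it assumes the "Time zone:" line looks like the timedatectl output shown above.

```shell
# Reads timedatectl output on stdin and prints the zone name,
# e.g. "Asia/Singapore" from "Time zone: Asia/Singapore (+08, +0800)".
current_timezone() {
    awk -F': *' '/Time zone:/ { split($2, parts, " "); print parts[1] }'
}
```

Possible usage: `[ "$(timedatectl | current_timezone)" = "Asia/Singapore" ] || timedatectl set-timezone Asia/Singapore`.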


Step 1: Yum Install

NTP can be installed from the CentOS repositories with yum

# yum install ntp

Step 2: Edit the Public Time Servers

Once you have installed the ntp package, go to the official NTP Public Pool Time Servers site. For Singapore, you can use the Singapore pool zone; add the following to your /etc/ntp.conf file:

server iburst
server iburst
server iburst
server iburst

Step 3: Allow the clients from the network to sync with this server

Restrict which clients, from which network, are allowed to query and sync time

restrict netmask nomodify notrap

Step 4: Record all NTP server issues into one dedicated log file. Edit /etc/ntp.conf

logfile /var/log/ntp.log

Step 5: Add Firewall Rule and Start Services

# firewall-cmd --add-service=ntp --permanent
# firewall-cmd --reload
# systemctl start ntpd
# systemctl enable ntpd
# systemctl status ntpd

Step 6: Verify Time Sync

# ntpq -p
# date -R

Or query or synchronize against a selected pool of time servers

# ntpdate -q
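In the `ntpq -p` output, the peer that ntpd is currently synchronized to is marked with a leading asterisk. The sketch below checks for such a line; the function name `has_sync_peer` is hypothetical.

```shell
# Reads `ntpq -p` output on stdin.
# Succeeds if some peer line starts with "*", the tally code for the selected peer.
has_sync_peer() {
    awk '/^\*/ { found = 1 } END { exit !found }'
}
```

Possible usage: `ntpq -p | has_sync_peer && echo "synchronized" || echo "no peer selected yet"`. Right after ntpd starts it can take a few minutes before a peer is selected.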


  1. Setting Up “NTP (Network Time Protocol) Server” in RHEL/CentOS 7 (by

Set hostname using hostnamectl for CentOS 7

1. Listing hostname using “hostnamectl” or “hostnamectl status”

[root@localhost ~]# hostnamectl
Static hostname:
Icon name: computer-server
Chassis: server
Machine ID: aaaaaaaaaaaaa
Boot ID: ddddddddddd
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 3.10.0-327.el7.x86_64
Architecture: x86-64

2. Setting static host-name using hostnamectl

# hostnamectl set-hostname "" --static

3. Delete static host-name using hostnamectl

# hostnamectl set-hostname "" --static

IBM Spectrum Scale Development Blogs for (Q1 2018)

Here is the list of development blogs for this quarter (Q1 2018). As discussed in the User Groups, I am passing it along:

GDPR Compliance and Unstructured Data Storage

IBM Spectrum Scale for Linux on IBM Z - Release 5.0 features and highlights

Management GUI enhancements in IBM Spectrum Scale release 5.0.0

IBM Spectrum Scale 5.0.0 - What's new in NFS?

Benefits and implementation of Spectrum Scale sudo wrappers

IBM Spectrum Scale: Big Data and Analytics Solution Brief

Variant Sub-blocks in Spectrum Scale 5.0

Compression support in Spectrum Scale 5.0.0

IBM Spectrum Scale Versus Apache Hadoop HDFS

ESS Fault Tolerance

Genomic Workloads - How To Get it Right From Infrastructure Point Of View.

IBM Spectrum Scale On AWS Cloud: This video explains how to deploy IBM Spectrum Scale on AWS. This solution helps the users who require highly available access to a shared name space across multiple instances with good performance, without requiring an in-depth knowledge of IBM Spectrum Scale.

Detailed Demo :
Brief Demo :

Fixing out of memory Issues in Rsync

If you are running rsync and encounter an rsync out of memory error, take a look at this article (Rsync out of memory? Try this…). You need to add an additional parameter (--no-inc-recursive) to the rsync command.

According to the article, the out of memory failure occurred when rsync attempted to load all the filenames and info into RAM at startup.

# rsync -lH -rva --no-inc-recursive --progress gromacs remote_server:/usr/local