Supported Platform
Year: 2018
Disable FirewallD Services on CentOS 7
Note that the firewall on a CentOS 7 system is enabled by default.
Step 1: To check the status of CentOS 7 FirewallD
# systemctl status firewalld.service
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:firewalld(1)
The output above shows that firewalld is already disabled and inactive on this machine; if yours shows active (running), continue with the steps below.
Step 2: To stop the FirewallD
# systemctl stop firewalld.service
Step 3: To completely disable the firewalld service
# systemctl disable firewalld.service
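The two steps above can be wrapped into a small function; a minimal sketch (the firewalld.service unit name is standard on CentOS 7; run as root):

```shell
# Stop firewalld now and keep it from starting at boot.
# Optionally mask the unit so other services cannot re-enable it.
disable_firewalld() {
    systemctl stop firewalld.service &&
    systemctl disable firewalld.service
    # systemctl mask firewalld.service   # uncomment for a harder disable
}

# usage (as root): disable_firewalld
```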
Error while loading shared libraries: libXm.so.4 on CentOS 7
If you are installing something and you see the error “error while loading shared libraries: libXm.so.4”, it is quite easy to solve. Just do the following:
# yum install motif motif-devel
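Before installing, you can confirm which shared libraries a binary actually fails to resolve with ldd; a small helper sketch (the binary path below is hypothetical, substitute your own):

```shell
# List the shared libraries a binary cannot resolve.
missing_libs() {
    ldd "$1" | grep 'not found'
}

# usage: missing_libs /path/to/your/binary
```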
Resolving “lsb_release not found” on CentOS 7
I was installing ABAQUS 2017 on CentOS 7 when I encountered an error. lsb_release prints distribution-specific information. Strangely, the command is missing by default on the CentOS 7 distribution.
[root@node-h001 1]# ./StartGUI.sh
CurrentMediaDir initial="."
CurrentMediaDir="/root/abaqus2017/AM_SIM_Abaqus_Extend.AllOS/1"
Current operating system: "Linux"
./StartGUI.sh[21]: .[31]: .: line 3: lsb_release: not found
DSY_OS_Release=""
Unknown linux release ""
exit 8
Resolution
# yum install redhat-lsb-core
Verification
[root@node-h001 1]# lsb_release
LSB Version:	:core-4.1-amd64:core-4.1-noarch
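In install scripts it can help to guard against the missing command up front; a sketch (same redhat-lsb-core package as above):

```shell
# Install redhat-lsb-core only if lsb_release is not already present.
ensure_lsb_release() {
    if ! command -v lsb_release >/dev/null 2>&1; then
        yum install -y redhat-lsb-core
    fi
}
```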
Pre-check before restarting the NSD Nodes
Before restarting the NSD nodes, quorum manager nodes, or other critical nodes, check the following first to ensure the file system is in good order.
1. Make sure all three quorum nodes are active.
# mmgetstate -N quorumnodes
If any machine is not active, do *not* proceed.
2. Make sure the file system is mounted on the machines
# mmlsmount gpfs0
If the file system is not mounted on some nodes, try to resolve that first before restarting anything.
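The first check can be scripted; a sketch that assumes mmgetstate's tabular output, with the node state in the third column of the numbered rows (verify the column layout against your GPFS version's output before relying on it):

```shell
# Return non-zero if any quorum node is in a state other than "active".
quorum_ok() {
    mmgetstate -N quorumnodes |
        awk '$1 ~ /^[0-9]+$/ && $3 != "active" { bad = 1 } END { exit bad }'
}

# usage: quorum_ok && echo "safe to proceed" || echo "do NOT proceed"
```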
Spectrum Scale User Group @ London (April)
There were good and varied topics discussed at the Spectrum Scale User Group meeting in London:
- Opening & welcome – Simon Thompson, Claire O’Toole, Ted Hoover
- Update Scale (video)/ ESS / Support (video) – Mathias Dietz & Chris Maestas
- MultiCloud Transparent Cloud Tiering (video) – Rob Basham
- Shared NVMe for High Performance Spectrum Scale Clusters (video)- Stuart Campbell
- User Talk – EBI MMAP issues (video – both speakers) – Jordi Valls / Sven Oehme
- GxFS Storage Appliance at Karlsruher Institute of Technology (video) – Jan Erik Sundermann
- Tooling Scale – Automation
- R&S VSA (Virtual storage access) Reliable fault tolerant storage in Broadcast (video) – Oliver Gappa
- Novel TCT: A brief demo on using TCT with alternative cloud gateways – Laurence Horrocks-Barlow
- Ten commandments of good I/O – Rosemary Francis
- Scientific Computing & Storage at The Francis Crick Institute – Michael Holliday
- Mixing storage systems in Spectrum Scale – Migrations & pools stories – Luis Bolinches
- File System Audit Logging / Running Spectrum Scale in a Vagrant environment – Chris Maestas
- AFM Deep Dive – Tuning and debugging – Venkateswara Puvvada
- User Talk – QMUL – Peter Childs
- Sponsor Talk – Lenovo – Michael Hennecke
- User Talk – MAX IV – Andreas Mattsson
- Sponsor Talk – DDN – Vic Cornell
- Cognitive, ML, Hortonworks – Yong ZY Zheng
Basic Tuning of RDMA Parameters for Spectrum Scale
If your cluster shows symptoms of overload and GPFS keeps reporting “overloaded” in its logs like the ones below, you may get long waiters and sometimes deadlocks.
Wed Apr 11 15:53:44.232 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 15:55:24.488 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 15:57:04.743 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 15:58:44.998 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 16:00:25.253 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 16:28:45.601 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 16:33:56.817 2018: [N] sdrServ: Received deadlock notification from
Increase scatterBufferSize to a Value that Matches the IB Fabric
One of the first parameters to tune is scatterBufferSize. According to the IBM wiki, FDR10 can be tuned to 131072 and FDR14 to 262144.
The default value of 32768 may perform OK. If the CPU utilization on the NSD IO servers is observed to be high and client IO performance is lower than expected, increasing the value of scatterBufferSize on the clients may improve performance.
# mmchconfig scatterBufferSize=131072
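A sketch combining the change with a verification step (mmfsadm dump config shows the live value; 131072 is the FDR10 figure quoted above, so adjust for your fabric):

```shell
# Apply scatterBufferSize and echo back the value GPFS is actually using.
set_scatter_buffer() {
    mmchconfig scatterBufferSize="$1" &&
    mmfsadm dump config | grep -i scatterBufferSize
}

# usage: set_scatter_buffer 131072
```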
There are other parameters that can be tuned, but scatterBufferSize was the one that worked immediately for me:
verbsRdmaSend
verbsRdmasPerConnection
verbsRdmasPerNode
Disable verbsRdmaSend on the NSD nodes
# mmchconfig verbsRdmaSend=no -N nsd1,nsd2
Verify the settings have taken effect
# mmfsadm dump config | grep verbsRdmasPerNode
Increase verbsRdmasPerNode to 514 for NSD Nodes
# mmchconfig verbsRdmasPerNode=514 -N nsd1,nsd2
Cannot initialize RDMA protocol on Cluster with Platform LSF
If you encounter this issue during an application run and your scheduler is Platform LSF, there is a simple solution.
Symptoms
explicit_dp: Rank 0:13: MPI_Init_thread: didn't find active interface/port
explicit_dp: Rank 0:13: MPI_Init_thread: Can't initialize RDMA device
explicit_dp: Rank 0:13: MPI_Init_thread: Internal Error: Cannot initialize RDMA protocol
MPI Application rank 13 exited before MPI_Init() with status 1
mpirun: Broken pipe
Cause:
In this case the amount of locked memory was set to unlimited in /etc/security/limits.conf, but this was not sufficient: the MPI jobs were started under LSF, and the LSF daemons themselves were started with very small locked-memory limits, which the jobs inherited.
Solution:
Set the amount of locked memory to unlimited in /etc/init.d/lsf by adding the ‘ulimit -l unlimited’ command.
.....
.....
### END INIT INFO
ulimit -l unlimited
. /opt/lsf/conf/profile.lsf
.....
.....
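After restarting the LSF daemons, you can confirm the limit actually applied to the running process by reading /proc/<pid>/limits; a sketch (the daemon name "lim" is LSF's load information manager, so adjust if your process names differ):

```shell
# Show the locked-memory limit of a running process, given its PID.
check_memlock() {
    grep 'Max locked memory' "/proc/$1/limits"
}

# usage: check_memlock "$(pgrep -x lim | head -n 1)"
```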
Disable SElinux in CentOS 7
1. Check the SELinux Status on CentOS 7
# sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   enforcing
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Max kernel policy version:      28
2. Disable SElinux Temporarily
# setenforce 0
2a. Check Status
# sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   permissive
Mode from config file:          permissive
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Max kernel policy version:      28
3. Disable SElinux Permanently
# vim /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
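The edit can also be done non-interactively; a sketch that rewrites the SELINUX= line in place and keeps a backup of the original (point it at /etc/selinux/config):

```shell
# Set the SELINUX= mode in a config file; keeps a .bak copy of the original.
set_selinux_mode() {    # usage: set_selinux_mode disabled /etc/selinux/config
    sed -i.bak "s/^SELINUX=.*/SELINUX=$1/" "$2"
}
```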
3a. Check Status
# sestatus
SELinux status: disabled
