IBM Spectrum Scale Development Blogs (Q1 2018)

Here is the list of development blogs for this quarter (Q1 2018). As discussed at the User Group meetings, I am passing it along:

GDPR Compliance and Unstructured Data Storage
https://developer.ibm.com/storage/2018/03/27/gdpr-compliance-unstructure-data-storage/

IBM Spectrum Scale for Linux on IBM Z – Release 5.0 features and highlights
https://developer.ibm.com/storage/2018/03/09/ibm-spectrum-scale-linux-ibm-z-release-5-0-features-highlights/

Management GUI enhancements in IBM Spectrum Scale release 5.0.0
https://developer.ibm.com/storage/2018/01/18/gui-enhancements-in-spectrum-scale-release-5-0-0/

IBM Spectrum Scale 5.0.0 – What’s new in NFS?
https://developer.ibm.com/storage/2018/01/18/ibm-spectrum-scale-5-0-0-whats-new-nfs/

Benefits and implementation of Spectrum Scale sudo wrappers
https://developer.ibm.com/storage/2018/01/15/benefits-implementation-spectrum-scale-sudo-wrappers/

IBM Spectrum Scale: Big Data and Analytics Solution Brief
https://developer.ibm.com/storage/2018/01/15/ibm-spectrum-scale-big-data-analytics-solution-brief/

Variant Sub-blocks in Spectrum Scale 5.0
https://developer.ibm.com/storage/2018/01/11/spectrum-scale-variant-sub-blocks/

Compression support in Spectrum Scale 5.0.0
https://developer.ibm.com/storage/2018/01/11/compression-support-spectrum-scale-5-0-0/

IBM Spectrum Scale Versus Apache Hadoop HDFS
https://developer.ibm.com/storage/2018/01/10/spectrumscale_vs_hdfs/

ESS Fault Tolerance
https://developer.ibm.com/storage/2018/01/09/ess-fault-tolerance/

Genomic Workloads – How To Get it Right From Infrastructure Point Of View
https://developer.ibm.com/storage/2018/01/06/genomic-workloads-get-right-infrastructure-point-view/

IBM Spectrum Scale On AWS Cloud: This video explains how to deploy IBM Spectrum Scale on AWS. The solution helps users who require highly available access to a shared namespace across multiple instances, with good performance and without requiring in-depth knowledge of IBM Spectrum Scale.

Detailed Demo: https://www.youtube.com/watch?v=6j5Xj_d0bh4
Brief Demo: https://www.youtube.com/watch?v=-aMQKPW_RfY

Removing Existing NSD Nodes from GPFS Cluster

Removing existing NSD nodes from a GPFS cluster is not difficult, but there are several steps to take note of.

Step 1: Check and make sure that the NSD nodes you are removing are not designated quorum-manager.

See Removing Quorum Manager from NSD Nodes
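
If the old nodes still carry the designation, it can normally be dropped with mmchnode before going further. A minimal sketch, assuming oldnsd1 and oldnsd2 are the nodes being retired and that enough quorum nodes remain afterwards:

# mmchnode --nonquorum --client -N oldnsd1,oldnsd2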

Step 2: Check that the quorum-manager designation has been removed from the old NSD nodes

# mmlscluster

GPFS cluster information
========================
GPFS cluster name:         mygpfs.gpfsnsd1
GPFS cluster id:           720691660936079521
GPFS UID domain:           mygpfs.gpfsnsd1
Remote shell command:      /usr/bin/ssh
Remote file copy command:  /usr/bin/scp

GPFS cluster configuration servers:
-----------------------------------
Primary server:    newnsd3
Secondary server:  newnsd4
.....
.....
.....
.....
714   oldnsd1          192.168.111.5   oldnsd1
715   oldnsd2          192.168.111.6   oldnsd2
716   newnsd3          192.168.111.7   newnsd3         quorum-manager
717   newnsd4          192.168.111.8   newnsd4         quorum-manager
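
A quick way to confirm that neither old node still carries the designation is to filter the mmlscluster output; the old nodes should come back without the quorum-manager column, exactly as in the listing above:

# mmlscluster | grep -E 'oldnsd[12]'
714   oldnsd1          192.168.111.5   oldnsd1
715   oldnsd2          192.168.111.6   oldnsd2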

Step 3a: Unmount the GPFS File System

# mmumount all -a

Step 3b: Check that the file system is unmounted on all GPFS nodes

# mmlsmount all

File system gpfs0 is mounted on 0 nodes.

Step 4: Display the Network Shared Disk (NSD) information for the GPFS cluster.

# mmlsnsd

File system   Disk name    NSD servers
---------------------------------------------------------------------------
gpfs0         dcs3700A_2   newnsd3,oldnsd1
gpfs0         dcs3700A_3   newnsd3,oldnsd1
gpfs0         dcs3700A_4   newnsd3,oldnsd1
gpfs0         dcs3700A_5   newnsd3,oldnsd1
gpfs0         dcs3700A_6   newnsd3,oldnsd1
gpfs0         dcs3700A_7   newnsd3,oldnsd1
gpfs0         dcs3700B_2   newnsd4,oldnsd2
gpfs0         dcs3700B_3   newnsd4,oldnsd2
gpfs0         dcs3700B_4   newnsd4,oldnsd2
gpfs0         dcs3700B_5   newnsd4,oldnsd2
gpfs0         dcs3700B_6   newnsd4,oldnsd2
gpfs0         dcs3700B_7   newnsd4,oldnsd2

Step 5: Change the Network Shared Disk (NSD) configuration attributes so that each NSD is served by its new server only. (This is where the unmount in Step 3 matters: on GPFS releases of this vintage, mmchnsd requires the file system to be unmounted before the server list can be changed.)

# mmchnsd dcs3700A_2:newnsd3

To confirm the changes, issue this command:

# mmlsnsd -d dcs3700A_2
File system   Disk name    NSD servers
---------------------------------------------------------------------------
 gpfs0         dcs3700A_2   newnsd3

Repeat for the remaining NSDs (or batch them, as in the sketch after this list):

# mmchnsd dcs3700A_3:newnsd3
# mmlsnsd -d dcs3700A_3
# mmchnsd dcs3700A_4:newnsd3
# mmlsnsd -d dcs3700A_4
# mmchnsd dcs3700A_5:newnsd3
# mmlsnsd -d dcs3700A_5
# mmchnsd dcs3700A_6:newnsd3
# mmlsnsd -d dcs3700A_6
# mmchnsd dcs3700A_7:newnsd3
# mmlsnsd -d dcs3700A_7
# mmchnsd dcs3700B_2:newnsd4
# mmlsnsd -d dcs3700B_2
# mmchnsd dcs3700B_3:newnsd4
# mmlsnsd -d dcs3700B_3
# mmchnsd dcs3700B_4:newnsd4
# mmlsnsd -d dcs3700B_4
# mmchnsd dcs3700B_5:newnsd4
# mmlsnsd -d dcs3700B_5
# mmchnsd dcs3700B_6:newnsd4
# mmlsnsd -d dcs3700B_6
# mmchnsd dcs3700B_7:newnsd4
# mmlsnsd -d dcs3700B_7
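
Since this is repetitive, the per-disk commands can also be driven from a small shell loop (mmchnsd additionally accepts several semicolon-separated disk descriptors in a single quoted argument). A minimal sketch, assuming the disk-naming pattern above:

# for d in 2 3 4 5 6 7; do mmchnsd "dcs3700A_${d}:newnsd3"; mmchnsd "dcs3700B_${d}:newnsd4"; done
# mmlsnsd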

Step 6: Remove Old NSD Nodes

Once you have confirmed that all the NSDs now point only to the new servers, delete the old nodes from the cluster:

# mmdelnode -N oldnsd1,oldnsd2
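
mmdelnode will refuse to remove a node on which the GPFS daemon is still running, so if the command complains, shut GPFS down on the old nodes first and retry:

# mmshutdown -N oldnsd1,oldnsd2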

Verify that the old NSD nodes are no longer in the cluster:

# mmlscluster

Step 7: Remount the File System

# mmmount all -a
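
As a final check, the same mmlsmount command from Step 3b should now report the file system mounted on all of the nodes:

# mmlsmount all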

References:

  1. General Parallel File System

GPFS Nodes being Expelled by Failed GPFS Clients

According to the IBM Developer Wiki, Troubleshooting: Debug Expel:


  • Disk Lease Expiration – GPFS uses a mechanism referred to as a disk lease to prevent file system data corruption by a failing node. A disk lease grants a node the right to submit IO to a file system. File system disk leases are managed by the Cluster Manager of the file system’s home cluster. A node must periodically renew its disk lease with the Cluster Manager to maintain its right to submit IO to the file system. When a node fails to renew a disk lease with the Cluster Manager, the Cluster Manager marks the node as failed, revokes the node’s right to submit IO to the file system, expels the node from the cluster, and initiates recovery processing for the failed node.
  • Node Expel Request – GPFS uses a mechanism referred to as a node expel request to prevent file system resource deadlocks. Nodes in the cluster require reliable communication amongst themselves to coordinate sharing of file system resources. If a node fails while owning a file system resource, a deadlock may ensue. If a node in the cluster detects that another node owning a shared file system resource may have failed, the node will send a message to the file system Cluster Manager requesting the failed node to be expelled from the cluster to prevent a shared file system resource deadlock. When the Cluster Manager receives a node expel request, it determines which of the two nodes should be expelled from the cluster and takes similar action as described for the Disk Lease expiration.
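
The lease behaviour described above is governed by cluster-wide tunables, which can be inspected with mmlsconfig. A minimal sketch (the grep pattern is an assumption: exact parameter names such as failureDetectionTime and leaseRecoveryWait vary by release, and mmlsconfig mainly lists values changed from the defaults):

# mmlsconfig | grep -i -E 'lease|failuredetection'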

But in my case, there was an errant failed GPFS client node in the network that was still part of the cluster. All the other legitimate GPFS clients were trying to expel this failed node, but got expelled instead: the errant node remained while the legitimate ones were expelled. The only solution was to power off the errant node, after which the entire GPFS file system became operational again. Here is an excerpt from the log file of the NSD nodes.

In fact, a lot of hints can be found in /var/adm/ras/mmfs.log.latest on any of the NSD nodes; you should be able to locate the relevant messages there.
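For example, every node can be scanned at once by reusing the same mmdsh pattern shown in the mount troubleshooting section below; a minimal sketch:

# mmdsh -v -N all "grep -i expel /var/adm/ras/mmfs.log.latest"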


Fri May 27 16:34:53.249 2016: Expel 172.16.20.5 (goldsvr1) request from 192.168.104.34 (compute186). Expelling: 192.168.104.34 (compute186)
Fri May 27 16:34:53.259 2016: Recovering nodes: 192.168.104.34
Fri May 27 16:34:53.311 2016: Recovered 1 nodes for file system gpfs3.
Fri May 27 16:34:55.636 2016: Accepted and connected to 10.0.104.34 compute186 <c0n135>
Fri May 27 16:39:13.333 2016: Expel 172.16.20.5 (goldsvr1) request from 192.168.104.45 (compute197). Expelling: 192.168.104.45 (compute197)
Fri May 27 16:39:13.334 2016: VERBS RDMA closed connection to 192.168.104.45 compute197 on mlx4_0 port 1
Fri May 27 16:39:13.344 2016: Recovering nodes: 192.168.104.45
Fri May 27 16:39:13.393 2016: Recovered 1 nodes for file system gpfs3.
Fri May 27 16:39:15.725 2016: Accepted and connected to 10.0.104.45 compute197 <c0n141>
Fri May 27 16:40:18.570 2016: VERBS RDMA accepted and connected to 192.168.104.45 on mlx4_0 port 1
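
Since, per the wiki excerpt above, it is the Cluster Manager that decides which node gets expelled, it helps to know which node holds that role when reading these logs; mmlsmgr reports it:

# mmlsmgr -c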

GPFS Unable to Mount, with No Error Symptoms

When a GPFS client machine rebooted, the GPFS file system stayed unmounted with no sign of an error. Similarly, when the NSD node issued the command “mmstartup -N client_node”, there was no sign of an error.

But if you do a

# mmdsh -v -N all "/usr/lpp/mmfs/bin/mmfsadm dump waiters" > all.waiters

You may see something like

.......Sync handler: on ThCond 0x1015236D30 (0xFFFFC20015236D30) (wait for inodeFlushFlag), reason 'waiting for the flush flag'......

This occurs when a revoke comes in and an mmapped file needs to be flushed to disk. GPFS tells Linux to flush all dirty mapped pages, and the thread then waits for Linux to report that this has been completed. So something in the kernel is preventing the dirty pages from being flushed. I guess the best way is to use the NSD nodes to restart GPFS on the client:

# mmshutdown -N client_node
# mmstartup -N client_node
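
To confirm that the client has rejoined the cluster and remounted the file system, a quick check (client_node as above):

# mmgetstate -N client_node
# mmlsmount all -L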

References:

  1. GPFS will not mount (but shows no errors)