Node cannot be added to the GPFS cluster

On the NSD node, I issued the command:

# mmaddnode -N node1
Thu Jul 23 13:40:12 SGT 2015: mmaddnode: Processing node node1
mmaddnode: Node node1 was not added to the cluster.
The node appears to already belong to a GPFS cluster.
mmaddnode: mmaddnode quitting.  None of the specified nodes are valid.
mmaddnode: Command failed.  Examine previous error messages to determine cause.

Listing the cluster members with mmlscluster shows that the node is not in the cluster:

# mmlscluster | grep node1

If the node is not in the cluster, issue this command on the client node that could not be added:

# mmdelnode -f
mmdelnode: All GPFS configuration files on node goldsvr1 have been removed.

Reissue the mmaddnode command.
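
The membership check above can be wrapped in a small helper so that the cleanup is only attempted when the node really is absent. This is a minimal sketch, not part of GPFS: the function name node_in_cluster is hypothetical, and it assumes the node name appears as a whole word somewhere in the cluster listing fed to it on standard input.

```shell
#!/bin/bash
# node_in_cluster NODE: reads a cluster listing on stdin and succeeds (exit 0)
# if NODE appears in it as a whole word. The name is hypothetical, not a GPFS tool.
node_in_cluster() {
    # -F: treat the node name as a literal string, not a regex
    # -w: whole-word match, so "node1" does not match "node10"
    # -q: no output, only the exit status matters
    grep -qFw -- "$1"
}
```

A typical use would be `mmlscluster | node_in_cluster node1 || echo "node1 is not in the cluster"`, run on a node that already belongs to the cluster.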

References:

  1. Node cannot be added to the GPFS cluster

GPFS fails to mount with no error symptoms

After a GPFS client machine rebooted, the GPFS file system was not mounted, and no error was reported. Issuing “mmstartup -N client_node” from the NSD node likewise produced no error.

But if you collect the waiters from all nodes with

# mmdsh -v -N all "/usr/lpp/mmfs/bin/mmfsadm dump waiters" > all.waiters

You may see something like

.......Sync handler: on ThCond 0x1015236D30 (0xFFFFC20015236D30) (wait for inodeFlushFlag), reason 'waiting for the flush flag'......

This occurs when a revoke comes in and an mmapped file needs to be flushed to disk. GPFS tells Linux to flush all dirty mapped pages, and the thread then waits for Linux to report that this has been completed. So something in the kernel is preventing the dirty pages from being flushed. Probably the best approach is to restart GPFS on the client node by issuing the following from an NSD node:

# mmshutdown -N client_node
# mmstartup -N client_node
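
To check a collected waiters file for this condition without reading it by eye, a filter along these lines may help. It is a minimal sketch: the function name flush_waiters is hypothetical, and it assumes the waiter text contains the literal string "wait for inodeFlushFlag" as in the sample line above.

```shell
#!/bin/bash
# flush_waiters FILE: print the waiter lines that are blocked on the inode flush flag.
# Exit status follows grep: 0 if at least one such line was found, 1 if none.
# The function name is hypothetical; the search string is taken from the sample output.
flush_waiters() {
    grep -F -- 'wait for inodeFlushFlag' "$1"
}
```

For example, `flush_waiters all.waiters` prints the matching lines from the file produced by the mmdsh command.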

References:

  1. GPFS will not mount (but shows no errors)

Quick and Dirty way to ensure only one instance of a shell script is running at a time

The locking mechanism in this script is taken from the Stack Overflow question "Quick and dirty way to ensure only one instance of a shell script is running at a time".

For example, it can be used to ensure that only one instance of the SAGE mirroring rsync runs at a time.

# rsyncs from sage.math.washington.edu using its rsync daemon
# for automated use, remove the "vv" and "progress" switches

# locking mechanism from
# http://stackoverflow.com/questions/185451/quick-and-dirty-way-to-ensure-only-one-instance-of-a-shell-script-is-running-at-a

LOCKFILE=./rsync_sagemath.lock

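# if the lockfile exists and the PID stored in it is still alive, another instance is running
if [ -e "${LOCKFILE}" ] && kill -0 "$(cat "${LOCKFILE}")" 2>/dev/null; then
    echo "rsync_sagemath already running ... exit"
    exit
fi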

# make sure the lockfile is removed when we exit and then claim it
trap 'rm -f "${LOCKFILE}"; exit' INT TERM EXIT
echo $$ > "${LOCKFILE}"

rsync -av --delete-after rsync.sagemath.org::sage /var/www/html/sage

rm -f ${LOCKFILE}
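
One caveat with the script above is that the existence test and the write of the PID are two separate steps, so two instances started at the same moment can both pass the test. The following is a sketch of a variation, not the method from the original script: it uses the shell's noclobber option so that creating the lockfile and testing for it happen in one step. The function name acquire_lock is hypothetical, and the stale-lock recovery still has a small race of its own.

```shell
#!/bin/bash
# acquire_lock LOCKFILE: returns 0 if this shell now owns the lock, non-zero otherwise.
# Hypothetical helper; a variation on the PID-file approach, using noclobber.
acquire_lock() {
    local lockfile=$1 pid
    # With noclobber set, ">" fails if the file already exists,
    # so the existence check and the creation are a single operation.
    if ( set -o noclobber; echo "$$" > "$lockfile" ) 2>/dev/null; then
        return 0
    fi
    # The lock exists. If the PID recorded in it is no longer running,
    # treat the lock as stale, remove it, and try once more.
    pid=$(cat "$lockfile" 2>/dev/null)
    if [ -n "$pid" ] && ! kill -0 "$pid" 2>/dev/null; then
        rm -f "$lockfile"
        ( set -o noclobber; echo "$$" > "$lockfile" ) 2>/dev/null
        return
    fi
    return 1
}
```

In the mirroring script this would replace the if block and the echo, for example `acquire_lock "${LOCKFILE}" || exit`, followed by the same trap to remove the lockfile on exit.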