GPFS Nodes being Expelled by Failed GPFS Clients

According to the IBM Developer Wiki (Troubleshooting: Debug Expel):


  • Disk Lease Expiration – GPFS uses a mechanism referred to as a disk lease to prevent file system data corruption by a failing node. A disk lease grants a node the right to submit IO to a file system. File system disk leases are managed by the Cluster Manager of the file system’s home cluster. A node must periodically renew its disk lease with the Cluster Manager to maintain its right to submit IO to the file system. When a node fails to renew a disk lease with the Cluster Manager, the Cluster Manager marks the node as failed, revokes the node’s right to submit IO to the file system, expels the node from the cluster, and initiates recovery processing for the failed node.
  • Node Expel Request – GPFS uses a mechanism referred to as a node expel request to prevent file system resource deadlocks. Nodes in the cluster require reliable communication amongst themselves to coordinate sharing of file system resources. If a node fails while owning a file system resource, a deadlock may ensue. If a node in the cluster detects that another node owning a shared file system resource may have failed, it will send a message to the file system’s Cluster Manager requesting that the failed node be expelled from the cluster to prevent a shared file system resource deadlock. When the Cluster Manager receives a node expel request, it determines which of the two nodes should be expelled from the cluster and takes similar action as described for disk lease expiration. (A quick way to inspect these settings on a running cluster is sketched below.)

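As a quick sanity check of how these mechanisms are configured on a live cluster, the standard GPFS administration commands can be used. This is only a minimal sketch; the exact tunable names and output format vary between GPFS releases.

# mmlsmgr -c
# mmlsconfig

mmlsmgr -c reports the current cluster manager (the node that arbitrates disk leases and expel requests), while mmlsconfig dumps the cluster configuration, where the lease-related tunables (for example failureDetectionTime) can be checked.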
But in my case, I had an errant, failed GPFS client node on the network that was still part of the cluster. All the other legitimate GPFS clients tried to expel this failed node, but were expelled instead: the errant node remained while the legitimate ones were thrown out. The only solution was to power off the errant node, after which the entire GPFS file system became operational again. Here is an excerpt from the log file of the NSD nodes.

In fact, a lot of hints can be found in /var/adm/ras/mmfs.log.latest on any of the NSD nodes. You should be able to locate them there.


Fri May 27 16:34:53.249 2016: Expel 172.16.20.5 (goldsvr1) request from 192.168.104.34 (compute186). Expelling: 192.168.104.34 (compute186)
Fri May 27 16:34:53.259 2016: Recovering nodes: 192.168.104.34
Fri May 27 16:34:53.311 2016: Recovered 1 nodes for file system gpfs3.
Fri May 27 16:34:55.636 2016: Accepted and connected to 10.0.104.34 compute186 <c0n135>
Fri May 27 16:39:13.333 2016: Expel 172.16.20.5 (goldsvr1) request from 192.168.104.45 (compute197). Expelling: 192.168.104.45 (compute197)
Fri May 27 16:39:13.334 2016: VERBS RDMA closed connection to 192.168.104.45 compute197 on mlx4_0 port 1
Fri May 27 16:39:13.344 2016: Recovering nodes: 192.168.104.45
Fri May 27 16:39:13.393 2016: Recovered 1 nodes for file system gpfs3.
Fri May 27 16:39:15.725 2016: Accepted and connected to 10.0.104.45 compute197 <c0n141>
Fri May 27 16:40:18.570 2016: VERBS RDMA accepted and connected to 192.168.104.45 on mlx4_0 port 1
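For completeness, GPFS also provides an administrative way to inspect the state of node connections and to force a suspect node out of the cluster. Whether this would have helped here before the hard power-off is unproven; treat the following only as a sketch, with compute186 standing in for the errant client from the log above.

# mmdiag --network
# mmexpelnode -N compute186

mmdiag --network (run on an NSD node or the cluster manager) lists the connection state of every node, and mmexpelnode explicitly asks the cluster manager to expel the named node.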

Install Nvidia CUDA-7.5 environment in CentOS 6

This note is derived from "How to install Nvidia CUDA environment in RHEL 6?" and it works for me.

Step 1: Install the kernel development packages

# yum install kernel-devel kernel-headers -y
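Before moving on, it is worth confirming that the installed kernel-devel package matches the running kernel, otherwise the Nvidia kernel module will not build. A quick check (just a sketch):

# uname -r
# rpm -q kernel-devel kernel-headers

If the versions differ, either update the running kernel (yum update kernel and reboot) or install the kernel-devel package matching the running version.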

Step 2: Download and Install the CUDA Toolkit

Cuda Downloads Site
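The toolkit comes as a .run installer. As a rough sketch (the exact file name depends on the release you download; cuda_7.5.18_linux.run is used here only as an example):

# chmod +x cuda_7.5.18_linux.run
# ./cuda_7.5.18_linux.run

The installer prompts you to accept the EULA and choose whether to install the driver, the toolkit and the samples; /usr/local/cuda-7.5 is the default toolkit location assumed in the next step.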

Step 3: Configure CUDA Toolkit

# echo -e "/usr/local/cuda-7.5/lib64\n/usr/local/cuda-7.5/lib" > /etc/ld.so.conf.d/cuda.conf
# echo 'export PATH=/usr/local/cuda-7.5/bin:$PATH' > /etc/profile.d/cuda.sh
# /sbin/ldconfig
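To verify that the loader now sees the CUDA libraries (a quick check; the exact library list depends on which toolkit components you installed):

# ldconfig -p | grep -i cuda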

Step 4: Disable the Nouveau Driver

# echo -e "\nblacklist nouveau" >> /etc/modprobe.d/blacklist.conf
# dracut -f /boot/initramfs-`rpm -qa kernel --queryformat "%{PROVIDEVERSION}.%{ARCH}\n" | tail -1`.img `rpm -qa kernel --queryformat "%{PROVIDEVERSION}.%{ARCH}\n" | tail -1`

Step 5: Reboot the Server
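After the reboot, you can confirm that the nouveau driver is no longer loaded; the command below should print nothing if the blacklist entry and the rebuilt initramfs took effect.

# lsmod | grep nouveau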

Step 6: Check Supported Nvidia Card

[root@comp1 ~]# lspci -d "10de:*" -v
84:00.0 3D controller: NVIDIA Corporation GK110BGL [Tesla K40m] (rev a1)
Subsystem: NVIDIA Corporation Device 097e
Flags: bus master, fast devsel, latency 0, IRQ 64
Memory at c9000000 (32-bit, non-prefetchable) [size=16M]
Memory at 3c400000000 (64-bit, prefetchable) [size=16G]
Memory at 3c3fe000000 (64-bit, prefetchable) [size=32M]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Endpoint, MSI 00
Capabilities: [100] Virtual Channel
Capabilities: [128] Power Budgeting <?>
Capabilities: [420] Advanced Error Reporting
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Capabilities: [900] #19
Kernel driver in use: nvidia
Kernel modules: nvidia, nouveau, nvidiafb

Using vmstat for profiling

Taken from RHEL Performance Tuning Student Booklet

The vmstat command, if given no arguments, prints the averages of various system statistics since boot. It also accepts two arguments: the delay, which is the number of seconds between reports, and the count, which is the number of reports to produce.

[root@lime ~]# vmstat 4 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
0  0   2832 145216 194528 6772048    0    0     0     1    0    1  0  0 100  0  0
0  0   2832 145200 194528 6772048    0    0     0     1   44   45  0  0 100  0  0
0  0   2832 145208 194528 6772048    0    0     0     8   46   45  0  0 100  0  0
1  0   2832 145208 194528 6772048    0    0     0     0   44   46  0  0 100  0  0
0  0   2832 145208 194528 6772048    0    0     0     3   46   55  0  0 100  0  0
Category                    Statistic  Definition
Process related             r          The number of processes waiting for run time
Process related             b          The number of processes in uninterruptible sleep
Memory                      swpd       The amount of memory currently used in swap space
Memory                      free       The amount of idle (immediately available) memory
Memory                      buff       The amount of memory used as buffers
Memory                      cache      The amount of memory used as cache
Swap (paging statistics)    si         Pages of memory swapped in per second
Swap (paging statistics)    so         Pages of memory swapped out per second
IO (block I/O statistics)   bi         Blocks per second received from block devices
IO (block I/O statistics)   bo         Blocks per second sent to block devices
System                      in         Interrupts raised per second
System                      cs         Context switches per second
CPU (how CPU time is used)  us         Percentage of time spent running user-space code
CPU (how CPU time is used)  sy         Percentage of time spent running kernel code
CPU (how CPU time is used)  id         Percentage of time spent idle
CPU (how CPU time is used)  wa         Percentage of time spent blocked while waiting for I/O to complete
CPU (how CPU time is used)  st         Percentage of time when the CPU had a process ready to run, but CPU time was stolen by the hypervisor supporting this virtual machine (typically because the CPU was being used by another guest virtual machine)


Determine how many CPUs are in the system

I came across these nifty but often-forgotten tools that help determine CPU information. I guess we are all familiar with top, but if you press “1” while top is running, you can see details for each individual CPU.

A. Using top

# top

top - 10:11:35 up 95 days, 20:02,  2 users,  load average: 0.51, 0.39, 0.20
Tasks: 326 total,   1 running, 325 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.1%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  65951272k total, 65647784k used,   303488k free,   221504k buffers
Swap: 33046520k total,        0k used, 33046520k free, 63608900k cached

PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
13469 root      20   0 15172 1436  956 R  2.6  0.0   0:00.72 top
1 root      20   0 19352 1520 1208 S  0.0  0.0   0:01.58 init
2 root      20   0     0    0    0 S  0.0  0.0   0:00.01 kthreadd
..............
..............
1

top - 10:11:58 up 95 days, 20:03,  2 users,  load average: 0.46, 0.38, 0.21
Tasks: 310 total,   1 running, 309 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.3%sy,  0.0%ni, 99.3%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.3%us,  0.0%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
.........

B. Using lscpu to take a look at the summary

# lscpu

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                12
On-line CPU(s) list:   0-11
Thread(s) per core:    1
Core(s) per socket:    6
CPU socket(s):         2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 45
Stepping:              7
CPU MHz:               2501.000
BogoMIPS:              4999.27
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              15360K
NUMA node0 CPU(s):     0-5
NUMA node1 CPU(s):     6-11
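If all you need is the count rather than the full topology, a couple of one-liners give the same answer as the CPU(s) field above (a minimal sketch; nproc is part of coreutils on RHEL/CentOS 6):

# nproc
# grep -c ^processor /proc/cpuinfo

Both should report 12 on this box: 2 sockets x 6 cores per socket x 1 thread per core.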