Basic Tuning of RDMA Parameters for Spectrum Scale


If your cluster has symptoms of overload and GPFS kept reporting “overloaded” in GPFS logs like the ones below, you might get long waiters and sometimes deadlocks.

Wed Apr 11 15:53:44.232 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 15:55:24.488 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 15:57:04.743 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 15:58:44.998 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 16:00:25.253 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 16:28:45.601 2018: [I] Sending 'overloaded' status to the entire cluster
Wed Apr 11 16:33:56.817 2018: [N] sdrServ: Received deadlock notification from

Increase scatterBuffersize to a Number that match IB Fabric
One of the first tuning will be to tune the scatterBufferSize. According to the wiki, FDR10 can be tuned to 131072 and FDR14 can be tuned to 262144

The default value of 32768 may perform OK. If the CPU utilization on the NSD IO servers is observed to be high and client IO performance is lower than expected, increasing the value of scatterBufferSize on the clients may improve performance.

# mmchconfig scatterBufferSize=131072

There are other parameters which can be tuned. But the scatterBufferSize worked immediately for me.
verbsRdmaSend
verbsRdmasPerConnection
verbsRdmasPerNode

Disable  verbsRdmaSend=no

# mmchconfig verbsRdmaSend=no -N nsd1,nsd2

Verify settings has taken place

# mmfsadm dump config | grep verbsRdmasPerNode

Increase verbsRdmasPerNode to 514 for NSD Nodes

# mmchonfig verbsRdmasPerNode=514 -N nsd1,nsd2

References:

  1. Best Practices RDMA Tuning

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.