Topology Scheduling on Platform LSF

For a highly parallel job that spans multiple hosts, it is desirable to allocate hosts that are close together in the network topology, in order to minimize communication latency.

The article is taken from the IBM Platform LSF Wiki, “Using Compute Units for Topology Scheduling”.

Step 1: Define COMPUTE_UNIT_TYPES in lsb.params

COMPUTE_UNIT_TYPES = enclosure! switch rack
  1. The example specifies 3 CU types. The order of the values corresponds to levels in the network topology, from finest to coarsest: CU type enclosure is contained in CU type switch, and CU type switch is contained in CU type rack.
  2. The exclamation mark (!) following enclosure marks it as the default level used for jobs with CU topology requirements. If the exclamation mark is omitted, the first string listed is the default type.

Step 2:  Arrange hosts into lsb.hosts

Begin ComputeUnit
NAME    TYPE            CONDENSE        MEMBER
en1-1   enclosure        Y                   (c00 c01 c02)
en1-2   enclosure        Y                   (c03 c04 c05)
en1-3   enclosure        Y                   (c06 c07 c08 c09 c10)
.....
s1      switch           Y                   (en1-1 en1-2)
s2      switch           Y                   (en1-3)
.....
r1      rack             Y                   (s1 s2)
.....
End ComputeUnit

Reconfigure mbatchd to apply the changes:

# badmin reconfig

View the CU Configuration

# bmgroup -cu

Step 3: Using bhosts to display information

Since “Y” is set in the CONDENSE column in lsb.hosts, bhosts displays each compute unit as a single condensed entry. Run bhosts -X to see the individual hosts instead.
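
Once compute units are defined, jobs can request topology-aware placement through the cu[] section of the resource requirement string. A minimal sketch (the slot count and job script ./my_mpi_job are illustrative):

# bsub -n 32 -R "cu[type=enclosure:pref=maxavail]" ./my_mpi_job
# bsub -n 32 -R "cu[type=switch:maxcus=1]" ./my_mpi_job

The first request prefers enclosures with the most available slots; the second requires all slots to come from a single switch-level compute unit.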

References:

  1. Using Compute Units for Topology Scheduling

Basic Configuration for Platform Application Centre 10.1

You have to install Platform LSF 10.1 first. Please read Basic Configuration of Platform LSF 10.1

Step 1: Unpack the Platform Application Centre

# tar -zxvf pac10.1_standard_linux-x64.tar.Z
# cd pac10.1_standard_linux-x64

The files are shipped inside the LSF suite bundle; they can be found under $LSF_INSTALL/lsfshpc10.1-x86_64/pac/pac10.1_standard_linux-x64

Step 2: Install MySQL with yum

# yum install mysql mysql-server mysql-connector-java

Step 2a: Configure MySQL. For details, see the article How to Install MySQL on CentOS 6.
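
A minimal sketch of getting MySQL running on CentOS 6 before continuing (standard CentOS 6 commands; database names and passwords are site-specific):

# service mysqld start
# chkconfig mysqld on
# mysql_secure_installation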

Step 3: Edit the pacinstall.sh

export MYSQL_JDBC_DRIVER_JAR="/usr/share/java/mysql-connector-java-5.1.17.jar" (Line 84)

Step 4: Complete the installation
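
The installer itself is the pacinstall.sh script edited in Step 3. A sketch, assuming it is run from the unpacked pac10.1_standard_linux-x64 directory; accept the license agreement when prompted (options may vary by version):

# ./pacinstall.sh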

Step 4a: Enable perfmon in your LSF Cluster
Optional. Enable perfmon in your LSF cluster to see the System Services Dashboard in IBM Spectrum LSF Application Center.

# badmin perfmon start
# badmin perfmon view

Step 4b: Set the IBM Spectrum LSF Application Center environment

# cp /opt/pac/profile.platform /etc/profile.d/pac1.sh
# source /etc/profile.d/pac1.sh

Step 4c: Start IBM Spectrum LSF Application Center services.

# perfadmin start all
# pmcadmin start

Step 4d: Check services have started.

# perfadmin list
# pmcadmin list

You can see the WEBGUI, jobdt, plc, purger, and PNC services started.

Step 5: Log in to IBM Spectrum LSF Application Center.

Browse to the web server URL and log in to the IBM Spectrum LSF Application Center with the IBM Spectrum LSF administrator name and password.

Step 5a: Import the cacert.pem certificate into the client browser

Step 6: Platform URL
When HTTPS is enabled, the web server URL is: https://host_name:8443/platform
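
If HTTPS is not enabled, the default is typically http://host_name:8080/platform (an assumption; confirm the port chosen during installation).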

 

References:

  1. IBM Spectrum LSF Application Center V10.1 documentation

Basic Configuration of Platform LSF 10.1

Step 1: Preliminary Steps (Suggestions)

  1. Set up an NFS shared directory as the final installation destination (/opt/lsf)
  2. Use an NFS shared directory, for example /usr/local, to hold the tar file so that the installation files remain available to client nodes later (/usr/local/lsf_install)
  3. Make sure /etc/hosts is configured correctly on every host and SELinux is disabled (a command sketch follows this list)
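
A command sketch for the preliminary steps above (paths follow the suggestions; adjust to your site):

# mkdir -p /opt/lsf /usr/local/lsf_install
# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
# setenforce 0

Note that setenforce 0 only switches SELinux to permissive mode until the next reboot; the edit to /etc/selinux/config makes the change permanent.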

Step 2: Untar the LSF Tar file (lsfshpc10.1-x86_64.tar.gz).

# tar -zxvf lsfshpc10.1-x86_64.tar.gz

You will have a folder called lsfshpc10.1-x86_64.

Step 3: Navigate to lsfshpc10.1-x86_64/lsf.
You should see the following 2 files:

lsf10.1_linux2.6-glibc2.3-x86_64.tar.Z (LSF Distribution Package)
lsf10.1_lsfinstall_linux_x86_64.tar.Z (LSF Installation File)

Step 4: Unpack the LSF Installation File

# tar -zxvf lsf10.1_lsfinstall_linux_x86_64.tar.Z

Step 5: Edit the install.config file.

# vim /usr/local/lsf_install/lsfshpc10.1-x86_64/lsf/lsf10.1_lsfinstall/install.config

Critical fields and suggested values (a quick way to verify them is shown after the list):

LSF_TOP="/opt/lsf" (line 43)
LSF_ADMINS="lsfadmin admin" (line 53)
LSF_CLUSTER_NAME="mycluster" (line 70)
LSF_MASTER_LIST="h00" (line 85)
LSF_TARDIR="/opt/lsf/lsf_distrib/" (line 95 - where you have placed the distribution)
CONFIGURATION_TEMPLATE="PARALLEL" (line 106)
LSF_ADD_CLIENTS="h00 c00" (line 165)
LSF_QUIET_INST="N" (line 193)
ENABLE_EGO="N" (line 290)
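
A quick way to confirm that the fields above are uncommented and set as intended (assuming you are in the lsf10.1_lsfinstall directory):

# grep -E '^(LSF_TOP|LSF_ADMINS|LSF_CLUSTER_NAME|LSF_MASTER_LIST|LSF_TARDIR|CONFIGURATION_TEMPLATE|LSF_ADD_CLIENTS|LSF_QUIET_INST|ENABLE_EGO)=' install.config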

Step 6: Install using lsfinstall

# /usr/local/lsf_install/lsfshpc10.1-x86_64/lsf/lsf10.1_lsfinstall/lsfinstall -f install.config

Step 7: Follow the instructions and agree to the terms and conditions

Step 8: Create a file and Source the profile.lsf

# touch /etc/profile.d/lsf.sh

Inside lsf.sh, put the following line:

source /opt/lsf/conf/profile.lsf
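
Equivalently, both steps can be done in one go (a convenience sketch):

# cat > /etc/profile.d/lsf.sh <<'EOF'
source /opt/lsf/conf/profile.lsf
EOF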

Step 9: Create the user lsfadmin

# useradd -d /home/lsfadmin -g users -m lsfadmin

Step 10: Client Host setup

Copy /etc/profile.d/lsf.sh to the client’s /etc/profile.d/lsf.sh

# scp /etc/profile.d/lsf.sh remote_node:/etc/profile.d/

Run hostsetup on each client node:

# cd /usr/local/lsf_install/lsfshpc10.1-x86_64/lsf/lsf10.1_lsfinstall/
# ./hostsetup --top="/opt/lsf" --boot="y"

Step 11: Restart the LSF services on the clients

# service lsf restart

Step 12: Reconfigure LSF and restart mbatchd on the head node.

# lsadmin reconfig
# badmin mbdrestart

Step 13: Test the cluster with basic LSF Commands.
Run the lsid, lshosts, and bhosts commands and check that each returns output (a sketch of what to expect follows).
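
A minimal sanity check (expected shape of the output, not exact values):

# lsid       (prints the LSF version, cluster name, and current master host)
# lshosts    (one line per host with type, model, and resources)
# bhosts     (batch host status; hosts should show ok once sbatchd is running)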

 

References: 

  1. Installing IBM Platform LSF on UNIX and Linux
  2. Common LSF problems

 

Adding non-root users to administer Platform LSF

Step 1.    On the master host, change directory to $LSF_ENVDIR

# cd $LSF_ENVDIR

Step 2.   Modify the “lsf.cluster.<clustername>” file:

# vim lsf.cluster.<clustername>

Step 3.    Edit the following section in the file to add your non-root user as LSF administrator, e.g. adding user1 as administrator:

Begin   ClusterAdmins
Administrators = phpcadmin user1
End    ClusterAdmins

Note: By default, “phpcadmin” is the administrator for Platform HPC at time of installation. Do not remove it.

Step 4.    Execute the following commands on the master host for the changes to take effect. You need to run them as the “root” user or as “phpcadmin”:

# lsadmin reconfig
# badmin mbdrestart

LSF retained the original Max Locked Memory and not the updated one

The value of “max locked memory” has been modified at the operating system level, but LSF still returns the original value.
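
For reference, a typical way the operating-system change is made on Linux (an illustration, not part of the original article; scope and values are site-specific):

/etc/security/limits.conf:
*    soft    memlock    unlimited
*    hard    memlock    unlimited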

Symptom: jobs still report the original max locked memory

[user1@cluster-h00 ~]$ bsub -q myQueue -W 120:00 -n 16 -P myProjectGroup -m compute-node1 -I ulimit -a
Job <32400> is submitted to default queue <normal>.
<<Waiting for dispatch ...>>
<<Starting on compute-node1>>
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 1027790
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 4096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1027790
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

To resolve this issue, restart sbatchd on the affected host so that it picks up the updated limit:

# badmin hshutdown
# badmin hstartup
[user1@cluster-h00 ~]$ bsub -q gpgpu -m compute-node1 -I ulimit -a
Job <32490> is submitted to queue <gpgpu>.
<<Waiting for dispatch ...>>
<<Starting on compute-node1>>
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 515133
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 4096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 515133
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

References:

  1. LSF does not recognize that “max locked memory” has been updated

Platform LSF – Controlling Hosts

1. Closing a Host

# badmin hclose host_name
Close host_name ...... done

2. Opening a Host

# badmin hopen host_name
Open host_name ...... done

3. Log a comment when closing or opening a host

# badmin hopen -C "Re-Provisioned" hostA
# badmin hclose -C "Weekly backup" hostB

The comment text Weekly backup is recorded in lsb.events. If you close or open a host group, each host group member displays with the same comment string.
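
To see the comment afterwards (a sketch; the comment appears in the detailed output of the closed host, and badmin hist replays the logged events):

# bhosts -l hostB
# badmin hist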

Platform LSF – Working with Hosts (lshosts, lsmon)

The lshosts -l command shows detailed host information, including the load thresholds:

$ lshosts -l
HOST_NAME:  comp001
type             model  cpuf ncpus ndisks maxmem maxswp maxtmp rexpri server nprocs ncores nthreads
X86_64     Intel_EM64T  60.0    16      1    63G    16G 352423M      0    Yes      2      8        1

RESOURCES: Not defined
RUN_WINDOWS:  (always open)

LOAD_THRESHOLDS:
r15s   r1m  r15m   ut    pg    io   ls   it   tmp   swp   mem   root maxroot processes clockskew netcard iptotal  cpuhz cachesize diskvolume processesroot   ipmi powerconsumption ambienttemp cputemp
-   3.5     -    -     -     -    -    -     -     -     -      -       -         -         -       -       -      -         -          -             -      -                -           -       -

Platform LSF – Working with Hosts (bhosts, lsload, lsmon)

Host status

Host status describes the ability of a host to accept and run batch jobs in terms of daemon states, load levels, and administrative controls. The bhosts and lsload commands display host status.

 

1. bhosts
Displays the current status of the host

STATUS DESCRIPTION
ok  Host is available to accept and run new batch jobs
unavail  Host is down, or LIM and sbatchd are unreachable.
unreach  LIM is running but sbatchd is unreachable.
closed  Host will not accept new jobs. Use bhosts -l to display the reasons.
unlicensed Host does not have a valid license.

 

2. bhosts -l
Displays the closed reasons. A closed host does not accept new batch jobs:

$ bhosts -l
HOST  node001
STATUS           CPUF  JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV DISPATCH_WINDOW
closed_Adm      60.00     -     16      0      0      0      0      0      -

CURRENT LOAD USED FOR SCHEDULING:
r15s   r1m  r15m    ut    pg    io   ls    it   tmp   swp   mem   root maxroot
Total           0.0   0.0   0.0    0%   0.0     0    0 28656  324G   16G   60G  3e+05   4e+05
Reserved        0.0   0.0   0.0    0%   0.0     0    0     0    0M    0M    0M    0.0     0.0

processes clockskew netcard iptotal  cpuhz cachesize diskvolume
Total             404.0       0.0     2.0     2.0 1200.0     2e+04      5e+05
Reserved            0.0       0.0     0.0     0.0    0.0       0.0        0.0

processesroot   ipmi powerconsumption ambienttemp cputemp
Total                 396.0   -1.0             -1.0        -1.0    -1.0
Reserved                0.0    0.0              0.0         0.0     0.0


aa_r aa_r_dy aa_dy_p aa_r_ad aa_r_hpc fluentall fluent fluent_nox
Total         17.0    25.0   128.0    10.0    272.0      48.0   48.0       50.0
Reserved       0.0     0.0     0.0     0.0      0.0       0.0    0.0        0.0

gambit geom_trans tgrid fluent_par
Total           50.0       50.0  50.0      193.0
Reserved         0.0        0.0   0.0        0.0

 

3. bhosts -X

Displays condensed host groups in uncondensed format, one line per individual host:

$ bhosts -X
HOST_NAME          STATUS       JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV
comp027            ok              -     16      0      0      0      0      0
comp028            ok              -     16      0      0      0      0      0
comp029            ok              -     16      0      0      0      0      0
comp030            ok              -     16      0      0      0      0      0
comp031            ok              -     16      0      0      0      0      0
comp032            ok              -     16      0      0      0      0      0
comp033            ok              -     16      0      0      0      0      0

 

4. bhosts -l hostID

Displays all information about a specific server host, such as the CPU factor and the load thresholds used to start, suspend, and resume jobs:

# bhosts -l comp067
HOST  comp067
STATUS           CPUF  JL/U    MAX  NJOBS    RUN  SSUSP  USUSP    RSV DISPATCH_WINDOW
ok              60.00     -     16      0      0      0      0      0      -

CURRENT LOAD USED FOR SCHEDULING:
r15s   r1m  r15m    ut    pg    io   ls    it   tmp   swp   mem   root maxroot
Total           0.0   0.0   0.0    0%   0.0     0    0 13032  324G   16G   60G  3e+05   4e+05
Reserved        0.0   0.0   0.0    0%   0.0     0    0     0    0M    0M    0M    0.0     0.0

processes clockskew netcard iptotal  cpuhz cachesize diskvolume
Total             406.0       0.0     2.0     2.0 1200.0     2e+04      5e+05
Reserved            0.0       0.0     0.0     0.0    0.0       0.0        0.0

processesroot   ipmi powerconsumption ambienttemp cputemp
Total                 399.0   -1.0             -1.0        -1.0    -1.0
Reserved                0.0    0.0              0.0         0.0     0.0

aa_r aa_r_dy aa_dy_p aa_r_ad aa_r_hpc fluentall fluent fluent_nox
Total         18.0    25.0   128.0    10.0    272.0      47.0   47.0       50.0
Reserved       0.0     0.0     0.0     0.0      0.0       0.0    0.0        0.0

gambit geom_trans tgrid fluent_par
Total           50.0       50.0  50.0      193.0
Reserved         0.0        0.0   0.0        0.0

LOAD THRESHOLD USED FOR SCHEDULING:
r15s   r1m  r15m   ut      pg    io   ls    it    tmp    swp    mem
loadSched   -     -     -     -       -     -    -     -     -      -      -
loadStop    -     -     -     -       -     -    -     -     -      -      -

root maxroot processes clockskew netcard iptotal   cpuhz cachesize
loadSched     -       -         -         -       -       -       -         -
loadStop      -       -         -         -       -       -       -         -

diskvolume processesroot    ipmi powerconsumption ambienttemp cputemp
loadSched        -             -       -                -           -       -
loadStop         -             -       -                -           -       -

 

5. lsload

[user1@login1 ~]$ lsload
HOST_NAME       status  r15s   r1m  r15m   ut    pg  ls    it   tmp   swp   mem
login1          ok   0.0   0.0   0.0   1%   0.0  17     0  240G   16G   28G
login2          ok   0.0   0.0   0.0   0%   0.0   0  7040  242G   16G   28G
node1           ok   0.0   0.4   0.3   0%   0.0   0 31760  324G   16G   60G

Displays the current state of the host:

STATUS DESCRIPTION
ok Host is available to accept and run batch jobs and remote tasks.
-ok LIM is running but RES is unreachable.
busy Does not affect batch jobs, only used for remote task placement (i.e., lsrun). The value of a load index exceeded a threshold (configured in lsf.cluster.cluster_name, displayed by lshosts -l). Indices that exceed thresholds are identified with an asterisk (*).
lockW Does not affect batch jobs, only used for remote task placement (i.e., lsrun). Host is locked by a run window (configured in lsf.cluster.cluster_name, displayed by lshosts -l).
lockU Will not accept new batch jobs or remote tasks. An LSF administrator or root explicitly locked the host using lsadmin limlock, or an exclusive batch job (bsub -x) is running on the host. Running jobs are not affected. Use lsadmin limunlock to unlock LIM on the local host.
unavail Host is down, or LIM is unavailable.
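
The heading also mentions lsmon, an interactive monitor that refreshes the same load indices periodically (a brief sketch; typically press q to exit):

$ lsmon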

 

6. lshosts -l
The lshosts -l command shows detailed host information, including the load thresholds:

$ lshosts -l
HOST_NAME:  comp001
type             model  cpuf ncpus ndisks maxmem maxswp maxtmp rexpri server nprocs ncores nthreads
X86_64     Intel_EM64T  60.0    16      1    63G    16G 352423M      0    Yes      2      8        1

RESOURCES: Not defined
RUN_WINDOWS:  (always open)

LOAD_THRESHOLDS:
r15s   r1m  r15m   ut    pg    io   ls   it   tmp   swp   mem   root maxroot processes clockskew netcard iptotal  cpuhz cachesize diskvolume processesroot   ipmi powerconsumption ambienttemp cputemp
-   3.5     -    -     -     -    -    -     -     -     -      -       -         -         -       -       -      -         -          -             -      -                -           -       -

 

7. References:

  1. Platform – Working with hosts

Tracking Batch Jobs at Platform LSF

The content of this article is taken from http://users.cs.fiu.edu/~tho01/psg/3rdParty/lsf4_userGuide/07-tracking.html

1. Displaying All Job Status

# bjobs -u all

2. Report Reasons why a job is pending

# bjobs -p

3. Report Pending Reasons with host names for each condition

# bjobs -lp

4. Detailed Report on a specific job

# bjobs -l 6653

5. Reasons why the job is suspended

# bjobs -s

6. Displaying Job History

# bhist -l 12345

To peek at the output of a job while it is still running, use bpeek:

# bpeek 12345

7. Killing Jobs

# bkill 12345

8. Suspend (stop) the Job

# bstop 12345

9. Resume the Job

# bresume 12345