Installing Torque 2.5 on CentOS 6

A. Configuring the TORQUE Server

Step 1: Download the Torque Software from Adaptive Computing

# wget <URL of the source tarball from the TORQUE Downloads page>

Step 2: Configure the Torque Server

./configure \
--prefix=/opt/torque \
--exec-prefix=/opt/torque/x86_64 \
--enable-docs \
--disable-gui \
--with-server-home=/var/spool/torque \
--enable-syslog \
--with-scp \
--disable-rpp \
--disable-spool \
--with-pam

Step 3: Compile Torque

# make
# make install

Step 4: Make packages for the clients

# make packages

You should now have the following files:

torque-package-doc-linux-x86_64.sh
torque.setup
torque-package-clients-linux-x86_64.sh
torque-package-mom-linux-x86_64.sh
torque_setup.sh
torque-package-devel-linux-x86_64.sh
torque-package-server-linux-x86_64.sh
torque-package-pam-linux-x86_64.sh  
torque.spec
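
A quick way to confirm that "make packages" produced every self-extracting package is to loop over the expected file names. This is only a sketch: the function name check_torque_packages is my own, and the package names are taken from the listing above.

```shell
# Hypothetical helper: report which Torque self-extracting packages are
# missing from a build directory (default: the current directory).
check_torque_packages() {
    local dir="${1:-.}"
    local missing=0
    local name pkg
    for name in doc clients mom devel server pam; do
        pkg="torque-package-${name}-linux-x86_64.sh"
        if [ ! -f "${dir}/${pkg}" ]; then
            echo "missing: ${pkg}"
            missing=1
        fi
    done
    # Exit status 0 when everything is present, 1 otherwise.
    return "$missing"
}

# Usage, from the Torque source directory after "make packages":
#   check_torque_packages . && echo "all packages present"
```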

Step 5: Installing Torque as a service (pbs_mom)

I was unable to get the default init.d script found at $TORQUE/contrib/init.d to run as a service. A workaround is to use the open-source xCAT, which ships a working pbs_mom script at /opt/xcat/share/xcat/netboot/add-on/torque/pbs_mom. To install the latest xCAT, you may want to read the blog entry Dependency issues when installing xCAT 2.7 on CentOS 6.

Assuming you have successfully installed xCAT, copy the pbs_mom script to /etc/init.d/pbs_mom

# cp /opt/xcat/share/xcat/netboot/add-on/torque/pbs_mom /etc/init.d/pbs_mom

Step 5a: Edit /etc/init.d/pbs_mom and start the service

# vim /etc/init.d/pbs_mom

Inside the pbs_mom script, set the following:

BASE_PBS_PREFIX=/opt/torque

#ulimit -n 20000
#ulimit -i 20000
ulimit -l unlimited

Save and exit.
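
The BASE_PBS_PREFIX edit can also be scripted instead of made by hand in vim. The following is a minimal sketch, assuming the init script already contains a line that starts with BASE_PBS_PREFIX= (as the xCAT-supplied scripts used here do). The function name set_pbs_prefix is hypothetical, and the ulimit lines still need to be adjusted separately.

```shell
# Hypothetical helper: point BASE_PBS_PREFIX in an init script at the
# Torque installation prefix.
set_pbs_prefix() {
    local script="$1"
    local prefix="$2"
    local tmp
    tmp=$(mktemp) || return 1
    # "|" is used as the sed delimiter so the slashes in the path need
    # no escaping. Writing back with cat keeps the script's permissions.
    sed "s|^BASE_PBS_PREFIX=.*|BASE_PBS_PREFIX=${prefix}|" "$script" > "$tmp" \
        && cat "$tmp" > "$script"
    local rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Usage:
#   set_pbs_prefix /etc/init.d/pbs_mom /opt/torque
#   grep '^BASE_PBS_PREFIX=' /etc/init.d/pbs_mom
```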

At the console, start the pbs_mom service

# service pbs_mom start
Starting PBS Mom:                                          [  OK  ]

Step 5b: Installing Torque as a service (pbs_server)

# cp /opt/xcat/share/xcat/netboot/add-on/torque/pbs_server /etc/init.d/pbs_server

Inside the pbs_server script, just ensure that BASE_PBS_PREFIX points to the right directory

BASE_PBS_PREFIX=/opt/torque

Save and Exit.

At the console, start the pbs_server service

# service pbs_server start
Starting PBS Server:                                       [ OK ]

Step 5c: Installing Torque as a service (pbs_sched)

# cp /opt/xcat/share/xcat/netboot/add-on/torque/pbs_sched /etc/init.d/pbs_sched

Inside the pbs_sched script, just ensure that BASE_PBS_PREFIX points to the right directory

BASE_PBS_PREFIX=/opt/torque

Save and Exit.

At the console, start the pbs_sched service

# service pbs_sched start
Starting PBS Scheduler:                                    [ OK ]
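
Once all three services have been started, a short loop can confirm that they are still up. This sketch assumes that each init script implements the "status" action, which I have not verified for the xCAT-supplied scripts. The function name check_pbs_services is hypothetical.

```shell
# Hypothetical helper: report the status of the three Torque daemons.
# Assumes "service <name> status" returns 0 only when the daemon runs.
check_pbs_services() {
    local svc
    local failed=0
    for svc in pbs_server pbs_sched pbs_mom; do
        if service "$svc" status >/dev/null 2>&1; then
            echo "${svc}: running"
        else
            echo "${svc}: NOT running"
            failed=1
        fi
    done
    # Non-zero exit status if any daemon is down.
    return "$failed"
}

# Usage:
#   check_pbs_services || echo "at least one Torque daemon is down"
```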

B. Configuring the TORQUE Clients

Step 1a: Copy the torque package to the nodes using xCAT

# pscp torque-package-mom-linux-x86_64.sh compute:/tmp
# pscp torque-package-clients-linux-x86_64.sh compute:/tmp

Step 1b: Run the scripts

# psh compute "/tmp/torque-package-mom-linux-x86_64.sh --install"
# psh compute "/tmp/torque-package-clients-linux-x86_64.sh --install"

Step 2a: Copy /etc/init.d/pbs_mom to the compute nodes and start the service

# pscp /etc/init.d/pbs_mom compute:/etc/init.d
# psh compute "/sbin/service pbs_mom start"
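
The client steps in this section can be collected into one function. This is a sketch only, using the same xCAT pscp and psh commands and the "compute" noderange from above. The function name deploy_torque_clients is hypothetical, and it is meant to be run as root from the directory that holds the packages.

```shell
# Hypothetical helper: push the mom and clients packages to a noderange,
# install them, then copy the pbs_mom init script and start the service.
deploy_torque_clients() {
    local noderange="$1"
    local pkg
    for pkg in torque-package-mom-linux-x86_64.sh \
               torque-package-clients-linux-x86_64.sh; do
        pscp "$pkg" "${noderange}:/tmp" || return 1
        # One psh call per package: a glob matching several packages
        # would run the first one with the others as its arguments.
        psh "$noderange" "/tmp/${pkg} --install" || return 1
    done
    pscp /etc/init.d/pbs_mom "${noderange}:/etc/init.d" || return 1
    psh "$noderange" "/sbin/service pbs_mom start"
}

# Usage:
#   deploy_torque_clients compute
```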

Further Information:

  1. Configuring the Torque Default Queue
  2. Adding and Specifying Compute Resources at Torque

Resources:

  1. TORQUE installation overview

Enabling Torque for email notification

Step 1:

  1. See the article Configuring CentOS 5 as an SMTP Mail Client with sendmail to configure your Torque Server as an SMTP mail client.

Step 2:

Ensure the Torque Server configuration contains this line:

  1. “set server mail_from = adm” (you can replace adm with another userid of your choice). You may also want to take a look at Setting up Torque Server on xCAT 2.x from Linux Toolkit.
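
The mail_from attribute is a server setting, so it can be applied and checked with qmgr on the Torque Server. A minimal sketch, assuming qmgr is on the PATH and is run by root or a Torque manager. The function name set_mail_from is hypothetical.

```shell
# Hypothetical helper: set the sender used for Torque job mail and
# print the resulting line from the server configuration.
set_mail_from() {
    local sender="$1"
    qmgr -c "set server mail_from = ${sender}" || return 1
    # "print server" dumps the configuration as qmgr directives.
    qmgr -c "print server" | grep "mail_from"
}

# Usage:
#   set_mail_from adm
```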

Step 3:

Finally, for the batch system to email the user when a job starts, ends, or aborts, set two options in the submission script:

  1. The -m switch, which defines the events that trigger a notification
  2. The -M switch, which defines the address the notification is sent to

For example,

# Send notification when job starts.
#PBS -m b
# Send notification when job finishes or aborts.
#PBS -m ea
# Send notification when job starts, finishes, or aborts.
#PBS -m bea
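
The same switches can be passed to qsub on the command line. The small check below is a sketch that rejects a mistyped -m argument before submission. It accepts the letters a (abort), b (begin), and e (end) in any combination, or the single letter n for no mail. The function name valid_mail_opts, the address user@example.com, and the script name job.sh are placeholders.

```shell
# Hypothetical helper: validate a -m argument before handing it to qsub.
valid_mail_opts() {
    case "$1" in
        n) return 0 ;;                  # "n" alone: send no mail
        "" | *[!abe]*) return 1 ;;      # empty, or a letter other than a/b/e
        *) return 0 ;;
    esac
}

# Usage:
#   valid_mail_opts bea && qsub -m bea -M user@example.com job.sh
```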

A typical submission script looks like this:

#!/bin/bash
#PBS -N jobname
#PBS -j oe
#PBS -V
#PBS -m bea
#PBS -M kittycool@linucluster.wordpress.com
#PBS -l nodes=2:ppn=8

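## pre-processing script
# Change to the directory the job was submitted from.
cd "$PBS_O_WORKDIR"
# One line per allocated processor slot in the node file.
NCPUS=$(wc -l < "$PBS_NODEFILE")
echo "$NCPUS"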
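
The node file lists one host name per allocated processor slot, so a request of nodes=2:ppn=8 gives 16 lines naming 2 distinct hosts. The sketch below reports both numbers. The function name summarize_nodefile is hypothetical.

```shell
# Hypothetical helper: count processor slots and distinct hosts in a
# PBS node file.
summarize_nodefile() {
    local nodefile="$1"
    local slots nodes
    # tr strips the padding some wc implementations add.
    slots=$(wc -l < "$nodefile" | tr -d ' ')
    nodes=$(sort -u "$nodefile" | wc -l | tr -d ' ')
    echo "slots=${slots} nodes=${nodes}"
}

# Usage inside a job script:
#   summarize_nodefile "$PBS_NODEFILE"
```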

Commonly used qstat options

Option             Description
qstat -i           Display non-running jobs in the alternative format
qstat -r           Display running jobs
qstat -n           In addition to the basic information, list the nodes allocated to each job
qstat -u user(s)   Display jobs belonging to the given user or users
qstat -Q           Status of queues
qstat -Q -f        Full status of queues
qstat -q           Status of queues in the alternative format
qstat -B           Batch server status
qstat -B -f        Full batch server status, including configuration