Basic Tracing of Job Issues in PBS Professional

Step 1: Proceed to the Head Node (Scheduler)

Once you have the Job ID you wish to investigate, go to the Head Node and run

% tracejob jobID

From the tracejob output, you will be able to see which node the job landed on. Next, log in to that node and look for more information in the mom_logs

% vim /var/spool/pbs/mom_logs/thedateyouarelookingat

For example,

% vim /var/spool/pbs/mom_logs/20201211

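In Vim, search for the Job ID (? searches backward, / searches forward). Do not put a space before the Job ID, or the space becomes part of the search pattern

?yourjobID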

The log entries around the Job ID should give you a good hint of what happened. In my case, the NVIDIA drivers on the node were having issues.
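As an alternative to opening the log in Vim, the same search can be sketched with grep. The helper name find_job_in_mom_log and the Job ID 1234.headnode below are made up for illustration, and the default directory assumes the mom_logs path used above.

```shell
# Sketch: search a MoM log for a Job ID without opening an editor.
# find_job_in_mom_log is a hypothetical helper, not a PBS command.
find_job_in_mom_log() {
    job_id="$1"                                # e.g. 1234.headnode (made-up ID)
    log_date="$2"                              # YYYYMMDD, e.g. 20201211
    log_dir="${3:-/var/spool/pbs/mom_logs}"    # assumed default, same path as above
    # -F treats the Job ID as a literal string, -n prints line numbers
    grep -n -F -- "$job_id" "$log_dir/$log_date"
}
```

For example, run "find_job_in_mom_log 1234.headnode 20201211" on the node in question. If the job ran on an earlier day, "tracejob -n days jobID" on the Head Node should widen the number of days of logs that tracejob searches.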

 

Resolving Altair Access “Incorrect UserName or Password”

If you see an error like “Incorrect UserName or Password”, run the following on the main system supporting the Visualisation Server (this may or may not be the server hosting the Altair Access services).

/etc/init.d/altairlmxd stop
/etc/init.d/altairlmxd start
/etc/init.d/pbsworks-pa restart

On the Altair Access Server,

/etc/init.d/guacd restart

 

 

Restrict Number of Queued and Running Jobs with PBS Professional

Apply maximum queued jobs limit at Server Level

% qmgr -c "set server max_queued = [u:PBS_GENERIC=128]"

Apply maximum queued jobs limit at Queue Level

% qmgr -c "set queue your-queue-name max_queued = [u:PBS_GENERIC=128]"

Apply maximum running jobs limit at Server Level

% qmgr -c "set server max_run = [u:PBS_GENERIC=128]"

Apply maximum running jobs limit at Queue Level

% qmgr -c "set queue your-queue-name max_run = [u:PBS_GENERIC=128]"
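After applying the limits, it can help to confirm what is actually set. The sketch below filters the output of qmgr's "print server" command for the two attributes used above. The helper name show_pbs_limits is made up for illustration, and it assumes "print server" also lists the queue settings on your installation.

```shell
# Sketch: list the max_queued / max_run limits currently configured.
# show_pbs_limits is a hypothetical helper name, not a PBS command.
show_pbs_limits() {
    # "print server" dumps the configuration as qmgr commands;
    # keep only the lines that mention max_queued or max_run
    qmgr -c "print server" | grep -E 'max_(queued|run)'
}
```

Run "show_pbs_limits" on the Head Node. Each limit you set should appear as a "set server ..." or "set queue ..." line.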