Information about TIFR-CAM cluster
Hardware and software
Torque/PBS
The nodes are divided into two groups; see the file /opt/torque/server_priv/nodes
PBS is a batch system that manages parallel applications submitted by users. On the cluster, PBS uses Maui as the scheduler. Jobs are submitted to PBS using a script; examples are given below under the OpenMPI and MPICH2 sections. If the script is called famosa.pbs, you can submit the job to PBS using
$ qsub famosa.pbs
Here are some parameters that can be given in a PBS script file:
* -N jobname (name the job)
* -q @nic-cluster.cc.umr.edu (the server to send the job to; the general form of the destination is queue@server)
* -e errfile (redirect standard error to a file named errfile)
* -o outfile (redirect standard output to a file named outfile)
* -j oe (combine standard output and standard error)
* -l walltime=N (request a walltime of N in the form hh:mm:ss)
* -l cput=N (request N seconds of CPU time, or a time in the form hh:mm:ss)
* -l mem=N[KMG][BW] (request a total of N {kilo|mega|giga}{bytes|words} of memory on all requested processors together)
* -l nodes=N:ppn=M (request N nodes with M processors per node)
* -m abe (mail the user when the job aborts, begins running, or ends)
* -S shell (use shell instead of your login shell to interpret the batch script; must include a complete path)
* -V (job inherits the full environment of the current shell, including $DISPLAY)
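The options above are usually written as #PBS directive lines at the top of the job script. The following is a minimal sketch of how such a header could be generated; the function names to_hms and pbs_header are hypothetical helpers, not tools installed on the cluster, and the set of directives is only an illustration.

```shell
# Convert a number of seconds to the hh:mm:ss form accepted by -l walltime.
to_hms() {
    s=$1
    printf '%02d:%02d:%02d\n' $((s / 3600)) $((s % 3600 / 60)) $((s % 60))
}

# Print PBS directives for a job.
# Arguments: job name, number of nodes, processors per node, walltime in seconds.
pbs_header() {
    echo "#PBS -N $1"
    echo "#PBS -l nodes=$2:ppn=$3"
    echo "#PBS -l walltime=$(to_hms "$4")"
    echo "#PBS -j oe"
    # Single quotes keep the variable unexpanded until the job runs.
    echo 'cd $PBS_O_WORKDIR'
}
```

For example, pbs_header demo 2 4 3600 > demo.pbs writes a header requesting 2 nodes with 4 processors each for one hour; the command to run would then be appended to demo.pbs before submitting it with qsub.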
Once a job is submitted, you can check its status using qstat
[praveen@master piaggio_pso]$ qstat
Job id Name User Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
25.master 20080919_3 roms 22:58:50 R default
28.master FAMOSA praveen 00:11:11 R default
To get more detailed information, use qstat -f or qstat -f <jobid>
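When many jobs are listed, it can help to pick out the job ids belonging to one user. The sketch below filters the default qstat listing with awk; the function name jobs_of_user is hypothetical, and it assumes the two header lines and the column order shown in the listing above.

```shell
# Read default qstat output on stdin and print the job ids owned by one user.
# Assumes two header lines, with the user name in the third column.
jobs_of_user() {
    awk -v user="$1" 'NR > 2 && $3 == user { print $1 }'
}
```

A typical call would be qstat | jobs_of_user praveen.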
To delete a running job, use
$ qdel <jobid>
If the job is not killed by the above command, then force it using
$ qdel -p <jobid>
Note: qpeek was not working when Torque was installed from Rocks 5. It would give an error that there is no file in /opt/torque/spool. After commenting out line 142 in /opt/torque/bin/qpeek, it works.
PBS troubleshooting
Sometimes PBS or Maui may not have started properly. To start PBS, run
/sbin/service pbs_server start
If PBS is working but jobs stay queued and never start, then Maui may not be running. Start it with
/usr/local/maui/sbin/maui
If qstat shows only your own jobs instead of all jobs, the server setting has to be changed as follows, which needs root access
qmgr -ac "set server query_other_jobs = True"
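Before restarting anything, it may be worth checking which of the two daemons is actually missing. The following sketch reports their state from a list of process names; the function names has_process and report_daemons are hypothetical, and it assumes the daemons appear in the process list as pbs_server and maui.

```shell
# Succeed if one of the process names read from stdin equals $1 exactly.
has_process() {
    grep -qxF -- "$1"
}

# Report the state of the two daemons, given one process name per line on stdin.
report_daemons() {
    names=$(cat)
    for d in pbs_server maui; do
        if printf '%s\n' "$names" | has_process "$d"; then
            echo "$d: running"
        else
            echo "$d: not running"
        fi
    done
}
```

On the master node this could be used as ps -e -o comm= | report_daemons.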
Selecting MPI version
Several versions of MPI are installed. To list the available versions, run
[praveen@turing ~]$ mpi-selector --list
mvapich2_gcc-1.6
mvapich_gcc-1.2.0
openmpi_gcc-1.4.3
openmpi_gcc44-1.4.3
pgimpi
I recommend using openmpi_gcc44-1.4.3 since it works well with PBS. You set this by
[praveen@turing ~]$ mpi-selector --set openmpi_gcc44-1.4.3
Defaults already exist; overwrite them? (y/N) y
You need to log out and log in again for the paths to be updated. You can check your MPI setting by
[praveen@turing ~]$ mpi-selector --query
default:openmpi_gcc44-1.4.3
level:user
Also check that PATH has been set correctly; you should get the following output.
[praveen@turing ~]$ which mpirun
/opt/openmpi-1.4.3/bin/mpirun
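The same check can be scripted, for instance at the top of a job script, so that a wrong MPI selection is caught before the run starts. This is only a sketch; the function name mpirun_under is hypothetical.

```shell
# Succeed if the mpirun found first on PATH lives under the given prefix.
mpirun_under() {
    prefix=$1
    found=$(command -v mpirun) || return 1
    case $found in
        "$prefix"/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, mpirun_under /opt/openmpi-1.4.3 || echo "unexpected mpirun on PATH" prints a warning when a different installation is picked up.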
OpenMPI
OpenMPI was compiled with the following configure options
./configure --with-tm=/opt/torque --prefix=/opt/openmpi-1.2.7 \
--enable-prefix-by-default --enable-static
After compiling, check that all required features are enabled using ompi_info. In particular, to verify that torque support is built in, do
[praveen@master ]$ /opt/openmpi-1.2.7/bin/ompi_info |grep tm
MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2.7)
MCA ras: tm (MCA v1.0, API v1.3, Component v1.2.7)
MCA pls: tm (MCA v1.0, API v1.3, Component v1.2.7)
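Note that a plain grep for tm also matches unrelated lines such as ptmalloc2. A narrower check can be sketched as follows; the function name has_tm_support is hypothetical, and plm is included on the assumption that newer OpenMPI releases use that name for the launcher framework instead of pls.

```shell
# Succeed if ompi_info output on stdin lists a tm component
# for the resource allocation (ras) or launcher (pls/plm) framework.
has_tm_support() {
    grep -Eq 'MCA (ras|pls|plm): tm '
}
```

It would be used as /opt/openmpi-1.2.7/bin/ompi_info | has_tm_support && echo "torque support found".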
The following is an example PBS script for use with OpenMPI. Set PATH and LD_LIBRARY_PATH if they are not set in your .bashrc file; here they are commented out.
#PBS -N "rae2822"
#PBS -l "nodes=5:hardy:ppn=6"
#PBS -l "walltime=48:00:00"
#PBS -j oe
#PBS -o famosa.log
#PBS -m e
#export OPENMPI=/opt/openmpi-1.2.7
#export PATH=$OPENMPI/bin:$PATH
#export LD_LIBRARY_PATH=$OPENMPI/lib
cd $PBS_O_WORKDIR
mpirun $HOME/src/famosa/build/bin/Famosa_mpi
MPICH2
MPICH2 is installed using Rocks in /opt/mpich2/gnu and uses gfortran as the Fortran compiler.
Using /opt/mpich2/gnu/bin/mpirun
#PBS -N "FAMOSA"
#PBS -l "nodes=5:hardy:ppn=6"
#PBS -l "walltime=00:10:00"
#PBS -j oe
#PBS -o "famosa.log"
#PBS -m e
export LD_LIBRARY_PATH=/opt/mpich2/gnu/lib
export PATH=/opt/mpich2/gnu/bin:$PATH
# go to the working directory
cd $PBS_O_WORKDIR
# start the mpd daemon on all nodes
N_ALL=`cat $PBS_NODEFILE | wc -l`
N_UNI=`sort -u < $PBS_NODEFILE | wc -l`
cp $PBS_NODEFILE ./nodes_all.txt
sort -u < $PBS_NODEFILE > nodes_unique.txt
mpdboot -n $N_UNI -f nodes_unique.txt
sleep 10
mpirun -n $N_ALL -machinefile nodes_all.txt ~/src/famosa/build/bin/Famosa_mpi
mpdallexit
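The script above relies on two counts taken from $PBS_NODEFILE, which lists each host once per requested processor: the number of lines goes to mpirun -n, and the number of distinct hosts goes to mpdboot -n. The sketch below isolates that logic so it can be tried on any file; the function names count_slots and count_hosts are hypothetical.

```shell
# Total number of slots: the node file has one line per processor.
count_slots() {
    wc -l < "$1" | tr -d ' '
}

# Number of distinct hosts in the node file.
count_hosts() {
    sort -u "$1" | wc -l | tr -d ' '
}
```

With nodes=5:ppn=6 one would expect count_slots "$PBS_NODEFILE" to give 30 and count_hosts "$PBS_NODEFILE" to give 5.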
Using /opt/mpiexec/bin/mpiexec
Use mpiexec in /opt/mpiexec to launch MPICH2 programs under PBS. An example script is given below
#PBS -N "rae2822"
#PBS -l "nodes=5:hardy:ppn=6"
#PBS -l "walltime=48:00:00"
#PBS -j oe
#PBS -o famosa.log
#PBS -m e
cd $PBS_O_WORKDIR
/opt/mpiexec/bin/mpiexec --comm=pmi $HOME/src/famosa/build/bin/Famosa_mpi
Useful commands
cluster-fork
This command can be used to execute something on all nodes. For example, to see the list of processes for user praveen, do
cluster-fork ps -U praveen
To run some command only on a particular set of nodes, use
cluster-fork -n "c0-0 c0-1 c0-2 c0-3 c0-4" ps -U praveen
Another way is to use
cluster-fork --nodes="c0-%d:5-14" ps -U praveen
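To preview which nodes a pattern of this kind would cover, the expansion can be imitated in the shell. This sketch assumes that "c0-%d:5-14" means the names c0-5 through c0-14, which is a reading of the example above rather than something checked against the cluster-fork source; the function name expand_nodes is hypothetical.

```shell
# Print one node name per integer from first to last (inclusive),
# using a printf-style format such as 'c0-%d'.
# Arguments: format, first index, last index.
expand_nodes() {
    fmt=$1
    i=$2
    while [ "$i" -le "$3" ]; do
        printf "$fmt\n" "$i"
        i=$((i + 1))
    done
}
```

For example, expand_nodes 'c0-%d' 5 14 prints ten names, one per line; joining them with spaces gives a list in the form accepted by the -n option.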
checkjob
This command gives some information about a submitted job
checkjob -v <JOBID>
where JOBID is given by qstat.
showq
showq gives a concise summary of all jobs running or in the queue.
showscript
showscript will return the contents of the PBS script that you have submitted. The only argument is the job's PBS jobid.
mjobctl
You can use this to suspend or resume a PBS job. See the help output
[praveen@master ]$ mjobctl --help
Usage: mjobctl [FLAGS]
--about
--configfile=<FILENAME>
--format=<FORMAT>
--help
--host=<SERVERHOSTNAME>
--keyfile=<FILENAME>
--loglevel=<LOGLEVEL>
--port=<SERVERPORT>
--version
-c <JOBID> // CANCEL
-C <JOBID> // CHECKPOINT
-h <JOBID> // HOLD
-r <JOBID> // RESUME
-R <JOBID> // REQUEUE
-s <JOBID> // SUSPEND
-S <JOBID> // SUBMIT
-x <JOBID> // EXECUTE