These are the steps to install OpenPBS on a Linux cluster. The following refers to version 2.3.16.
Download and unpack the distribution. Run ./configure with the appropriate options. By default, the package is installed in /usr/local (in particular /usr/local/bin and /usr/local/sbin, which should be fine for our purposes), the working directory (PBS_HOME) is /usr/spool/PBS, and the scheduler is fifo. Configure with --with-scp; for everything to run smoothly, root on the server should be able to ssh into the execution hosts without a password, and vice versa.
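A quick way to confirm the passwordless ssh requirement before going any further is sketched below. The function name check_ssh_hosts and the host names are made up for illustration; the sketch relies only on the ssh option BatchMode=yes, which makes ssh fail rather than prompt for a password.

```shell
#!/bin/sh
# check_ssh_hosts HOST...: report which hosts accept a non-interactive ssh login.
# BatchMode=yes makes ssh fail instead of prompting for a password.
# (check_ssh_hosts is a hypothetical helper, not part of OpenPBS.)
check_ssh_hosts() {
    for host in "$@"; do
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "$host ok"
        else
            echo "$host FAILED"
        fi
    done
}

# Example (hypothetical host names), run as root on the server:
#   check_ssh_hosts node-1 node-2
```

Running the same check as root on an execution host, with the server as the argument, covers the "vice versa" direction.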
Patches needed with gcc 3.2.2 (apply them before configuring): in buildutils/makedepend-sh, modify the
eval $CPP… command at line 576 to include
grep -v ">$" | \ after
grep -v "$s\$" | \. Modify
src/lib/Liblog/pbs_log.c by adding
at the top. Same for src/server/svr_connect.c.
Change into doc and run
make install to install the man pages.
Run make and make install. If you have mounted /usr/local on other machines for access to the server, execution, and client applications, you may need to change permissions on /usr/local/sbin/pbs_mom. Each execution host will also need a minimal pbs_mom directory structure. Copy buildutils/pbs_mkdirs from the source directory to the execution hosts, then on each of them run
    sh pbs_mkdirs mom
    sh pbs_mkdirs aux
    sh pbs_mkdirs default
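With more than a handful of execution hosts, it may be easier to generate these commands than to type them. The sketch below only prints the commands for one host so they can be reviewed first; the helper name mkdirs_commands and the /tmp/pbs_mkdirs destination are my own assumptions, not part of the distribution.

```shell
#!/bin/sh
# mkdirs_commands HOST: print the commands that copy pbs_mkdirs to HOST
# and create the mom directories there. Nothing is executed here.
# Assumptions: run from the top of the source tree; /tmp on HOST is writable.
mkdirs_commands() {
    host=$1
    echo "scp buildutils/pbs_mkdirs $host:/tmp/pbs_mkdirs"
    for target in mom aux default; do
        # -n keeps ssh from reading the script's standard input
        echo "ssh -n $host sh /tmp/pbs_mkdirs $target"
    done
}

# Example (hypothetical host names): write the commands to a file, check it, run it.
#   for h in node-1 node-2; do mkdirs_commands "$h"; done > setup_moms.sh
#   sh setup_moms.sh
```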
In order, run pbs_server, pbs_sched, and pbs_mom (they are in /usr/local/sbin). The first time only, start pbs_server with the -t create option.
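If the server is started from a boot script, the "first time only" rule can be automated along these lines. This is a sketch under one assumption that should be checked against your installation: that the server keeps its database in PBS_HOME/server_priv/serverdb, so a missing file means the server has never been initialized. The helper name server_start_cmd is hypothetical.

```shell
#!/bin/sh
# server_start_cmd PBS_HOME: print the pbs_server command line to use.
# Assumption: the server database is PBS_HOME/server_priv/serverdb,
# and its absence means this is the first start.
server_start_cmd() {
    pbs_home=$1
    if [ -f "$pbs_home/server_priv/serverdb" ]; then
        echo "/usr/local/sbin/pbs_server"
    else
        echo "/usr/local/sbin/pbs_server -t create"
    fi
}

# Example: print the command, then run it once it looks right.
#   server_start_cmd /usr/spool/PBS
```

Printing the command rather than running it is deliberate: -t create initializes the server from scratch, so it is worth a look before it goes anywhere near an existing configuration.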
Before starting the moms, you need to create the config file PBS_HOME/mom_priv/config, containing (assuming your PBS server is tweedledee)
    $logevent 0x1ff
    $clienthost tweedledee
Update: Rajarshi Guha suggests also adding
    $clienthost tweedledee.caltech.edu
    $max_load 2.0
    $ideal_load 1.0
    $usecp tweedledee.caltech.edu:/home /home
where the first line is the fully qualified hostname of the server, the second and third set the load above which a mom will not take a job and the load it will wait for before taking new jobs after refusing one, and the last line directs the moms to copy log files with cp (over NFS, assuming it's available) in case scp does not work. This config should also go on all the execution hosts. When this is done, start pbs_mom on all the hosts.
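Since the same file has to land on every execution host, it can help to generate it from a small function instead of editing it by hand. The sketch below writes the settings discussed above to a file of your choice; the function name write_mom_config is hypothetical, and the load limits are simply the values quoted above.

```shell
#!/bin/sh
# write_mom_config SERVER FQDN OUTFILE: write a mom config file to OUTFILE.
# SERVER is the short server name, FQDN its fully qualified name.
# The directives start with a literal "$", hence the single quotes.
write_mom_config() {
    server=$1; fqdn=$2; out=$3
    {
        printf '%s\n' '$logevent 0x1ff'
        printf '$clienthost %s\n' "$server"
        printf '$clienthost %s\n' "$fqdn"
        printf '%s\n' '$max_load 2.0' '$ideal_load 1.0'
        printf '$usecp %s:/home /home\n' "$fqdn"
    } > "$out"
}

# Example: write to a scratch file first, then copy it to PBS_HOME/mom_priv/config.
#   write_mom_config tweedledee tweedledee.caltech.edu /tmp/mom_config
```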
Update: When everything is started, run
qmgr and set up some options:
    set s email@example.com
    set s acl_hosts=tweedledee.caltech.edu
    set s query_other_jobs=true
    create q standard queue_type=e
    set q standard resources_min.cput=1,resources_max.cput=12:00:00
    set q standard resources_default.cput=30:00
    set q standard enabled=true, started=true
    set s scheduling=true
Next, still in
qmgr, add all the nodes with
    create node node-1
    create node node-2
    …
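For a large cluster, the create node lines can be generated rather than typed. The sketch below assumes the nodes are named node-1 through node-N, as in the example above; node_cmds is a hypothetical helper. As far as I know, qmgr reads directives from standard input when no command is given on the command line, but check that against your version before piping anything into it.

```shell
#!/bin/sh
# node_cmds N: print "create node node-1" ... "create node node-N",
# one qmgr directive per line.
node_cmds() {
    i=1
    while [ "$i" -le "$1" ]; do
        echo "create node node-$i"
        i=$((i + 1))
    done
}

# Example: review the output first, then feed it to qmgr.
#   node_cmds 16
#   node_cmds 16 | qmgr
```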
Update: I like to have scripts in /usr/local/bin that execute a command on every node of the cluster and copy a file to every node: they would look something like
    for host in `cat /usr/spool/PBS/server_priv/nodes`; do
        echo -n "$host "
        ssh $host "$*"
    done

    for host in `cat /usr/spool/PBS/server_priv/nodes`; do
        echo -n "$host "
        scp $1 $host:$2
    done
[You’ll probably need to change permissions on the nodes file.] Now you're good to go!
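One caveat with the loops above: they treat every word in the nodes file as a host name, which breaks as soon as a line carries anything besides the name. A slightly more defensive variant is sketched below. It assumes each entry has the form "name[:ts] [properties...]" and that blank lines and lines starting with # can be skipped; node_hosts and on_all are hypothetical names.

```shell
#!/bin/sh
# node_hosts FILE: print one host name per line from a PBS nodes file.
# Assumptions: entries look like "name[:ts] [properties...]"; blank lines
# and lines starting with # are ignored.
node_hosts() {
    awk '!/^[ \t]*(#|$)/ { sub(/:ts$/, "", $1); print $1 }' "$1"
}

# on_all FILE COMMAND...: run COMMAND on every host listed in FILE.
on_all() {
    file=$1; shift
    for host in $(node_hosts "$file"); do
        printf '%s ' "$host"
        ssh -n "$host" "$*"
    done
}

# Example:
#   on_all /usr/spool/PBS/server_priv/nodes uptime
```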
© M. Vallisneri 2012 — last modified on 2010/01/29