How to develop and test code with IBM bcx
Open MPI is available.
Slurm is the batch-mode job scheduler; srun and squeue are examples of its commands.
By default we have access to icc 10.1, gcc 3.3.3, and Open MPI 1.2rc1.
To build a non-MPI executable, run make in source (for g++), source/sys_gcc (for g++), or source/sys_icc (for icc).
To build an MPI-enabled executable, run make in source/sys_mpi_icc (the default mpiCC on the bcx is based on icc 10.1).
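The choice of build directory can be wrapped in a small helper. This is only a minimal sketch and not part of the Cloudy distribution: the function name build_dir and the flavour keywords gcc, icc, and mpi are invented for illustration, and the paths are assumed to be relative to the top of the checked-out tree.

```shell
# build_dir: hypothetical helper that maps a build flavour to the
# directory (relative to the top of the tree) in which to run make.
build_dir() {
    case "$1" in
        gcc) echo "source/sys_gcc" ;;      # non-MPI build with g++
        icc) echo "source/sys_icc" ;;      # non-MPI build with icc
        mpi) echo "source/sys_mpi_icc" ;;  # MPI build, mpiCC based on icc
        *)   echo "unknown flavour: $1" >&2; return 1 ;;
    esac
}
```

From the top of the tree, an MPI build would then be started with (cd "$(build_dir mpi)" && make).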
running the test suite
To run the test suite as a parallel batch job do
srun -b -n <nn> run_parallel.pl sys_gcc bcx
where <nn> is the number of CPUs you want. The run_parallel.pl script contains a complete description of how to run on this machine. The best value of <nn> is set by load balancing across the test suite: Peter suggests <nn> = 16-20 for auto, <nn> = 8 for slow, and <nn> = 12 for the combined pair of test suites. These are all run with the run_parallel.pl script in tsuite.
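The suggested processor counts can be kept in one place with a helper like the one below. It is a sketch under stated assumptions: the function names suite_cpus and test_suite_cmd and the keyword both are hypothetical, 16 is simply the lower end of the suggested 16-20 range, and the helper only prints the srun line instead of submitting it. How run_parallel.pl chooses which suite to run is described in the script itself and is not modelled here.

```shell
# suite_cpus: hypothetical helper returning a CPU count for each test
# suite, following the suggestions quoted above.
suite_cpus() {
    case "$1" in
        auto) echo 16 ;;   # suggested range is 16-20; lower end chosen here
        slow) echo 8 ;;
        both) echo 12 ;;   # the combined pair of test suites
        *)    echo "unknown suite: $1" >&2; return 1 ;;
    esac
}

# test_suite_cmd: print (do not run) the srun line for the given suite.
test_suite_cmd() {
    n=$(suite_cpus "$1") || return 1
    echo "srun -b -n $n run_parallel.pl sys_gcc bcx"
}
```

Printing the command first makes it easy to check the processor count before the job is actually submitted.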
running a single model in batch mode
I created a script, brun, which is on my path. It contains
srun -b /home/gary/cloudy/trunk/source/sys_icc/cloudy.exe -r $1
Then the input script model.in could be computed by passing its base name to brun, for example with the command
brun model
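A slightly more defensive version of the same idea might look like the sketch below. It is an illustration only, not the author's actual script: the function name brun_cmd and the variable CLOUDY_EXE are hypothetical, the executable path is the one quoted above, and the function prints the srun line instead of running it so that it can be inspected first.

```shell
# Path to the executable, as quoted above; adjust it for your own checkout.
CLOUDY_EXE=/home/gary/cloudy/trunk/source/sys_icc/cloudy.exe

# brun_cmd: hypothetical helper that prints the batch submission line
# for a single model.
brun_cmd() {
    if [ $# -ne 1 ]; then
        echo "usage: brun_cmd <model>   (reads <model>.in)" >&2
        return 1
    fi
    # Strip a trailing .in so both "model" and "model.in" are accepted.
    echo "srun -b $CLOUDY_EXE -r ${1%.in}"
}
```

Replacing the echo with the command itself would turn the function back into a submitting script.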
a multi-way MPI grid/optimization run
This is a two-step process. First create a batch script, then submit it with the batch scheduler.
A minimal batch script to run MPI Cloudy would be something like this:
#!/bin/csh
mpirun /home/gary/cloudy/trunk/source/sys_mpi_icc/cloudy.exe -r $1
If this is contained in mpirun.cs then it would be submitted as follows:
srun -b -n <nn> mpirun.cs feii
where <nn> is the number of processors you want. This will use an input file given by the last parameter on the srun command, feii.in in this case.
The srun command also has the following options to notify you when a batch job changes state:
--mail-type=<type>
Notify user by email when certain event types occur. Valid type values are BEGIN, END, FAIL, ALL (any state change). The user to be notified is indicated with --mail-user.
--mail-user=<user>
User to receive email notification of state changes as defined by --mail-type. The default value is the submitting user.
So adding "--mail-type=END --mail-user=rporter@xxx" to the srun command would announce the end of the batch job.
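Putting the submission and the mail options together, the full command line could be assembled as in the following sketch. The function name mpi_submit_cmd and its argument order are hypothetical, the batch script is assumed to be the mpirun.cs file described above, and the function only prints the resulting srun line.

```shell
# mpi_submit_cmd: hypothetical helper that prints the srun line for an
# MPI run, with optional email notification.
# Arguments: <nn> <model> [mail-type] [mail-user]
mpi_submit_cmd() {
    nn=$1; model=$2; mtype=${3:-}; muser=${4:-}
    # Accept only the mail types listed above (or none at all).
    case "$mtype" in
        ""|BEGIN|END|FAIL|ALL) ;;
        *) echo "invalid mail type: $mtype" >&2; return 1 ;;
    esac
    cmd="srun -b -n $nn"
    [ -n "$mtype" ] && cmd="$cmd --mail-type=$mtype"
    [ -n "$muser" ] && cmd="$cmd --mail-user=$muser"
    # Options must come before the batch script and its argument.
    echo "$cmd mpirun.cs $model"
}
```

For example, mpi_submit_cmd 8 feii END rporter@xxx prints a line that runs feii.in on 8 processors and sends mail when the job ends.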
other slurm options
squeue -u <name>
will list all jobs in the queue belonging to user <name>
scancel <nn>
will cancel the run with job ID number <nn>