
Running Cloudy C13 and earlier in parallel using a makefile or GNU parallel

Running a number of different models with a makefile

Christophe Morisset wrote a makefile that can be used to run a number of models in parallel. Follow these steps:

Create each simulation as a separate input script, with names that a wildcard search can match. Names ending in ".in" work well, since ls *.in then finds all the input scripts you want to run. Examples might be "dog.in, cat.in, mouse.in, horse.in".

Download Christophe's makefile from here. Edit the Makefile to set the correct path to the Cloudy executable. To run the simulations on N cores, type

make -j N

The makefile includes an option to run only a subset of the models, by specifying part of the filenames. For instance, you could run the "model1*.in" set by specifying

make -j N name='model1'
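If make is not available, the same idea can be sketched in plain shell. This is not Christophe's makefile, only a rough stand-in that assumes each model is run as "cloudy.exe -r name" (reading name.in and writing name.out) and that xargs supports the -P option (GNU and BSD versions do). The CLOUDY variable, its placeholder path, and the run_models function are names made up for this illustration.

```shell
# Assumed location of the Cloudy executable; replace with your own path.
CLOUDY="${CLOUDY:-/path/to/cloudy.exe}"

# run_models N [PREFIX]
# Run every PREFIX*.in file in the current directory, at most N at a time.
# Each model is started as "$CLOUDY -r name", where name is the file name
# without the trailing ".in".
run_models() {
    ncores="$1"
    prefix="${2:-}"
    for f in "$prefix"*.in; do
        [ -e "$f" ] || continue          # skip the pattern itself if nothing matches
        printf '%s\n' "${f%.in}"         # strip the ".in" suffix
    done | xargs -P "$ncores" -I{} "$CLOUDY" -r {}
}

# Usage (hypothetical):
#   run_models 4            # all *.in files, 4 at a time
#   run_models 4 model1     # only model1*.in
```

File names containing quotes or backslashes would need extra care with xargs; simple names such as dog.in or model1a.in are fine.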

Using GNU parallel

Jane Rigby describes how to use GNU parallel to run many models in parallel in this post on the Yahoo group.
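As a rough illustration of the idea (not necessarily the recipe given in that post), GNU parallel can start one Cloudy run per input script. The sketch below assumes the "cloudy.exe -r name" calling convention, and the CLOUDY variable, its placeholder path, and the cloudy_parallel function are names invented for this example.

```shell
# Assumed location of the Cloudy executable; replace with your own path.
CLOUDY="${CLOUDY:-/path/to/cloudy.exe}"

# cloudy_parallel NJOBS
# Run every *.in file in the current directory, NJOBS models at a time.
# GNU parallel replaces {.} with each argument minus its extension,
# so dog.in is run as "cloudy.exe -r dog".
cloudy_parallel() {
    njobs="$1"
    parallel -j "$njobs" "$CLOUDY" -r {.} ::: *.in
}

# Usage (hypothetical):
#   cloudy_parallel 4
```

Note that some Linux distributions ship a different program that is also called parallel (from moreutils), and it does not accept this syntax.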


The optimize and grid commands on MPI clusters

The makefile method described above will run a number of different models and create a number of output files. The next method describes how to build and run Cloudy to run grids or optimizations using MPI. This method is limited to the grid and optimize commands. The resulting output will have the series of models concatenated into single files.

Building on an MPI system

First you need to make sure that you have MPI installed on your computer. You will need MPI version 2 or newer to run Cloudy. On Linux machines you will typically have packages for MPICH2, LAM/MPI, and/or Open MPI (the latter is a further development of LAM/MPI, which is now in maintenance-only mode). All of these support MPI-2.

The next step is to make sure that your account is aware of the MPI installation. On smaller systems this may involve the mpi-selector command, but this will depend on how your computer manager set up the system. The command mpi-selector --list will list the available choices, and you can select the MPI version of your choice with mpi-selector --set <name>. On HPC machines and clusters you may need to issue a module load command to make MPI visible. Several versions of MPI may be available. The command module avail should give a full list of all the available modules. The command module list will give a list of all the modules that are already loaded. When in doubt, contact your system administrator or helpdesk.

Next, build Cloudy in one of the MPI directories. These are under the source directory and support GNU gcc (sys_mpi_gcc) and Intel icc (sys_mpi_icc). Your system manager will tell you whether to use the GNU or Intel compiler. Please also check the main compilation page for supported versions of g++ and icc. Most MPI distributions (but not all!) provide convenient wrapper scripts for the compiler, typically called mpiCC or mpicxx. The makefile will try to find the wrapper script, and will make a best-effort compilation if it cannot. If the compilation fails, please contact your sysadmin or helpdesk for further advice.
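Before building, it can help to confirm that the MPI tools are actually visible to your account. The following is a small sketch, assuming the wrapper is called mpicxx or mpiCC and the launcher mpirun, mpiexec, or orterun; your installation may use other names. The first_available function is a helper written for this example.

```shell
# first_available CMD...
# Print the first command from the argument list that is found on PATH;
# return non-zero if none of them is available.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    return 1
}

# Report which compiler wrapper and which launcher (if any) are on PATH.
wrapper=$(first_available mpicxx mpiCC) || echo "no MPI compiler wrapper found on PATH" >&2
launcher=$(first_available mpirun mpiexec orterun) || echo "no MPI launcher found on PATH" >&2
echo "compiler wrapper: ${wrapper:-none}"
echo "launcher: ${launcher:-none}"
```

If either line reports "none", the mpi-selector or module load step has probably not taken effect in the current shell.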

Running the code

On most systems the code should be executed with something like

mpirun -np 8 /path/to/cloudy/source/sys_mpi_gcc/cloudy.exe -r name

(the command may also be called mpiexec, orterun, etc.). The -np option specifies the number of ranks (cores), which is 8 in this example. For advice on how to choose the number of cores, consult your system manager; the best choice depends on the number of cores per node, the amount of memory per core, and so on.

Note that using the -r option (or the -p option) is mandatory. Normal input redirection will not work with Cloudy in MPI runs! In the example above, the code will read its commands from name.in and write the main output to name.out. If you use the -p option, the save output files will additionally go to name<extension> files, where <extension> is the string given in the save or punch commands.
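Because the input file is located through the -r name argument rather than through redirection, a small wrapper can check that name.in exists before the MPI job starts. This is only a sketch: the run_cloudy_mpi function and the MPIRUN, CLOUDY, and DRY_RUN variables are names invented here, and the default executable path is the placeholder from the example above.

```shell
# Assumed launcher and executable; adjust both for your system.
MPIRUN="${MPIRUN:-mpirun}"
CLOUDY="${CLOUDY:-/path/to/cloudy/source/sys_mpi_gcc/cloudy.exe}"

# run_cloudy_mpi NRANKS NAME
# Start Cloudy on NRANKS ranks, reading NAME.in and writing NAME.out.
# If DRY_RUN is set to a non-empty value, only print the command.
run_cloudy_mpi() {
    nranks="$1"
    name="$2"
    if [ ! -r "$name.in" ]; then
        echo "cannot read $name.in" >&2
        return 1
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$MPIRUN -np $nranks $CLOUDY -r $name"
    else
        "$MPIRUN" -np "$nranks" "$CLOUDY" -r "$name"
    fi
}

# Usage (hypothetical):
#   DRY_RUN=1; run_cloudy_mpi 8 name    # show the command without running it
```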

The optimize and grid commands

Two Cloudy commands can take advantage of the MPI environment. They run Cloudy as an "embarrassingly parallel" application, putting one model on each rank.

The optimize command

The optimizer is described in a chapter of Hazy 1. It lets you specify an observed spectrum (and several other observables) and ask the code to reproduce it. A number of parameters can be varied to obtain the best fit to this spectrum.

The result of this run will be a single "best" model, the one that comes closest to reproducing the observations.

The optimizer cannot use more ranks than two times the number of free parameters p, so for optimal performance you should choose the number of ranks close to 2*p/n (with n an arbitrary number >= 1). Using more than 2*p ranks is pointless, unless you need the extra memory.
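The rule of thumb above can be turned into a small calculation. The sketch below takes n to be a whole number and rounds 2*p/n up, which is one reading of "close to 2*p/n" rather than something the optimizer requires. It returns the largest such value that fits within the ranks you have available. The suggest_ranks function is a helper written for this example.

```shell
# suggest_ranks P MAX_RANKS
# P is the number of free parameters, MAX_RANKS the number of ranks available.
# Prints the largest value of ceil(2*P/n), n = 1, 2, 3, ..., that does not
# exceed MAX_RANKS.
suggest_ranks() {
    p="$1"
    max="$2"
    if [ "$p" -lt 1 ] || [ "$max" -lt 1 ]; then
        echo "both arguments must be at least 1" >&2
        return 1
    fi
    total=$((2 * p))
    n=1
    while :; do
        ranks=$(( (total + n - 1) / n ))    # integer ceiling of 2*p/n
        if [ "$ranks" -le "$max" ]; then
            echo "$ranks"
            return 0
        fi
        n=$((n + 1))
    done
}

# Usage:
#   suggest_ranks 5 8     # prints 5  (2*p = 10, n = 2)
#   suggest_ranks 3 8     # prints 6  (2*p = 6,  n = 1)
```

With 5 free parameters and 8 cores, for example, 5 ranks would be suggested: under this reading, 8 ranks would leave some of them idle part of the time without obviously finishing sooner.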

The grid command

The grid command, described in a chapter of Hazy 1, makes it possible to vary input parameters to create large grids of calculations. Several parameters can be varied, and the result of the calculation will be predictions for each of the grid points.

Output with these commands

Predictions are usually saved with one of the save commands described in Hazy 1. When run under MPI, the predictions will be brought together into large files which contain the grid points in the same order they would have had in a serial run (unless you specify the keyword separate, in which case the output from each grid point will be saved in a separate file).

There are two other useful options to consider.

The save grid command will save the parameters for each model in the grid. It will also help identify failed grid points and separated save output. Make a habit of always including this in grid runs.

The no hash option will prevent a hash string from separating different grid points.


Next step, CompileStars, is to compile the optional stellar atmospheres

Return to StepByStep instructions

Return to nublado.org

Last modified on 2016-12-26T12:16:55Z