mpirun(1)
NAME
mpirun - Run MPI programs
DESCRIPTION
"mpirun" is a shell script that attempts to hide from the user the
differences in starting jobs for various devices. mpirun attempts to
determine what kind of machine it is running on and to start the
required number of jobs on that machine. On workstation clusters, if
you are not using Chameleon, you must either supply a default file that
lists the machines mpirun can use to run remote jobs, or specify such a
file with the -machinefile option every time you run mpirun. The
default file is util/machines/machines.<arch>.

mpirun is typically invoked like this:

    mpirun -np <number of processes> <program name and arguments>
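
As an illustration of the machine file and the typical invocation, the following sketch writes a small machine file and prints the matching command line. It assumes the file holds one host name per line; the host names, the file location, and the program ./a.out are placeholders that do not come from this page. The command is printed rather than executed so the sketch can be tried without an MPI installation.

```shell
#!/bin/sh
# Hypothetical host names; replace them with machines you can reach.
machinefile="${TMPDIR:-/tmp}/machines.example"
cat > "$machinefile" <<'EOF'
node01
node02
node03
EOF

# Build the command line described above. ./a.out is a placeholder program.
cmd="mpirun -machinefile $machinefile -np 3 ./a.out"

# Print the command instead of running it.
echo "$cmd"
```

To run the job for real, execute the printed command on a system where mpirun is installed.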
If mpirun cannot determine what kind of machine you are on, and the
machine is supported by the MPI implementation, you can use the
-machine and -arch options to tell it what kind of machine you are
running on. The current valid values for machine are

    chameleon  (including chameleon/pvm, chameleon/p4, etc.)
    meiko      (the meiko device on the Meiko)
    paragon    (the ch_nx device on a Paragon not running NQS)
    p4         (the ch_p4 device on a workstation cluster)
    ibmspx     (ch_eui for IBM SP2)
    anlspx     (ch_eui for ANL's SPx)
    ksr        (ch_p4 for KSR 1 and 2)
    sgi_mp     (ch_shmem for SGI multiprocessors)
    cray_t3d   (t3d for Cray T3D)
    smp        (ch_shmem for SMPs)
    execer     (a custom script for starting ch_p4 programs
               without using a procgroup file; this script
               currently does not work well with interactive
               jobs)

You should only have to specify mr_arch if mpirun does not recognize
your machine, the default value is wrong, or you are using the p4 or
execer devices. The full list of options is given below.
PARAMETERS
The options for mpirun must come before the program you want to run
and must be spelled out completely (no abbreviations). Unrecognized
options are silently ignored.

    mpirun [mpirun_options...] <progname> [options...]

-arch <architecture>
    Specify the architecture (there must be a matching machines.<arch>
    file in ${MPIR_HOME}/util/machines) if using the execer.
-h
    This help.
-machine <machine name>
    Use the startup procedure for <machine name>.
-machinefile <machine-file name>
    Take the list of possible machines to run on from the file
    <machine-file name>.
-np <np>
    Specify the number of processors to run on.
-nolocal
    Do not run on the local machine (only works for p4 and ch_p4 jobs).
-stdin filename
    Use filename as the standard input for the program. This is needed
    for programs that must be run as batch jobs, such as on some IBM SP
    systems and Intel Paragons using NQS (see -paragontype below).
-t
    Testing: do not actually run, just print what would be executed.
-v
    Verbose: throw in some comments.
-dbx
    Start the first process under dbx where possible.
-gdb
    Start the first process under gdb where possible. (On the Meiko,
    selecting either -dbx or -gdb starts prun under totalview instead.)
-xxgdb
    Start the first process under xxgdb where possible (-xdbx does not
    work).
-tv
    Start under totalview.
SPECIAL OPTIONS FOR NEC - CENJU-3
-batch
    Execute the program as a batch job (using cjbr).
-stdout filename
    Use filename as the standard output for the program.
-stderr filename
    Use filename as the standard error for the program.
SPECIAL OPTIONS FOR NEXUS DEVICE
-nexuspg filename
    Use the given Nexus startup file instead of creating one.
    Overrides -np and -nolocal, selects -leave_pg.
-nexusdb filename
    Use the given Nexus resource database.
SPECIAL OPTIONS FOR WORKSTATION CLUSTERS
-e
    Use execer to start the program on workstation clusters.
-pg
    Use a procgroup file to start the p4 programs, not execer
    (default).
-leave_pg
    Do not delete the p4 procgroup file after running.
-p4pg filename
    Use the given p4 procgroup file instead of creating one.
    Overrides -np and -nolocal, selects -leave_pg.
-tcppg filename
    Use the given tcp procgroup file instead of creating one.
    Overrides -np and -nolocal, selects -leave_pg.
-p4ssport num
    Use the p4 secure server with port number num to start the
    programs. If num is 0, use the value of the environment variable
    MPI_P4SSPORT. Using the server can speed up process startup. If
    both MPI_USEP4SSPORT and MPI_P4SSPORT are set, the effect is the
    same as giving mpirun the option -p4ssport 0.
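
The interaction between -p4ssport 0 and MPI_P4SSPORT can be sketched as follows. The port number 1234 and the program ./a.out are placeholders chosen for illustration only, and the command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical port number; use the port your p4 secure server listens on.
MPI_P4SSPORT=1234
export MPI_P4SSPORT

# With -p4ssport 0, mpirun is documented to take the port from MPI_P4SSPORT.
# ./a.out is a placeholder program; the command is printed, not run.
cmd="mpirun -p4ssport 0 -np 4 ./a.out"
echo "$cmd"
```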
SPECIAL OPTIONS FOR BATCH ENVIRONMENTS
-mvhome
    Move the executable to the home directory. This is needed when
    not all file systems are cross-mounted. Currently only used by
    anlspx.
-mvback files
    Move the indicated files back to the current directory. Needed
    only when using -mvhome; has no effect otherwise.
-maxtime min
    Maximum job run time in minutes. Currently used only by anlspx.
    The default value is 15 minutes.
-nopoll
    Do not use polling-mode communication. Available only on IBM SPx.
-mem value
    The per-node memory request (in Mbytes). Needed for some CM-5s.
-cpu time
    The hard CPU limit, in minutes, used for some CM-5s.
SPECIAL OPTIONS FOR IBM SP2
-cac name
    CAC for the ANL scheduler. Currently used only by anlspx. If not
    provided, mpirun will choose some valid CAC.
SPECIAL OPTIONS FOR INTEL PARAGON
-paragontype name
    Selects one of default, mkpart, or NQS, depending on how you want
    to submit jobs to a Paragon.
-paragonname name
    Use a remote shell to name to run the job (using the -sz method)
    on a Paragon.
-paragonpn name
    Name of the partition to run on in a Paragon (using the -pn name
    command-line argument).
RETURN VALUE
On exit, mpirun returns a status of zero unless mpirun detected a
problem, in which case it returns a non-zero status (currently, all
are one, but this may change in the future).
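
Because the failure status may stop being exactly one, a calling script is safer testing for "non-zero" than for "one". The sketch below shows one way to do that; run_checked is a hypothetical helper name, and ./a.out in the usage comment is a placeholder program.

```shell
#!/bin/sh
# run_checked CMD [ARGS...]: run CMD and report any non-zero exit status.
# run_checked is a hypothetical helper, not part of mpirun.
run_checked() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        # Test for "not zero" rather than "equal to one", since the exact
        # failure value may change.
        echo "command failed with status $status: $*" >&2
        return "$status"
    fi
    return 0
}

# Intended use (./a.out is a placeholder program):
#   run_checked mpirun -np 4 ./a.out || exit 1
```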
SPECIFYING HETEROGENEOUS SYSTEMS
Multiple architectures may be handled by giving multiple -arch and -np
arguments. For example, to run a program on 2 sun4s and 3 rs6000s,
with the local machine being a sun4, use

    mpirun -arch sun4 -np 2 -arch rs6000 -np 3 program

This assumes that program will run on both architectures. If different
executables are needed (as in this case), use the string %a in the
program name; it will be replaced with the arch name. For example, if
the programs are program.sun4 and program.rs6000, then the command is

    mpirun -arch sun4 -np 2 -arch rs6000 -np 3 program.%a

If instead the executables are in different directories, for example
/tmp/me/sun4 and /tmp/me/rs6000, then the command is

    mpirun -arch sun4 -np 2 -arch rs6000 -np 3 /tmp/me/%a/program

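
The effect of %a can be imitated with a small shell function. This is only a sketch of the substitution described above, not code taken from the mpirun script; expand_arch is a hypothetical name.

```shell
#!/bin/sh
# expand_arch TEMPLATE ARCH: print TEMPLATE with every %a replaced by ARCH.
# This only imitates the documented substitution; it is not mpirun's code.
expand_arch() {
    printf '%s\n' "$1" | sed "s|%a|$2|g"
}

expand_arch 'program.%a' sun4           # prints program.sun4
expand_arch '/tmp/me/%a/program' rs6000 # prints /tmp/me/rs6000/program
```
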
It is important to specify the architecture with -arch before
specifying the number of processors. Also, the first -arch option must
refer to the processor on which the job will be started. Specifically,
if -nolocal is not specified, then the first -arch must refer to the
processor from which mpirun is running.

(You must have machines.<arch> files for each arch that you use in the
util/machines directory.)
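
One way to keep each -arch ahead of its -np is to generate the argument list from architecture and count pairs. The helper below is a hypothetical sketch (hetero_cmd is not part of MPICH); it prints the command line instead of running it.

```shell
#!/bin/sh
# hetero_cmd PROGRAM ARCH:NP...: print an mpirun command line in which each
# -arch precedes its -np. hetero_cmd is a hypothetical helper.
hetero_cmd() {
    prog=$1
    shift
    cmd="mpirun"
    for pair in "$@"; do
        arch=${pair%%:*}   # text before the colon
        np=${pair#*:}      # text after the colon
        cmd="$cmd -arch $arch -np $np"
    done
    printf '%s %s\n' "$cmd" "$prog"
}

# The first pair should name the architecture of the machine running mpirun.
hetero_cmd 'program.%a' sun4:2 rs6000:3
```
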
Another approach that may be used with the ch_p4 device is to create a
procgroup file directly. See the MPICH Users Guide for more
information.
LOCATION
/home/MPI/mansrc/commands
7/26/2004