HBOOT(1)
NAME
hboot - Start LAM on the local node.
SYNTAX
hboot [-dhstvNV] [-c <conf>] [-I <inet_topo>] [-R <rtr_topo>]
OPTIONS
-d Turn on debugging. This implies -v.
-h Print the command help menu.
-s Close stdio of child processes.
-t Terminate (tkill(1)) any previous LAM session before starting.
-v Be verbose.
-N Go through the motions but do not actually take any action.
-V Format and print the process schema.
-c <conf> Use <conf> as the process schema.
-I <inet_topo> Set the $inet_topo variable in the process schema.
-R <rtr_topo> Set the $rtr_topo variable in the process schema.
DESCRIPTION
Most MPI users will probably not need to use the hboot command; see
lamboot(1).
The hboot tool is a generic utility that starts multiple processes on the local node, based on information in a process
schema. It is not restricted to starting LAM. It is part of the
startup sequence performed by lamboot(1).
A process schema is a description of the processes which constitute the
operating system on a given node. Naturally, the process schema used
by hboot should be the one that describes LAM on a node. The grammar
of the process schema is described in conf(5).
When starting LAM on a remote machine using rsh(1), the open file descriptors of the processes started by hboot must be closed in order for
rsh(1) to exit. This is done by using the -s option. The -t option
can be used to force a tkill(1) on the machine before attempting to
start LAM. This feature is used by lamboot(1) to handle the case where
a user might start a machine a second time without using lamwipe(1) to
terminate the previous LAM session.
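As a minimal sketch of the remote case described above, the helper below only prints the command line that would boot a remote node; it does not run it. The function name remote_hboot_cmd and the host name node1 are hypothetical, and the exact command that lamboot(1) issues is not shown in this page, so this is an illustration rather than a copy of its behaviour.

```shell
# Hypothetical helper: print (do not execute) an rsh command that would
# start LAM on a remote host.
#   -s  close stdio of the child processes so that rsh(1) can exit
#   -t  run tkill(1) first, in case a previous LAM session is still present
remote_hboot_cmd() {
    host=$1
    printf 'rsh %s hboot -s -t\n' "$host"
}

# Example: show the command for a host called node1 (placeholder name).
remote_hboot_cmd node1
```

Printing the command first makes it easy to inspect before running it by hand.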
The -I and -R options set their respective variables to the given values. The $inet_topo variable is typically used by the LAM Internet
datalinks that communicate with other nodes. The $rtr_topo variable is
passed to the LAM router that handles network and topology information.
The variables can also be set in the process schema file (see conf(5))
but their values are overridden by the command line options.
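The following sketch shows how the -I and -R options might be combined with -c in one command line. The function name hboot_topo_cmd and all three argument values are placeholders, not real topology settings, and the helper only prints the command. The -N and -v options are included so that, if the printed command were run, hboot would report its steps without taking any action.

```shell
# Hypothetical helper: print an hboot dry-run command that overrides the
# $inet_topo and $rtr_topo values found in the process schema.
#   $1  process schema file (placeholder)
#   $2  value for $inet_topo (placeholder)
#   $3  value for $rtr_topo (placeholder)
hboot_topo_cmd() {
    printf 'hboot -N -v -c %s -I %s -R %s\n' "$1" "$2" "$3"
}

# Example with placeholder values only.
hboot_topo_cmd myconfig INET_TOPO_VALUE RTR_TOPO_VALUE
```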
When LAM is started, the kernel records all processes that attach to
it, including all the processes in the process schema. It is the job
of tkill(1) to use this information to remove these processes from the
node.
EXAMPLES
hboot -v
Start LAM on the local node with the default process schema. Report about every step as it is done.
hboot -c myconfig
Boot the local node with the custom process schema, myconfig.
FILES
laminstalldir/etc/lam-conf.lamd
Default node process schema, where "laminstalldir" is the directory where LAM/MPI was installed.
laminstalldir/etc/lam7.1.2helpfile
Default location for help file for diagnostic messages that hboot may generate.
/tmp/lam-$USER@<hostname>
Kill file for the LAM session on machine <hostname>, where $USER is the userid.
DIAGNOSTICS
Using ps(1) after hboot will display, among others, the LAM processes
that have been started. They may be killed one by one with kill(1), or
all at once by killing the LAM kernel process with a HUP signal. The
preferred method is to use the LAM tool tkill(1), which should kill them
all at once and also remove the kill file. New users should make liberal use of ps(1) to gain confidence that the system is working properly. In a disaster, ps(1) and kill(1) are your only hope of recovery.
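As a small aid for the checks described above, the sketch below builds the kill file path from the pattern given under FILES and reports whether such a file exists. The function name lam_killfile is hypothetical, and using the output of uname -n for <hostname> is an assumption; the host name form that LAM actually uses may differ.

```shell
# Hypothetical helper: build the kill file path /tmp/lam-$USER@<hostname>.
#   $1  userid
#   $2  hostname
lam_killfile() {
    printf '/tmp/lam-%s@%s\n' "$1" "$2"
}

# Assumption: uname -n yields the same host name that LAM uses.
killfile=$(lam_killfile "${USER:-$(id -un)}" "$(uname -n)")

if [ -e "$killfile" ]; then
    echo "kill file present: $killfile"
else
    echo "no kill file at: $killfile"
fi
```

A kill file that remains after tkill(1) has run suggests the session was not cleaned up completely, in which case ps(1) can be used to look for leftover LAM processes.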