SGE_SHADOWD(8)
NAME
sge_shadowd - Sun Grid Engine shadow master daemon
SYNOPSIS
sge_shadowd
DESCRIPTION
sge_shadowd is a lightweight process which can be run on so-called
shadow master hosts in a Sun Grid Engine cluster to detect failure of
the current Sun Grid Engine master daemon, sge_qmaster(8), and to
start up a new sge_qmaster(8) on the host on which the sge_shadowd
runs. If multiple shadow daemons are active in a cluster, they run a
protocol which ensures that only one of them will start up a new master
daemon.
The hosts suitable for being used as shadow master hosts must have
shared root read/write access to the directory
$SGE_ROOT/$SGE_CELL/common as well as to the master daemon spool
directory (by default $SGE_ROOT/$SGE_CELL/spool/qmaster). The names of
the shadow master hosts need to be contained in the file
$SGE_ROOT/$SGE_CELL/common/shadow_masters.
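As a rough illustration of the shadow_masters requirement, the following Python sketch builds the file path from SGE_ROOT and SGE_CELL and checks whether a host name is listed. It assumes the file holds one host name per line and that names compare case-insensitively; neither detail is stated here, and the helper names are hypothetical.

```python
import os
import tempfile


def shadow_masters_path(sge_root, sge_cell="default"):
    """Return $SGE_ROOT/$SGE_CELL/common/shadow_masters."""
    return os.path.join(sge_root, sge_cell, "common", "shadow_masters")


def read_shadow_masters(path):
    """Read host names, assuming one name per line (an assumption)."""
    with open(path, encoding="utf-8") as fh:
        return [line.strip() for line in fh if line.strip()]


def is_shadow_master(host, hosts):
    """Case-insensitive membership test (hypothetical comparison rule)."""
    return host.lower() in {h.lower() for h in hosts}


if __name__ == "__main__":
    # Demonstrate against a throwaway directory rather than a real cell.
    with tempfile.TemporaryDirectory() as root:
        path = shadow_masters_path(root)
        os.makedirs(os.path.dirname(path))
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("master1\nshadow1\n")
        hosts = read_shadow_masters(path)
        print(hosts, is_shadow_master("SHADOW1", hosts))
```

A check like this could be run on a candidate host before starting sge_shadowd there, to catch a missing entry early.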
RESTRICTIONS
sge_shadowd may only be started by root.
ENVIRONMENT VARIABLES
- SGE_ROOT
- Specifies the location of the Sun Grid Engine standard configuration files.
- SGE_CELL
- If set, specifies the default Sun Grid Engine cell. To address a Sun Grid Engine cell sge_shadowd uses (in the order of precedence):
    The name of the cell specified in the environment variable SGE_CELL, if it is set.
    The name of the default cell, i.e. default.
- SGE_DEBUG_LEVEL
- If set, specifies that debug information should be written to stderr. In addition, the level of detail in which debug information is generated is defined.
- SGE_QMASTER_PORT
- If set, specifies the tcp port on which sge_qmaster(8) is expected to listen for communication requests. Most installations will use a services map entry for the service "sge_qmaster" instead to define that port.
- SGE_DELAY_TIME
- This variable controls how long sge_shadowd pauses if a takeover bid fails. This value is used only when there are multiple sge_shadowd instances and they are contending to be the master. The default is 600 seconds.
- SGE_CHECK_INTERVAL
- This variable controls the interval at which sge_shadowd checks the heartbeat file (60 seconds by default).
- SGE_GET_ACTIVE_INTERVAL
- This variable controls the interval after which a sge_shadowd instance tries to take over when the heartbeat file has not changed.
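The lookups described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the daemon's actual code: the function names are hypothetical, an empty variable is treated as unset, all intervals are assumed to be whole seconds, and no default is assumed for SGE_GET_ACTIVE_INTERVAL because none is given here.

```python
import os
import socket


def resolve_cell(env):
    """SGE_CELL if set, otherwise the default cell name 'default'."""
    return env.get("SGE_CELL") or "default"


def resolve_qmaster_port(env):
    """SGE_QMASTER_PORT if set, else the 'sge_qmaster' services entry."""
    value = env.get("SGE_QMASTER_PORT")
    if value:
        return int(value)
    try:
        return socket.getservbyname("sge_qmaster", "tcp")
    except OSError:
        return None  # no services entry for sge_qmaster on this host


def interval(env, name, default=None):
    """Read an interval in seconds, falling back to the given default."""
    value = env.get(name)
    return int(value) if value else default


def shadowd_settings(env=os.environ):
    """Collect the settings described in ENVIRONMENT VARIABLES."""
    return {
        "cell": resolve_cell(env),
        "qmaster_port": resolve_qmaster_port(env),
        "delay_time": interval(env, "SGE_DELAY_TIME", 600),
        "check_interval": interval(env, "SGE_CHECK_INTERVAL", 60),
        # No default is stated for this variable, so none is assumed.
        "get_active_interval": interval(env, "SGE_GET_ACTIVE_INTERVAL"),
    }


if __name__ == "__main__":
    print(shadowd_settings({"SGE_CELL": "prod", "SGE_QMASTER_PORT": "6444"}))
```

Passing a plain dictionary instead of os.environ makes it easy to see how each variable changes the result without touching the real environment.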
FILES
- <sge_root>/<cell>/common
- Default configuration directory.
- <sge_root>/<cell>/common/shadow_masters
- Shadow master hostname file.
- <sge_root>/<cell>/spool/qmaster
- Default master daemon spool directory.
- <sge_root>/<cell>/spool/qmaster/heartbeat
- The heartbeat file.
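To make the role of the heartbeat file more concrete, here is a minimal Python sketch of detecting that the file "has not changed" between periodic checks. The file's format and the daemon's real takeover logic are not described here, so the sketch only compares raw contents across polls; the function names and the `checks` parameter are hypothetical.

```python
import os
import tempfile
import time


def read_heartbeat(path):
    """Return the raw heartbeat file contents, or None if unreadable."""
    try:
        with open(path, "rb") as fh:
            return fh.read()
    except OSError:
        return None


def heartbeat_stalled(path, check_interval, checks, sleep=time.sleep):
    """Return True if the file is unchanged across `checks` polls.

    An unreadable file on every poll also counts as stalled. This is a
    simplification, not the daemon's documented behaviour.
    """
    last = read_heartbeat(path)
    for _ in range(checks):
        sleep(check_interval)
        if read_heartbeat(path) != last:
            return False  # the file changed, so the master looks alive
    return True


if __name__ == "__main__":
    # Use a throwaway file; nothing updates it, so it appears stalled.
    with tempfile.TemporaryDirectory() as d:
        hb = os.path.join(d, "heartbeat")
        with open(hb, "w", encoding="utf-8") as fh:
            fh.write("1\n")
        print(heartbeat_stalled(hb, check_interval=0.1, checks=3))
```

In a real deployment the poll interval would correspond to SGE_CHECK_INTERVAL; the `sleep` argument is injectable only so the logic can be exercised without waiting.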
SEE ALSO
sge_intro(1), sge_conf(5), sge_qmaster(8), Sun Grid Engine Installation and Administration Guide.
COPYRIGHT
See sge_intro(1) for a full statement of rights and permissions.