admin(3)
NAME
HTML::Mason::Admin - Mason Administrator's Guide
DESCRIPTION
This guide is intended for the sys admin/web master in
charge of installing, configuring, or tuning a Mason sys
tem.
PIECES OF AN INSTALLATION
This section discusses the various files and directories
that play a part in Mason's configuration.
Config.pm
Config.pm contains global configuration options for Mason.
Makefile.PL will initially create the file based on your
environment, placing it in the "lib/HTML/Mason" subdirec
tory of the distribution. After that, you can edit it by
hand, following the comments inside.
"make install" copies Config.pm to your Perl library
directory (e.g. "/usr/lib/perl5/site_perl/HTML/Mason")
along with the other module files. This allows Mason
internally to grab the configuration data with ""use
HTML::Mason::Config"".
When upgrading from a previous version, "make install"
will maintain the previous Config.pm values.
Currently this file controls:
o Whether or not certain optional modules, such as
Time::HiRes, should be loaded for enhanced features.
o The type of DBM and the serialization method used for
Mason's data caching. If you plan to use data caching,
make sure that the DBM package is a good-quality one
(DB_File or GDBM_File).
httpd.conf, srm.conf
- Directives must be added to Apache's configuration files
to specify which requests should be handled through Mason,
and the handler used for those requests. As described in
HTML::Mason, a simple configuration looks like: - DocumentRoot /usr/local/www/htdocs
PerlRequire /usr/local/mason/handler.pl
DefaultType text/html
<Location />SetHandler perl-script
PerlHandler HTML::Mason - </Location>
- handler.pl
- This file contains startup code that initializes the par
ent Apache process. It also defines the handler used by
each child process to field Mason requests. See the synop
sis in HTML::Mason for a simple example. The next section discusses in detail how to configure this file.
CONFIGURING THE HANDLER SCRIPT
Handler.pl is the most important file in your Mason con
figuration. It is responsible for creating the three
Mason objects and supplying the many parameters that con
trol how your components are parsed and executed. It also
provides the opportunity to execute arbitrary code at
three important junctures: the server initialization, the
beginning of a request, and the end of a request. A wide
set of behaviors can be implemented with a mere few lines
of well-placed Perl in your handler.pl. In this section
we present the basics of setting up handler.pl as well as
some ideas for more advanced applications.
Creating the Mason objects
"handler.pl" creates three Mason objects: the Parser,
Interpreter, and ApacheHandler. The Parser compiles compo
nents into Perl subroutines; the Interpreter executes
those compiled components; and the Apache handler routes
mod_perl requests to Mason. These objects are created
once in the parent httpd and then copied to each child
process.
These objects have a fair number of possible parameters.
Only two of them are required, comp_root and data_dir;
these are discussed in the next two subsections. The vari
ous parameters are documented in the individual reference
manuals for each object: HTML::Mason::Parser,
HTML::Mason::Interp, and HTML::Mason::ApacheHandler.
- The advantage of embedding these parameters in objects is
that advanced configurations can create more than one set
of objects, choosing which set to use at request time.
For example, suppose you have a staging site and a produc
tion site running on the same web server, distinguishing
between them with a configuration variable called "ver
sion": - # Create Mason objects for staging site
my $parser1 = new HTML::Mason::Parser;
my $interp1 = new HTML::Mason::Interp (parser=>$pars - er1, ...);
my $ah1 = new HTML::Mason::ApacheHandler (interp=>$in - terp1);
- # Create Mason objects for production site
my $parser2 = new HTML::Mason::Parser;
my $interp2 = new HTML::Mason::Interp (parser=>$pars - er2, ...);
my $ah2 = new HTML::Mason::ApacheHandler (interp=>$in - terp2);
- sub handler {
...# Choose the right ApacheHandler
if ($r->dir_config('version') eq ' staging') {$ah1->handle_request($r);} else {$ah2->handle_request($r);} - }
- Component root
- Given a tree of component source files, the top of the
tree is called the component root and is set via the
"comp_root" parameter. In simple Mason configurations the
component root is the same as the server's DocumentRoot.
More complex configurations may specify several different
document roots under a single component root. - When Mason handles a request, the request filename
("$r->filename") must be underneath your component root -that way Mason has a legitimate component to start with.
If the filename is not under the component root, Mason
will place a warning in the error logs and return a 404.
Unfortunately if your component root or document root goes
through a soft link, Mason will have trouble comparing the
paths and will return 404. To fix this, set your document
root to the true path. - Component roots (multiple)
- If you are just starting out it is probably safe to skip
this section initially. - Starting in Mason 0.8 it is now possible to specify multi
ple component roots to be searched in the spirit of Perl's
"@INC". To do so you must specify a list of lists:
comp_root => [[key1, root1], [key2, root2], ...]- Each pair consists of a key and root. The key is a string
that identifies the root mnemonically to a component
developer. Keys are case-insensitive and must be distinct. - For example:
comp_root => [[private=>'/usr/home/joe/comps'],- [main=>'/usr/local/www/htdocs']]
- This specifies two component roots, a main component tree
and a private tree which overrides certain components.
The order is respected ala "@INC", so private is searched first and main second. (I chose the "=>" notation here
because it looks cleaner, but note that this is a list of
lists, not a hash.) - The key has several purposes. Object and data cache file
names use the (uppercased) key to make sure different com
ponents sharing the same path have different cache and
object files. For example, if a component /foo/bar is
found in 'private', then the object file will be
<data_dir>/obj/PRIVATE/foo/bar- and the cache file
<data_dir>/cache/PRIVATE+2ffoo+2fbar- The key is also included whenever Mason prints the compo
nent title, as in an error message:
error while executing /foo/bar [private]:
...- This lets you know which version of the component was run
ning. - Data directory
- The data directory is where Mason keeps various files to
help implement caching, debugging, etc. You specify a sin
gle data directory via the "data_dir" parameter and Mason
creates subdirectories underneath it as needed:
cache: data cache files
debug: debug files
etc: miscellaneous files
obj: compiled components
preview: preview settings files- These directories will be discussed in appropriate sec
tions throughout this manual. - External modules
- Components will often need access to external Perl mod
ules. Any such modules that export symbols should be
listed in handler.pl, rather than the standard practice of
using a PerlModule configuration directive. This is
because components are executed inside the
HTML::Mason::Commands package, and can only access symbols exported to that package. Here's sample module list:
{ package HTML::Mason::Commands;use CGI ':standard';
use LWP::UserAgent;
... }- In any case, for optimal memory utilization, make sure all
Perl modules are used in the parent process, and not in
components. Otherwise, each child allocates its own copy
and you lose the benefit of shared memory between parent
processes and their children. See Vivek Khera's mod_perl
tuning FAQ (perl.apache.org/tuning) for details. - File ownership
- Unix web servers that run on privileged ports like 80
start with a root parent process, then spawn children run
ning as the 'User' and 'Group' specified in httpd.conf.
This difference leads to permission errors when child pro
cesses try to write files or directories created by the
parent process. - To work around this conflict, Mason remembers all directo
ries and files created at startup, returning them in
response to "interp->files_written". This list can be fed
to a chown() at the end of the startup code in "han
dler.pl":
chown (scalar(getpwnam "nobody"), scalar(getgrnam "nobody"),$interp->files_written); - Persistent user sessions
- With just a few lines in handler.pl you can make a global
hash (e.g. %session) available to all components contain ing persistent user session data. If you set a value in
the hash, you will see the change in future visits by the
same user. The key piece is Jeffrey Baker's Apache::Ses sion module, available from CPAN. - The file eg/session_handler.pl in the distribution con
tains the lines to activate cookie-based sessions using
Apache::Session and CGI::Cookie. You can use eg/ses_
sion_handler.pl as your new handler.pl base, or just copy out the appropriate pieces to your existing handler.pl. - The session code is customizable; you can change the user
ID location (e.g. URL instead of cookie), the user data
storage mechanism (e.g. DBI database), and the name of the
global hash. - Using global variables
- Global variables generally make programs harder to read,
maintain, and debug, and this is no less true for Mason.
Due to the persistent mod_perl environment, globals
require extra initialization and cleanup care. And the
highly modular nature of Mason pages does not mix well
with globals: it is no fun trying to track down which of
twenty components is stepping on your variable. With the
ability to pass parameters and declare lexical ("my")
variables in components, there is very little need for
globals at all. - That said, there are times when it is very useful to make
a value available to all Mason components: a DBI database
handler, a hash of user session information, the server
root for forming absolute URLs. Usually you initialize
the global in your handler.pl, either outside the han_
dler() subroutine (if you only need to set it once) or
inside (if you need to set it every request). - Mason by default parses components in "strict" mode, so
you can't simply start referring to a new global or you'll
get a fatal warning. The solution is to invoke "use vars"
inside the package that components execute in, by default
HTML::Mason::Commands:
{ package HTML::Mason::Commands;use vars qw($dbh %session);- }
- Alternatively you can use the Parser/allow_globals parame
ter or method:
my $parser = new HTML::Mason::Parser (..., allow_glob- als => [qw($dbh %session)]);
$parser->allow_globals(qw($foo @bar)) - The only advantage to "allow_globals" is that it will do
the right thing if you've chosen a different package for
components to run in (via the Parser/in_package Parser
parameter.) - Similarly, to initialize the variable in handler.pl, you
need to set it in the component package:
$HTML::Mason::Commands::dbh = DBI->connect(...);- Alternatively you can use the Interp/set_global Interp
method:
$interp->set_global(dbh => DBI->connect(...));- Again, "set_global" will do the right thing if you've cho
sen a different package for components. - Now when referring to these globals inside components, you
can use the plain variable name:
$dbh->prepare...- Declining image requests
- Mason should be prevented from serving images, tarballs,
and other binary files as regular components. Such a file
may inadvertently contain a Mason character sequence such
as "<%", causing an error. - There are several ways to restrict which file types are
handled by Mason. One way is with a line at the top of
handler(), e.g.:
return -1 if $r->content_type && $r->content_type !~- m|^text/|i;
- This line allows text/html and text/plain to pass through
but not much else. It is included (commented out) in the
default handler.pl. - Another way is specifying a filename pattern in the Apache
configuration, e.g.:
<FilesMatch "(.html|.txt|^[^.]+)$>
SetHandler perl-script
PerlHandler HTML::Mason
</FilesMatch>- This directs Mason to handle only files with .html, .txt,
or no extension. - Securing top-level components
- Users may exploit a server-side scripting environment by
invoking scripts with malicious or unintended arguments.
Mason administrators need to be particularly wary of this
because of the tendency to break out "subroutines" into
individually accessible file components. - For example, a Mason developer might create a helpful
shared component for performing sql queries:
$m->comp('sql_select', table=>'employee',- where=>'id=315');
- This is a perfectly reasonable component to create and
call internally, but clearly presents a security risk if
accessible via URL:
http://www.foo.com/sql_select?table=cred- it_cards&where=*
- Of course a web user would have to obtain the name of this
component through guesswork or other means, but obscurity
alone does not properly secure a system. Rather, you
should choose a site-wide policy for distinguishing toplevel components from private components, and make sure
your developers stick to this policy. You can then prevent
private components from being served. - One solution is to place all private components inside a
directory, say /private, that lies under the component
root but outside the document root. - Another solution is to decide on a naming convention, for
example, that all private components begin with "_", or
that all top-level components must end in ".html". Then
turn all private requests away with a 404 NOT_FOUND
(rather than, say, a 403 FORBIDDEN which would provide
more information than necessary). Use either an Apache
directive
<FilesMatch "^_">
SetHandler perl-script
PerlHandler "sub { return 404 }"
</FilesMatch>- or a handler.pl directive:
return 404 if $r->filename =~ m{_[^/]+$};- Even after you've safely protected internal components,
top-level components that process arguments (such as form
handlers) still present a risk. Users can invoke such a
component with arbitrary argument values via a handcrafted
query string. Always check incoming arguments for validity
and never place argument values directly into SQL, shell
commands, etc. Unfortunately, Mason does not yet work with
with Perl's taint checking which would help ensure these
principles. - Allowing directory requests
- By default Mason will decline requests for directories,
leaving Apache to serve up a directory index or a FORBID
DEN as appropriate. Unfortunately this rule applies even
if there is a dhandler in the directory: /foo/bar/dhandler
does not get a chance to handle a request for /foo/bar. - If you would like Mason to handle directory requests, do
the following: - 1. Set the ApacheHandler/decline_dirs ApacheHandler param
eter to 0. - 2. If your handler.pl contains the standard "return -1"
line to decline non-text requests (as given in the previ
ous section), add a clause allowing directory types:
return -1 if $r->content_type && $r->content_type !~- m|^text/|i
&& $r->content_type !~ m|directory$|i; - The dhandler that catches a directory request is responsi
ble for setting a reasonable content type.
STANDARD FEATURES
This section explains how standard Mason features work and
how to administer them.
Data caching
- Setup
- Cache files are implemented using MLDBM, an interface
for storing persistent multi-level data structures.
MLDBM, in turn, uses one of several DBM packages (DB_File, GDBM) and one of several serialization mech anisms (Data::Dumper, FreezeThaw or Storable). Mason's Config.pm controls which packages are used. - The most important task is selecting a good DBM pack
age. Most standard DBM packages (SDBM, ODBM, NDBM) are unsuitable for data caching due to significant
limitations on the size of keys and values. Perl only
comes with SDBM, so you'll need to obtain a good-qual ity package if you haven't already. At this time the
best options are Berkeley DB (DB_File) version 2.x, available at www.sleepycat.com, and GNU's gdbm (GDBM), available at GNU mirror sites everywhere. Stay away
from Berkeley DB version 1.x on Linux which has a
serious memory leak (and is unfortunately preinstalled on many distributions). - As far as the serialization methods, all of them
should work fine. Data::Dumper is probably simplest: it comes with the latest versions of Perl, is required
by Mason anyway, and produces readable output (possi
bly useful for debugging cache files). On the other
hand Storable is significantly faster than the other options according to the MLDBM documentation. - Data caching will not work on systems lacking flock(), such as Windows 95 and 98.
- Administration
Once set up, data caching requires little administra
tion. When a component calls "$m->cache" or
"$m->cache_self" for the first time, Mason automati
cally creates a new cache file under "data_dir/cache".
The name of the file is determined by encoding the
path as follows:
s/([^792 - like URL encoding with a '+' escape character. For
example, the cache file for component "/foo/bar" is
"data_dir/cache/foo+2fbar". - Currently Mason never deletes cache files, not even
when the associated component file is modified. (This
may change in the near future.) Thus cache files hang
around and grow indefinitely. You may want to use a
cron job or similar mechanism to delete cache files
that get too large or too old. For example:
# Shoot cache files more than 30 days old
foreach (<data_dir/cache>) { # path to cache - directory
unlink $_ if (-M >= 30); - }
- In general you can feel free to delete cache files
periodically and without warning, because the data
cache mechanism is explicitly not guaranteed -- devel
opers are warned that cached data may disappear any
time and components must still function. - If some reason you want to disable data caching, spec
ify "use_data_cache"=>0 to the Interp object. This
will cause all "$m->cache" calls to return undef with
out doing anything. - Debugging
- A debug file is a Perl script that creates a fake Apache
request object ("$r") and calls the same PerlHandler that
Apache called. Debug files are created under
"data_dir/debug/<username>" for authenticated users, oth
erwise they are placed in "data_dir/debug/anon". Several
ApacheHandler parameters are required to activate and con
figure debug files: - debug_mode
The debug_mode parameter indicates which requests
should produce a debug file: "all", "none", or "error"
(only if a error occurs). - debug_perl_binary
The full path to your Perl binary -- e.g.
"/usr/bin/perl". This is used in the Unix "shebang"
line at the top of each debug file. - debug_handler_script
The full path to your "handler.pl" script. Debug files
invoke "handler.pl" just as Apache does as startup, to
load needed modules and create Mason objects. - debug_handler_proc
The name of the request handler defined in "han
dler.pl". This routine is called with the saved Apache
request object. - Here's a sample "ApacheHandler" constructor with all debug
options:
my $ah = new HTML::Mason::ApacheHandler (interp=>$in- terp,
debug_mode=>'all',
debug_perl_binary=>'/usr/local/bin/perl',
debug_handler_script=>'/usr/local/ma - son/eg/handler.pl',
debug_handler_proc=>'HTML::Mason::han - dler');
- When replaying a request through a debug file, the global
variable "$HTML::Mason::IN_DEBUG_FILE" will be set to 1.
This is useful if you want to omit certain flags (like
preloading) in handler.pl when running under debug. For
example:
my %extra_flags = ($HTML::Mason::IN_DEBUG_FILE) ? () :- (preloads=>[...]);
my $interp = new HTML::Mason::Interp (..., %ex - tra_flags);
- Previewer
- The previewer is a web based utility that allows site
developers to: - 1. View a site under a variety of simulated client condi
tions: browser, operating system, date, time of day,
referer, etc. - 2. View a debug trace of a page, showing the component
call tree and indicating which parts of the page are
generated by which components. - The web-based previewer interface (a single component,
actually) allows the developer to select a variety of
options such as time, browser, and display mode. The set
of these options together is called a previewer configura
tion. Configurations can be saved under one of several
preview ports. For more information on how the previewer
is used, see HTML::Mason::Devel. - Follow these steps to activate the Previewer:
- 1. Choose a set of preview ports, for example, 3001 to
3005. - 2. In httpd.conf, put a Listen in for each port. E.g.
Listen your.site.ip.address:3001
...
Listen your.site.ip.address:3005- You'll also probably want to restrict access to these
ports in your access.conf. If you have multiple site
developers, it is helpful to use username/password
access control, since the previewer will use the user
name to keep configurations separate. - 3. In your "handler.pl", add the line
use HTML::Mason::Preview;- somewhere underneath "use HTML::Mason". Then add code
to your handler routine to intercept Previewer
requests on the ports defined above. Your handler
should end up looking like this:
sub handler {
my ($r) = @_;- # Compute port number from Host header
my $host = $r->header_in('Host');
my ($port) = ($host =~ /:([0-9]+)$/);
$port = 80 if (!defined($port)); - # Handle previewer request on special ports
if ($port >= 3001 && $port <= 3005) {
my $parser = new HTML::Mason::Parser(...);
my $interp = new HTML::Mason::Interp(...);
my $ah = new HTML::Mason::ApacheHandler - (...);
return HTML::Mason::Preview::handle_pre - view_request($r,$ah);
- } else {
$ah->handle_request($r); # else, normal - request handler
- }
- }
- The three "new" lines inside the if block should look
exactly the same as the lines at the top of "han
dler.pl". Note that these separate Mason objects are
created for a single request and discarded. The reason
is that the previewer may alter the objects' settings,
so it is safer to create new ones every time. - 4. Copy the Previewer component ("samples/preview") to
your component root (you may want to place it at the
top level so that http://www.yoursite.com/preview
calls up the previewer interface). Edit the "CONFIGU
RATION" block at the top to conform to your own Mason
setup. - To test whether the previewer is working: restart your
server, go to the previewer interface, and click "View".
You should see your site's home page. - System logs
- Mason will log various events to a system log file if you
so desire. This can be useful for performance monitoring
and debugging. - The format of the system log was designed to be easy to
parse by programs, although it is not unduly hard to read
for humans. Every event is logged on one line. Each line
consists of multiple fields delimited by a common separa
tor, by default ctrl-A. The first three fields are always
the same: time, the name of the event, and the current pid
($$). These are followed by one or more fields specific
to the event. - The events are:
EVENT NAME DESCRIPTION EXTRA- FIELDS
- REQ_START start of HTTP request request
- number, URL + query string
REQ_END end of HTTP request request - number, error flag (1 iferror oc
- curred, 0 otherwise)
- CACHE_READ attempt to read from component
- path, cache key, success
data cache (C<$m-E<gt>cache>) flag (1 if - item found, 0 otherwise)
- CACHE_STORE store to data cache component
- path, cache key
COMP_LOAD component loaded into memory component - path
for first time - The request number is an incremental value that uniquely
identifies each request for a given child process. Use it
to match up REQ_START/REQ_END pairs. - To turn on logging, specify a string value to "sys
tem_log_events" containing one or more event names sepa
rated by '|'. In additional to individual event names, the
following names can be used to specify multiple events:
REQUEST = REQ_START | REQ_END
CACHE = CACHE_READ | CACHE_STORE
ALL = All events- For example, to log REQ_START, REQ_END, and COMP_LOAD
events, you could use
system_log_events => "REQUEST|COMP_LOAD" Note that - this is a string, not a set of constants or'd together.
- Configuration Options
- By default, the system log will be placed in
data_dir/etc/system.log. You can change this with "sys
tem_log_file". - The default line separator is ctrl-A. The advantage of
this separator is that it is very unlikely to appear in
any of the fields, making it easy to split() the line.
The disadvantage is that it will not always display, e.g.
from a Unix shell, making the log harder to read casually.
You can change the separator to any sequence of characters
with "system_log_separator". - The time on each log line will be of the form "sec
onds.microseconds" if you are using Time::HiRes, and sim
ply "seconds" otherwise. See "Config.pm" section. - Sample Log Parser
- Here is a code skeleton for parsing the various events in
a log. You can also find this in eg/parselog.pl in the Mason distribution.
open(LOG,"mason.log");
while (<LOG>) {
chomp;
my (@fields) = split(" my - ($time,$event,$pid) = splice(@fields,0,3);
if ($event eq 'REQ_START') {
my ($reqnum,$url) = @fields;
... - } elsif ($event eq 'REQ_END') {
my ($reqnum,$errflag) = @fields;
... - } elsif ($event eq 'CACHE_READ') {
my ($comp,$key,$hitflag) = @fields;
... - } elsif ($event eq 'CACHE_STORE') {
my ($comp,$key) = @fields;
... - } elsif ($event eq 'COMP_LOAD') {
my ($comp) = @fields;
... - } else {
warn "unrecognized event type: $event0; - }
- }
- Suggested Uses
- Performance: REQUEST events are useful for analyzing the
performance of all Mason requests occurring on your site,
and identifying the slowest requests. eg/perflog.pl in the Mason distribution is a log parser that outputs the aver
age compute time of each unique URL, in order from slowest
to quickest. - Server activity: REQUEST events are useful for determining
what your web server children are working on, especially
when you have a runaway. For a given process, simply tail
the log and find the last REQ_START event with that pro
cess id. (You can also use the Apache status page for
this.) - Cache efficiency: CACHE events are useful for monitoring
cache "hit rates" (number of successful reads over total
number of reads) over all components that use a data
cache. Because stores to a cache are more expensive than
reads, a high hit rate is essential for the cache to have
a beneficial effect. If a particular cache hit rate is too
low, you may want to consider changing how frequently it
is expired or whether to use it at all. - Load frequency: COMP_LOAD events are useful for monitoring
your code cache. Too many loads may indicate that your
code cache is too small. Also, if you can turn off the
code cache for a short time, COMP_LOAD events will tell
you which components are loaded most often and thus good
candidates for preloading.
PERFORMANCE TUNING
This section explains Mason's various performance enhance
ments and how to administer them.
Code cache
When Mason loads a component, it places it in a memory
cache.
The maximum size of the cache is specified with the
Interp/code_cache_max_size Interp parameter; default is
10MB. When the cache fills up, Mason frees up space by
discarding a number of components. The discard algorithm
is least frequently used (LFU), with a periodic decay to
gradually eliminate old frequency information. In a nut
shell, the components called most often in recent history
should remain in the cache. Very large components (over
20% of the maximum cache size) never get cached, on the
theory that they would force out too many other compo
nents.
Note that the "size" of a component in memory cannot lit
erally be measured. It is estimated by the length of the
source text plus some overhead. Your process growth will
not match the code cache size exactly.
You can monitor the performance of the memory cache by
turning on system logs and counting the COMP_LOAD events.
If these are occurring frequently even for a long-running
process, you may want to increase the size of your code
cache.
You can prepopulate the cache with components that you
know will be accessed often; see Preloading. Note that
preloaded components possess no special status in the
cache and can be discarded like any others.
Naturally, a cache entry is invalidated if the correspond
ing component source file changes.
To turn off code caching completely, set
Interp/code_cache_max_size to 0.
Object files
The in-memory code cache is only useful on a per-process
basis. Each process must build and maintain its own
cache. Shared memory caches are conceivable in the future,
but even those will not survive between web server
restarts.
As a secondary, longer-term cache mechanism, Mason stores
a compiled form of each component in an object file under
"data_dir/obj/component-path". Any server process can eval
the object file and save time on parsing the component
source file. The object file is recreated whenever the
source file changes.
Besides improving performance, object files are essential
for debugging and interpretation of errors. Line numbers
in error messages are given in terms of the object file.
The curious-minded can peek inside an object file to see
exactly how Mason converted a given component to a Perl
object.
If you change any Parser options, you must remove object
files previously created under that parser for the changes
to take effect.
If for some reason you don't want Mason to create object
files, set the Interp/use_object_files Interp parameter to
0.
Preloading
You can tell Mason to preload a set of components in the
parent process, rather than loading them on demand, using
the Interp/preloads Interp parameter. Each child server
will start with those components loaded in the memory
cache. The trade-offs are:
- time
- a small one-time startup cost, but children save time
by not having to load the components - memory
a fatter initial server, but the memory for preloaded
components are shared by all children. This is simi
lar to the advantage of using modules only in the par
ent process. - Try to preload components that are used frequently and do
not change often. (If a preloaded component changes, all
the children will have to reload it from scratch.) - Reload file
- Normally, every time Mason executes a component, it checks
the last modified time of its source file to see if it
needs to be reloaded. These file checks are convenient
for development, but for a production site they degrade
performance unnecessarily. - To remedy this, Mason has an accelerated mode that changes
its behavior in two ways: - 1. Does not check component source files at all, relying
solely on object files. This means the developer or an
automated system is responsible for recompiling any compo
nents that change and recreating object files, using the
Parser/make_component Parser method. - 2. Rather than continuously checking whether object files
have changed, Mason monitors a "reload file" containing an
ever-growing list of components that have changed. When
ever a component changes, the developer or an automated
system is responsible for appending the component path to
the reload file. The reload file is kept in
"data_dir/etc/reload.lst". - You can activate this mode with the Interp/use_reload_file
Interp method. - The advantage of using this mode is that Mason stats one
file per request instead of ten or twenty. The disadvan
tage is a increase in maintenance costs as the object and
reload files have to be kept up-to-date. Automated edito
rial tools, and cron jobs that periodically scan the com
ponent hierarchy for changes, are two possible solutions.
The Mason content management system automatically handles
this task.
STAGING vs. PRODUCTION
Site builders often maintain two versions of their sites:
the production (published) version visible to the world,
and the development (staging) version visible internally.
Developers try out changes on the staging site and push
the pages to production once they are satisfied.
The priorities for the staging site are rapid development
and easy debugging, while the main priority for the pro
duction site is performance. This section describes vari
ous ways to adapt Mason for each case.
Out mode
Mason can spew data in two modes. In "batch" mode Mason
computes the entire page in a memory buffer and then
transmits it all at once. In "stream" mode Mason outputs
data as soon as it is computed. (This does not take into
account buffering done by Apache or the O/S.) The default
mode is "batch".
Batch mode has the advantage of better error handling.
Suppose an error occurs in the middle of a page. In stream
mode, the error message interrupts existing output, often
appearing in an awkward HTML context such as the middle of
a table which never gets closed. In batch mode, the error
message is output neatly and alone.
Batch mode also offers more flexibility in controlling
HTTP headers (see Devel/sending_http_headers) and in han
dling mid-request error conditions (see
Request/clear_buffer).
Stream mode may help get data to the browser more quickly,
allowing server and browser to work in parallel. It also
prevents memory buildup for very large responses.
Since Apache does its own buffering, stream mode does not
entail immediate delivery of output to the client. You
must set $|=1 to turn off Apache buffering completely
(generally not a good idea) or call "$m->flush_buffer" to
flush the buffer selectively.
In terms of making your server seem responsive, the ini
tial bytes are most important. You can send these early
by calling "$m->flush_buffer" in key locations such as the
common page header. However, this dilutes the advantages
of batch mode mentioned above. Tradeoffs...
You control output mode by setting "interp->out_mode" to
"batch" or "stream".
Error mode
When an error occurs, Mason can respond by:
· showing a detailed error message in the browser
- · die'ing, which sends a 501 to the browser and lets the
- error message go to the error logs.
- The first option is ideal for development, where you want
immediate feedback on the error. The second option is
usually desired for production so that users are not
exposed to messy error messages. You control this option
by setting ah->error_mode to "html" or "fatal" respec
tively. - Debug mode
- As discussed in the debugging section, you can control
when Mason creates a debug file. While creating a debug
file is not incredibly expensive, it does involves a bit
of work and the creation of a new file, so you probably
want to avoid doing it on every request to a frequently
visited site. I recommend setting debug_mode to 'all' in
development, and 'error' or 'none' in production. - Reload files
- Consider reload files only for frequently visited produc
tion sites.
CONFIGURING VIRTUAL SITES
These examples extend the Mason/single site configuration
example in HTML::Mason.
Multiple sites, one component root
- If you want to share some components between your sites,
arrange your httpd.conf so that all DocumentRoots live
under a single component space: - # httpd.conf
PerlRequire /usr/local/mason/handler.pl - # Web site #1
<VirtualHost www.site1.com>
DocumentRoot /usr/local/www/htdocs/site1
<Location />
SetHandler perl-script
PerlHandler HTML::Mason - </Location>
- </VirtualHost>
- # Web site #2
<VirtualHost www.site2.com>
DocumentRoot /usr/local/www/htdocs/site2
<Location />
SetHandler perl-script
PerlHandler HTML::Mason - </Location>
- </VirtualHost>
- In handler.pl:
my $interp = new HTML::Mason::Interp (parser=>$parser,
comp_root=>'/usr/local/www/htdocs'
data_dir=>'/usr/local/mason/');- The directory structure for this scenario might look like:
/usr/local/www/htdocs/ # component root
+- shared/ # shared components
+- site1/ # DocumentRoot for first site
+- site2/ # DocumentRoot for second site- Incoming URLs for each site can only request components in
their respective DocumentRoots, while components inter
nally can call other components anywhere in the component
space. The shared/ directory is a private directory for
use by components, inaccessible from the Web. - Multiple sites, multiple component roots
- Sometimes your sites need to have completely distinct com
ponent hierarchies, e.g. if you are providing Mason ISP
services for multiple users. In this case the component
root must change depending on the site requested. Since
you can't change an interpreter's component root dynami
cally, you need to maintain separate Mason objects for
each site in the handler.pl:
my (%interp,%ah);
foreach my $site qw(...) {
$interp{$site} = new HTML::Mason::Interp- (comp_root=>"/usr/local/www/$site",...);
$ah{$site} = new HTML::Mason::ApacheHandler (in - terp=>$interp{$site},...);
- }
- ...
- sub handler {
my $site = $r->dir_config('site');
$ah{$site}->handle_request($r); - }
- We assume each virtual server configuration section has a
PerlSetVar site <site_name>- Above we pre-create all Mason objects in the parent.
Another scheme is to create objects on demand in the
child:
my (%interp,%ah);...sub handler {
my $site = $r->dir_config('site');
unless exists($interp{$site}) {
# get comp_root from PerlSetVar as well
my $comp_root = $r->dir_config('comp_root');
$interp{$site} = new HTML::Mason::In - terp(comp_root=>$comp_root,...);
$ah{$site} = new HTML::Mason::ApacheHan - dler(interp=>$interp{$site},...);
- }
- }
- The advantage of the second scheme is that you don't have
to hardcode as much information in the handler.pl. The
disadvantage is a slight memory and performance impact. On
development servers this shouldn't matter; on production
servers you may wish to profile the two schemes.
AUTHOR
Jonathan Swartz, swartz@pobox.com
SEE ALSO
- HTML::Mason, HTML::Mason::Parser, HTML::Mason::Interp,
HTML::Mason::ApacheHandler