lurkftp(1)

NAME

lurkftp v1.00 - monitor and/or mirror FTP sites

SYNOPSIS

lurkftp [options] [site [dir] .. ] ..

DESCRIPTION

Lurkftp is the ultimate FTP site lurker and mirroring program. It
will monitor changes in source directories and either just report
changes or mirror changes into a destination directory.
Lurkftp in its most basic mode takes site/directory-list
"pairs", as follows:
site1 /pub/dir1
site2 /pub/dir2a /pub/dir2b
site3 dir3
These pairs are either separated by newlines (if in an option file),
by the -+ option (e.g. "lurkftp site1 /pub/dir1 -+ site2 /pub/dir2"),
or by the EOF of an option file.
Once all options are parsed, processing begins with the
first pair and (by default) continues with subsequent pairs until
all have been processed.
Default processing operates as follows:
· The directory is recursively read from the source site.
· If no directory was readable from the source site, the line
`*** <site> <dirs>: <error> ***' is printed, and processing continues
with the next site.
· The results are compared with the `destination directory', which is
by default the contents of a placeholder file, normally called
.chkls.<site><dirs>.gz, with `/''s and ` ''s removed from the dirs
list and replaced with `.' and `_', respectively. Thus, the default
placeholder file for the "pair" `site1 /pub/dir1 /pub/dir2' is
`.chkls.site1.pub.dir1_.pub.dir2.gz'.
· If any changes occur, the line `--- <site> <dirs> ---' is printed,
along with a list of changes, sorted by date, then name. Each change
is preceded by a single character indicating the type of change.
Additions and removals are preceded by `+' and `-', respectively.
Other prefixes are documented by the options which might generate
those prefixes.
· If any changes occurred, the destination placeholder file is
updated.
· Processing then continues with the next site.
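As an illustration of the default placeholder naming rule above, here is a minimal Python sketch. The function name placeholder_name is hypothetical and not part of lurkftp; the sketch only assumes the rule as stated: join the directories with spaces, then replace `/' with `.' and ` ' with `_'.

```python
def placeholder_name(site, dirs):
    """Build the default placeholder file name for a site/directory-list pair.

    Hypothetical helper for illustration; not part of lurkftp itself.
    """
    # Join the directory list with spaces, as it appears on the command line.
    dir_list = " ".join(dirs)
    # Replace `/' with `.' and ` ' with `_', per the rule described above.
    mangled = dir_list.replace("/", ".").replace(" ", "_")
    return ".chkls." + site + mangled + ".gz"


# The "pair" used as the example in the text above.
print(placeholder_name("site1", ["/pub/dir1", "/pub/dir2"]))
# prints .chkls.site1.pub.dir1_.pub.dir2.gz
```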

OPTIONS

The entire pair list and/or each pair can be preceded by a list of
options. Actually, any options that precede the -+ option (explicit
or implied) will apply to the given pair.
Note: argument-less options take the `+' prefix to mean the opposite
of the normal meaning. Other options can also take `+', but the
meaning doesn't change. The -+ (++), -h (+h), and -- (+-) options
also don't change meaning.
Percent Substitution
There are two types of %-substitution: outside and inside. Outside
substitution is for site-wide items, such as file names and report
headers. Inside substitution is for file-specific items, such as
mirror pipes and report lines.
The following outside-substitutions are done:

%s Site name
%d Underline-separated list of directories,
substituting `/' with `.' and ` ' with `_'.
%p Space-separated list of directories
%S Alternate site
%D Alternate %d-style list of directories
%P Alternate %p-style list of directories
%t Extra text; for headers/footers this is "totals" and for error
messages this is the actual error message. Otherwise this is an
empty string.
%% The `%' character
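The outside substitutions listed above can be sketched in Python as follows. This is only an approximation under stated assumptions: the function name expand_outside is hypothetical, the alternate escapes (%S, %D, %P) are omitted, and leaving unknown escapes untouched is an assumption rather than documented behavior.

```python
import re


def expand_outside(template, site, dirs, extra=""):
    """Expand %s, %d, %p, %t and %% in a template (illustrative sketch only)."""
    joined = " ".join(dirs)
    values = {
        "s": site,                                        # site name
        "d": joined.replace("/", ".").replace(" ", "_"),  # mangled dir list
        "p": joined,                                      # space-separated dirs
        "t": extra,                                       # extra text, may be empty
        "%": "%",                                         # literal percent sign
    }
    # Assumption: escapes not in the table are copied through unchanged.
    return re.sub(r"%(.)", lambda m: values.get(m.group(1), m.group(0)), template)


print(expand_outside("--- %s %p ---", "site1", ["/pub/dir1"]))
# prints --- site1 /pub/dir1 ---
```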
The following inside-substitutions are done:

%f File name without path
%L The link name (if appropriate)
%r Full directory path to remote file (without
file name; with trailing `/')
%l Full directory path to local file (without
file name; with trailing `/')
%s Site name
%b Size (in bytes)
%m Mode (full mode, including file type)
%d Modification date (YYYYMMDD)
%t Modification time (HHMM)
%{<format>}
Modification date passed with <format> to
%D The device major number (device nodes only)
%M The device minor number (device nodes only)
%T The type of operation ('+', '-', etc.)
%[conditional_text]
Add %-substituted text conditionally. The format of the conditional
text is: <condition> [ ? <true_text> ] [ : <false_text> ]. Note that
either the ? or the : or both must be present. The <condition> is
evaluated, and either the <true_text> or the <false_text> is
evaluated and inserted as appropriate. The following conditions are
available:
B[=|>|<][size]
Check byte-size of file. If no directional specifier is given, then
> is implied. If no size is given, then 0 is implied. Size may be
specified as number of bytes, number of Kilobytes, number of Blocks,
or number of

M

tered suffix.
l True if directory entry is a soft link.
f True if directory entry is a regular file.
d True if directory entry is a directory.
s True if directory entry is a socket.
b True if directory entry is a block-special file.
c True if directory entry is a character-special file.
p True if directory entry is a named pipe.
t True if sticky bit is set for directory entry.
[ugo]rwxS
Check permissions as specified by given pattern. S stands for
setuid/setgid. Permissions are anded with given mask (if no ugo
given, then all are implied) and true is returned if any bit is still
set.
T<type>
True if type of operation is equal to given <type> character.
%% The `%' character
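To make the [ugo]rwxS condition above more concrete, here is a hedged Python sketch of how such a pattern could be turned into a mask and tested against a file mode. The names permission_mask and condition_true are hypothetical, and treating S as having no bit for the "o" class is an assumption; the source only says that S stands for setuid/setgid.

```python
import stat

# Permission bits per class. Assumption: "S" has no meaning for "o".
_BITS = {
    "u": {"r": stat.S_IRUSR, "w": stat.S_IWUSR, "x": stat.S_IXUSR, "S": stat.S_ISUID},
    "g": {"r": stat.S_IRGRP, "w": stat.S_IWGRP, "x": stat.S_IXGRP, "S": stat.S_ISGID},
    "o": {"r": stat.S_IROTH, "w": stat.S_IWOTH, "x": stat.S_IXOTH, "S": 0},
}


def permission_mask(pattern):
    """Build a mode mask from a pattern such as "ux", "w" or "gS"."""
    # If no u/g/o is given, all three classes are implied.
    who = [c for c in pattern if c in "ugo"] or list("ugo")
    what = [c for c in pattern if c in "rwxS"]
    mask = 0
    for w in who:
        for p in what:
            mask |= _BITS[w][p]
    return mask


def condition_true(pattern, mode):
    """And the mode with the mask; true if any bit is still set."""
    return (mode & permission_mask(pattern)) != 0


print(condition_true("ux", 0o755))  # prints True
print(condition_true("ow", 0o755))  # prints False
```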
Generic Directory Specification
All FTP sites may be specified as follows: [ [ <user> ] [
,<acct> ] [ :<pass> ] @ ] [ <host> ] [ ,<port> ] [ :<dir> ]. The
-o and -O options take generic directory specs as arguments.
These are as follows:
l<ls-lR file>
This specifies a local file with a format parsable by the listing
parser. The file name is processed by outside %-substitution.
c<command>
This specifies a command to run which will generate output parsable
by the listing parser. This is usually an ls(1) or a find(1) command.
The command is processed by outside %-substitution.
m<lsfile>
This specifies a lurkftp-generated placeholder file. The file name is
processed by outside %-substitution.
d<localdir>
This specifies a local directory to recursively read for directory
entries. Multiple directories may be specified.
f<ftpsite>
This specifies a site+dir to recursively read for
directory entries. Multiple directories may be specified.
L<ftpsite>
This specifies a remote file in a format parsable
by the listing parser to retrieve and use for the directory.
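The FTP site syntax given at the start of this subsection, [ [ <user> ] [ ,<acct> ] [ :<pass> ] @ ] [ <host> ] [ ,<port> ] [ :<dir> ], can be illustrated with a small Python parser. This is a sketch, not lurkftp's own parser: the function name parse_ftp_site is hypothetical, the host names are placeholders, and splitting at the last `@' (so that a password may itself contain `@') is an assumption.

```python
import re

_CRED = re.compile(r"^(?P<user>[^,:]*)(?:,(?P<acct>[^:]*))?(?::(?P<password>.*))?$")
_HOST = re.compile(r"^(?P<host>[^,:]*)(?:,(?P<port>[0-9]+))?(?::(?P<dir>.*))?$")


def parse_ftp_site(spec):
    """Split a site spec into its optional parts (illustrative sketch only)."""
    fields = dict.fromkeys(["user", "acct", "password", "host", "port", "dir"])
    # Assumption: the last `@' separates the login part from the host part.
    cred, sep, rest = spec.rpartition("@")
    if sep:
        for key, value in _CRED.match(cred).groupdict().items():
            if value:
                fields[key] = value
    match = _HOST.match(rest)
    if match is None:
        raise ValueError("cannot parse site spec: " + spec)
    for key, value in match.groupdict().items():
        if value:
            fields[key] = value
    if fields["port"] is not None:
        fields["port"] = int(fields["port"])
    return fields


# Placeholder host and credentials, chosen only for the example.
print(parse_ftp_site("anonymous:me@example.org@ftp.example.com,2121:/pub/linux"))
```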
General Options
-B Run in background: close stdin/stdout/stderr,
fork, and dissociate from parent process group. Lurkftp should
return immediately to the invoking process.
-F <filename>
Read an option file (immediately). In option files, blank lines and
anything on a line after a `#' are ignored. An implicit `-+' option
(i.e. site/dir pair separator) is generated at the end of any line
containing a site and/or directory name. Quotes (`'' and `"'), the
`\' character, and the `~' character in option files are handled as
per csh(1). Environment variables ($<name> or ${<name>}) are also
expanded when not escaped by single quotes or backslash.
-P Process in parallel by calling fork(2) before processing each
site.
-N Indicate that subsequent operations depend on their predecessor.
That is, forks will not separate these operations, and failure in one
operation will terminate all subsequent dependent operations. There
may be multiple dependency groups.
-z <prog>
Program to filter all ls files through when writing
(default: gzip). Setting this to an empty string disables output
filtering.
-Z <prog>
Program to filter ls files or remote listings through if the first
character of the file in question is nonprintable as per ANSI
isprint(3). (default: gunzip). Setting this to an empty string resets
to the default.
-v <mask>
Set debug mask to <mask>. Masks greater than 0 will produce some
lurkftp trace messages.
-- Next argument is literal. Note that this differs
from getopt(3) in that it only literalizes the next option, not
all remaining options.
-+ Separate multiple site/dir groups.
-h Print help message and exit.
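The comment and quoting rules described for -F option files can be roughly approximated in Python with the standard shlex module. This is only a sketch: shlex follows POSIX sh quoting rather than csh(1), it does not expand `~' or environment variables, and the function name tokenize_option_line is hypothetical.

```python
import shlex


def tokenize_option_line(line):
    """Split one option-file line into arguments, dropping `#' comments.

    Approximation only: uses POSIX sh quoting rules, not csh(1) rules.
    """
    return shlex.split(line, comments=True)


# A line taken from the .chksites example in this page.
line = "-R 'mail -s \"LurkFTP Output\" dark' # mail reports to me"
print(tokenize_option_line(line))
# prints ['-R', 'mail -s "LurkFTP Output" dark']
```

A blank or comment-only line yields an empty list, which matches the statement that such lines are ignored.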
Reporting Options
-q Suppress change report
-R <command>
If a report is generated, then pipe that report to
the given command. Otherwise don't invoke the command.
-r <type><string>
This option sets various report-related strings.
Type Function
t Sets the report's title string. Outside %-substitution is performed
on this string. The default is `--- %s %d ---'.
d Sets the report's directory-entry line. Inside %-substitution is
performed on this string. The default is `%T %d %12b %r%f%[l? -> %L]'
if mirroring is turned on, and is the same, but surrounded by the
conditional `%[T<T>: ... ]' when mirroring is disabled so that moves
are not reported.
f Sets the report's footer string. Outside
%-substitution is performed on this string. The default is `%t'.
s Sets the report's sort string. The sort string is at most 8
comparison specifiers, and sorting is ordered by performing each
comparison in the order of the string until a mismatch is found. The
default is `fdpnlst'. The following comparison specifiers can be
used, as well as the reverse-order version (which is the same letter,
but capitalized):

f Sort numerically by file type.
m Sort numerically by mode (other than
file type).
p Sort alphabetically by file's path.
n Sort alphabetically by file's name.
l Sort alphabetically by link name, if
present.
d Sort by date (ymd) if entry is a
file.
t Sort by time (hm) if entry is a
file.
s Sort by size (in bytes) if entry is
a file.
T Sets the error report's title string. Outside %-substitution is
performed on this string. The default is
`*** ERRORS IN %S %P -> %s %p MIRROR ***'.
D Sets the error report's directory-entry
line. Inside %-substitution is performed on this string.
F Sets the error report's footer string. Outside %-substitution is
performed on this string.
S Sets the error report's sort string. The sort string is in the same
format as that used by the standard report. The default is `PNLFDST'.
e Sets the format for general error messages. Outside %-substitution
is performed on this string.
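The sort string described under the `s' type above (compare by each specifier in turn until a mismatch is found, with a capital letter reversing the order) can be sketched in Python with repeated stable sorts. This is an assumption-laden illustration: the entry fields (ftype, mode, path, name, link, date, time, size) and the function name sort_entries are hypothetical, and the "if entry is a file" restriction on d, t and s is ignored.

```python
# Key functions for each comparison specifier (hypothetical entry layout).
_KEYS = {
    "f": lambda e: e["ftype"],
    "m": lambda e: e["mode"],
    "p": lambda e: e["path"],
    "n": lambda e: e["name"],
    "l": lambda e: e["link"],
    "d": lambda e: e["date"],
    "t": lambda e: e["time"],
    "s": lambda e: e["size"],
}


def sort_entries(entries, sort_string="fdpnlst"):
    """Order entries by a lurkftp-style sort string (illustrative sketch)."""
    if len(sort_string) > 8:
        raise ValueError("sort string is at most 8 specifiers")
    result = list(entries)
    # Apply the least significant specifier first; Python's sort is stable,
    # so earlier specifiers in the string end up taking precedence.
    for spec in reversed(sort_string):
        result.sort(key=_KEYS[spec.lower()], reverse=spec.isupper())
    return result


def make_entry(name, date):
    """Build a sample entry; all other fields are fixed placeholders."""
    return {"ftype": 0, "mode": 0o644, "path": "/pub/", "name": name,
            "link": "", "date": date, "time": "1200", "size": 10}


sample = [make_entry("b", "19970522"), make_entry("a", "19970522"),
          make_entry("c", "19970101")]
print([e["name"] for e in sort_entries(sample, "dn")])  # prints ['c', 'a', 'b']
print([e["name"] for e in sort_entries(sample, "Dn")])  # prints ['a', 'b', 'c']
```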
Site/Directory Specification Options
-o <dirspec>
Set generic source directory.
-O <dirspec>
Set generic destination directory.
-p <password>
Set default FTP login password (default:
<myusername>@)
-b <base>
Change default name (formerly just base name) for placeholder files
(default: .chkls.%s%d.gz). Outside %-substitution is performed on the
name.
-L <rname>
Use file <rname> on remote site instead of performing remote
directory listing. Note: this option overrides the -f option below.
This option only affects the next site/dir pair.
-U Detect unchanged (i.e. moved) files. If two regular files have the
same date, size, and name, but are located in different directories,
then they are processed as moved. When no mirror directory or pipe is
defined, moved files are not reported; otherwise they are reported
with `<' and `>' for the original and new location, respectively, and
the file is not retrieved from the remote site, but either ignored
(if pipes are enabled) or moved as if by mv(1) if mirroring to a
directory is enabled. An error in moving will be reported by the
characters `(' and `)'.
-M Force "manual" recursion when retrieving remote listing by using
LIST -la or LIST (depending on which works) on each directory and
issuing a CWD command to enter subdirectories. This mode is invoked
automatically if the default LIST -lRa command fails for any reason
(usually because the -lRa options aren't supported by the remote FTP
daemon). This is especially useful if specific directories are to be
filtered out, as the recursion routine will match the name of the
directory to be entered (with a trailing `/') against the exclude
filter before recursing.
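The rule given for the -U option above (same name, date, and size, but a different directory) can be sketched in Python as a pairing of removed and added entries. This is a hypothetical illustration, not lurkftp's implementation: entries are assumed to be (directory, name, date, size) tuples for regular files, and split_moves is an invented name.

```python
def split_moves(removed, added):
    """Pair removed and added entries that look like the same file moved.

    Each entry is a (directory, name, date, size) tuple (assumed layout).
    Returns (moves, still_removed, still_added).
    """
    # Index removed entries by (name, date, size), ignoring the directory.
    by_key = {}
    for entry in removed:
        by_key.setdefault(entry[1:], []).append(entry)

    moves, still_added = [], []
    for entry in added:
        candidates = by_key.get(entry[1:], [])
        # A move needs a matching entry that lives in a different directory.
        match = next((c for c in candidates if c[0] != entry[0]), None)
        if match is not None:
            candidates.remove(match)
            moves.append((match, entry))
        else:
            still_added.append(entry)

    still_removed = [e for group in by_key.values() for e in group]
    return moves, still_removed, still_added


removed = [("/pub/old/", "a.tar.gz", "19970522", 100),
           ("/pub/", "b.txt", "19970101", 5)]
added = [("/pub/new/", "a.tar.gz", "19970522", 100),
         ("/pub/", "c.txt", "19970301", 7)]
moves, gone, new = split_moves(removed, added)
print(len(moves), len(gone), len(new))  # prints 1 1 1
```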
Mirroring Options
-m Perform mirroring when applicable; requires -d and/or -e and/or -t
options. If this option is turned off, reports are still made, so
this option can be used to test what the results of a mirroring
operation would be. Beware: list files are also updated, however, so
some pseudo-directory tricks to mirror-pipe specific files will
pretend complete success (e.g. the sunsite .lsm trick used in the
example can't be harmlessly tested before running). Any failure to
download a file (and, in the case of the -e option, complete the pipe
successfully) will be reported by an entry preceded by the `*'
character, and any failure to remove a file will be reported by an
entry preceded by the `#' character.
-d <ldir>
Set local directory to <ldir> and read it instead
of a placeholder file. This option only applies to the next
site/dir pair.
-e <cmd>
Don't update the local directory when mirroring; instead pipe each
new file into <cmd>. This option only applies to the next site/dir
pair. It would probably also be useful to use the -l and -f options
with this. The local directory (-d) is only needed if the %l
%-escape is used. Inside %-substitution is performed on <cmd>.
-l <file>
Read and update placeholder <file> instead of using
contents of local directory. This option only affects the next
site/dir pair. The same %-substitutions are done as for the -b
option.
-f <file>
Read placeholder <file> instead of retrieving remote directory. This
option only affects the next site/dir pair. The same %-substitutions
are done as for the -b option. This option overrides the -L option
above.
-E Make "exact" comparison: fix modes to match remote site. The
report shows changes which merely change modes by preceding them with
an `M'. Failure to perform the change will be reported by preceding
the entry with `$'.
-n Make no file transfers, moves, or deletions; just update date
stamps [and modes if the -E option is active].
-A Attempt to append to files which increase in size
instead of downloading the entire file. This is useful in cases
where a directory of log files which always increase in size is
to be mirrored.
-t <site>
Mirror source files to remote directory.
-c Force source files to be from local directory.
-g <pipe>
Get source files by executing <pipe>. Inside
%-substitution is done on <pipe>.
Filtering Options
Note: Only one include filter and/or one exclude filter
can be specified. The include filter is run first, and then the
exclude filter. Passing the null string to the -i or -x options
removes the associated filter.
-i <regex>
Include only files that match the extended regular
expression <regex>.
-I <file>
Include only files that match the extended regular
expression contained in <file>. Newlines in <file> are converted
to `|'.
-x <regex>
Exclude any files that match the extended regular
expression <regex>.
-X <file>
Exclude any files that match the extended regular
expression contained in <file>. Newlines in <file> are converted
to `|'.
-D Filter out directories. Note that in order to handle automatic
directory processing properly, mirrors that use -f to read
placeholder files that were generated with this option should also
have this option in effect.
-s Don't filter out specials (device nodes, pipes, and sockets).
Normally they are filtered out. Note that when mirroring, device
nodes and pipes are created, but sockets aren't.
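The include/exclude ordering and the newline-to-`|' conversion used by -I and -X can be sketched in Python. This is an approximation under stated assumptions: Python's re module is not identical to POSIX extended regular expressions, skipping empty lines in the filter file is an assumption, and the names regex_from_file_text and keep are hypothetical.

```python
import re


def regex_from_file_text(text):
    """Join the lines of a filter file with `|' (empty lines skipped by assumption)."""
    return "|".join(line for line in text.splitlines() if line)


def keep(path, include=None, exclude=None):
    """Apply the include filter first, then the exclude filter."""
    if include and not re.search(include, path):
        return False
    if exclude and re.search(exclude, path):
        return False
    return True


# A few patterns borrowed from the .chkfilt example in this page.
exclude = regex_from_file_text("INDEX.whole\nls-lR\n/INDEX(|.html|.old)$\n")
print(exclude)                                  # prints INDEX.whole|ls-lR|/INDEX(|.html|.old)$
print(keep("/pub/ls-lR", exclude=exclude))      # prints False
print(keep("/pub/Linux/foo.lsm", include=r".*.lsm$", exclude="/Incoming/"))  # prints True
```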
Timeout Options
Note that all timeout options use the same base option,
-T. All timeout options can be specified with the same parameter
string by concatenating desired timeouts. Also, any timeout set
to zero is disabled completely.
-T c<seconds>
Initial connection and login timeout (default: 20)
-T t<seconds per K>
Timeout for file and directory transfers (default:
10)
-T o<seconds>
Timeout for simple commands (cd, pwd, etc.)
-T q<seconds>
Timeout for quit command and logout (default: 5)
-T r<count>
Number of times to retry list and/or file retrievals before giving up
(default: 10)
-T d<seconds>
Amount of time to wait between retries (default:
10)
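As a sketch of how a concatenated -T parameter string might be interpreted, the following Python function parses a string such as "c30t15" into individual settings. The exact concatenation syntax is an assumption based on the note above, the function name parse_timeouts is hypothetical, and no default is listed for `o' because the text does not state one.

```python
import re

# Defaults as documented above; `o' has no stated default and is left out.
DEFAULTS = {"c": 20, "t": 10, "q": 5, "r": 10, "d": 10}


def parse_timeouts(param, current=None):
    """Parse a concatenated timeout string such as "c30t15" (sketch only)."""
    settings = dict(DEFAULTS if current is None else current)
    pos = 0
    for match in re.finditer(r"([ctoqrd])([0-9]+)", param):
        if match.start() != pos:
            raise ValueError("unexpected text in timeout string: " + param)
        # A value of zero means the timeout is disabled completely.
        settings[match.group(1)] = int(match.group(2))
        pos = match.end()
    if pos != len(param):
        raise ValueError("unexpected text in timeout string: " + param)
    return settings


print(parse_timeouts("c30t15"))
# prints {'c': 30, 't': 15, 'q': 5, 'r': 10, 'd': 10}
```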

EXAMPLES

Command lines
# Look for new versions of X for Linux & mail report to me
lurkftp -i Linux ftp.xfree86.org /pub/XFree86 -F .mailme
# .mailme is a file containing: -R 'mail -s "lurkftp output" dark'
# Mirror a single directory with reschedule;
# at will mail me the report.
atcron "2:00 tomorrow" lurkftp -m -d /net/ftp/rplay \
    ftp.sdsu.edu /pub/rplay
# Mirror slackware disk set via sz into /usr/local/sw
# Not recommended if no auto-download in local comm program
lurkftp -d /usr/local/sw -l .sw.gz -e "ONAME=%l%f sz -" \
    ftp.cdrom.com /pub/linux/slackware/slakware -F .mailme
# Do main lurking; see config files below
lurkftp -F .chksites
Contents of .chksites
# An extract from my command file
# no multiple entries from same site, so simplify name
-b .chkls.%s.gz
-R 'mail -s "LurkFTP Output" dark' # mail reports to me
-D # Don't care about changes in directories
-U # ignore moves
-P # fork away!
-X .chkfilt.sunsite # special filter for sunsite
sunsite.unc.edu /pub/Linux # fetch master list
-N # .lsm stuff depends on sunsite
# mail new .lsm's to me
-i '.*.lsm$' -x /Incoming/ # include lsm's not in Incoming dir
-f .chkls.%s.gz # Read remote site from previously generated listing
# Note: the following file was primed so that old .lsms wouldn't
# be sent. This was done by *not* using -m. It could've also
# been primed by using the command:
# zgrep .lsm .chkls.sunsite.gz | gzip >.chkls.lsms.gz
-l .chkls.lsms.gz # Keep track of sent .lsm's in this file
-m -e 'mail -s "lurkftp: %f" dark' # mirror through pipe
sunsite.unc.edu /pub/Linux # same site/dir as above
-i "" # reset include filter
+N # No more dependencies
-X .chkfilt # filter for everyone else
tsx-11.mit.edu /pub/linux/680x0 /pub/linux/packages/GCC
ftp.kernel.org /pub/linux/kernel
# etc.
Contents of .chkfilt
INDEX.whole
INDEX.short
ls-lR
/INDEX(|.html|.old)$
00-find-ls(|.gz)$
Contents of .chkfilt.sunsite
/README$
/distributions/
/!INDEX
/archive-current/
linux-announce.archive
INDEX.whole
INDEX.short
ls-lR
/INDEX(|.html|.old)$
00-find-ls(|.gz)$

SEE ALSO

regexec(3), gzip(1), ftp(1), mail(1), at(1), mirror(1L).

BUGS

[+: may want to fix; *: definitely want to fix; -: may never fix]
* Doesn't handle non-UNIX remote sites [I know of none anymore]
+ Some fixed-sized buffers may overflow
- Groups & user names aren't mirrored
- Sockets aren't mirrored
- Exact time isn't used for comparison (only accurate to what ls
gives)
- All options in external program option group are obsolete
+ Few options are really range-checked
* Probably plenty of nasty hidden bugs

DIAGNOSTICS

Failed transfers are marked in the report. Specific errors are
printed to stderr. Debugging messages and some error messages are
only printed when the debug level (as set by the -v option) is
greater than 0.

AUTHOR

Thomas J. Moore, dark@mama.indstate.edu
May 22, 1997