lurkftp(1)
NAME
lurkftp v1.00 - monitor and/or mirror FTP sites
SYNOPSIS
lurkftp [options] [site [dir] .. ] ..
DESCRIPTION
Lurkftp is the ultimate FTP site lurker and mirroring program. It will
monitor changes in source directories and either just report changes or
mirror changes into a destination directory.

Lurkftp in its most basic mode takes site/directory-list "pairs", as
follows:

    site1 /pub/dir1
    site2 /pub/dir2a /pub/dir2b
    site3 dir3

These pairs are either separated by newlines (if in an option file), by
the -+ option (e.g. "lurkftp site1 /pub/dir1 -+ site2 /pub/dir2"), or by
the EOF of an option file.

Once all options are parsed, processing begins with the first pair and
(by default) continues with subsequent pairs until all have been
processed.
Default processing operates as follows:

· The directory is recursively read from the source site.
· If no directory was readable from the source site, the line
  `*** <site> <dirs>: <error> ***' is printed, and processing continues
  with the next site.
· The results are compared with the `destination directory', which is by
  default the contents of a placeholder file, normally called
  .chkls.<site><dirs>.gz, with each `/' and ` ' in the dirs list
  replaced by `.' and `_', respectively. Thus, the default placeholder
  file for the "pair" `site1 /pub/dir1 /pub/dir2' is
  `.chkls.site1.pub.dir1_.pub.dir2.gz'.
· If any changes occurred, the line `--- <site> <dirs> ---' is printed,
  along with a list of changes, sorted by date, then name. Each change
  is preceded by a single character indicating the type of change.
  Additions and removals are preceded by `+' and `-', respectively.
  Other prefixes are documented under the options that can generate
  them.
· If any changes occurred, the destination placeholder file is updated.
· Processing then continues with the next site.
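
The placeholder naming rule described above can be sketched in a few
lines of Python. This is only an illustration of the documented rule, not
lurkftp's own code; the function name placeholder_name is hypothetical.

```python
def placeholder_name(site, dirs):
    """Build the default placeholder file name for one site/dir-list pair."""
    # Join the directory list with spaces, then map '/' to '.' and ' ' to '_',
    # as the description of the default placeholder file states.
    dir_part = " ".join(dirs).replace("/", ".").replace(" ", "_")
    return ".chkls." + site + dir_part + ".gz"


print(placeholder_name("site1", ["/pub/dir1", "/pub/dir2"]))
# prints .chkls.site1.pub.dir1_.pub.dir2.gz
```

The printed name matches the example given for the pair
`site1 /pub/dir1 /pub/dir2'.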
OPTIONS
The entire pair list and/or each pair can be preceded by a list of
options. Any options that precede the -+ option (explicit or implied)
apply to the given pair.

Note: argument-less options take the `+' prefix to mean the opposite of
the normal meaning. Other options can also take `+', but the meaning
doesn't change. The -+ (++), -h (+h), and -- (+-) options also don't
change meaning.
Percent Substitution
There are two types of %-substitution: outside and inside. Outside
substitution is for site-wide items, such as file names and report
headers. Inside substitution is for file-specific items, such as mirror
pipes and report lines.

The following outside-substitutions are done:

%s  Site name
%d  Underscore-separated list of directories, substituting `/' with `.'
    and ` ' with `_'.
%p  Space-separated list of directories
%S  Alternate site
%D  Alternate %d-style list of directories
%P  Alternate %p-style list of directories
%t  Extra text; for headers/footers this is "totals" and for error
    messages this is the actual error message. Otherwise this is an
    empty string.
%%  The `%' character

The following inside-substitutions are done:

%f  File name without path
%L  The link name (if appropriate)
%r  Full directory path to remote file (without file name; with
    trailing `/')
%l  Full directory path to local file (without file name; with trailing
    `/')
%s  Site name
%b  Size (in bytes)
%m  Mode (full mode, including file type)
%d  Modification date (YYYYMMDD)
%t  Modification time (HHMM)
%{<format>}
    Modification date passed with <format> to
%D  The device major number (device nodes only)
%M  The device minor number (device nodes only)
%T  The type of operation ('+', '-', etc.)
%[conditional_text]
    Add %-substituted text conditionally. The format of the conditional
    text is: <condition> [ ? <true_text> ] [ : <false_text> ]. Note that
    either the ? or the : or both must be present. The <condition> is
    evaluated, and either the <true_text> or the <false_text> is
    evaluated and inserted as appropriate. The following conditions are
    available:

    B[=|>|<][size]
        Check byte-size of file. If no directional specifier is given,
        then > is implied. If no size is given, then 0 is implied. Size
        may be specified as number of bytes, number of Kilobytes, number
        of Blocks, or number of M[...]tered suffix.
    l   True if directory entry is a soft link.
    f   True if directory entry is a regular file.
    d   True if directory entry is a directory.
    s   True if directory entry is a socket.
    b   True if directory entry is a block-special file.
    c   True if directory entry is a character-special file.
    p   True if directory entry is a named pipe.
    t   True if sticky bit is set for directory entry.
    [ugo]rwxS
        Check permissions as specified by the given pattern. S stands
        for setuid/setgid. Permissions are ANDed with the given mask (if
        no ugo is given, then all are implied) and true is returned if
        any bit is still set.
    T<type>
        True if type of operation is equal to the given <type>
        character.
%%  The `%' character
Generic Directory Specification
All FTP sites may be specified as follows: [ [ <user> ] [ ,<acct> ]
[ :<pass> ] @ ] [ <host> ] [ ,<port> ] [ :<dir> ].

The -o and -O options take generic directory specs as arguments. These
are as follows:

l<ls-lR file>
    This specifies a local file with a format parsable by the listing
    parser. The file name is processed by outside %-substitution.
c<command>
    This specifies a command to run which will generate output parsable
    by the listing parser. This is usually an ls(1) or a find(1)
    command. The command is processed by outside %-substitution.
m<lsfile>
    This specifies a lurkftp-generated placemarker file. The file name
    is processed by outside %-substitution.
d<localdir>
    This specifies a local directory to recursively read for directory
    entries. Multiple directories may be specified.
f<ftpsite>
    This specifies a site+dir to recursively read for directory entries.
    Multiple directories may be specified.
L<ftpsite>
    This specifies a remote file in a format parsable by the listing
    parser to retrieve and use for the directory.
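
As an illustration of the FTP site syntax given at the start of this
section, the following Python sketch splits a site string into its
optional parts. It is an assumption about how the grammar can be read,
not lurkftp's parser; the names SITE_SPEC and parse_site and the host
ftp.example.org are hypothetical.

```python
import re

# One reading of: [ [<user>] [,<acct>] [:<pass>] @ ] [<host>] [,<port>] [:<dir>]
SITE_SPEC = re.compile(r"""
    (?:                              # optional login part, ended by '@'
        (?P<user>[^,:@]*)            #   user name
        (?:,(?P<acct>[^:@]*))?       #   ,account
        (?::(?P<password>[^@]*))?    #   :password
        @
    )?
    (?P<host>[^,:]*)                 # host name
    (?:,(?P<port>\d+))?              # ,port
    (?::(?P<dir>.*))?                # :directory
""", re.VERBOSE)


def parse_site(spec):
    """Return a dict of the parts of a site spec; absent parts are None."""
    match = SITE_SPEC.fullmatch(spec)
    if match is None:
        raise ValueError("not a valid site spec: %r" % spec)
    return match.groupdict()


print(parse_site("joe:secret@ftp.example.org,2121:/pub"))
print(parse_site("ftp.example.org:/pub"))
```

Passwords containing `@' are ambiguous under this reading and are not
handled by the sketch.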
General Options
-B  Run in background: close stdin/stdout/stderr, fork, and dissociate
    from the parent process group. Lurkftp should return immediately to
    the invoking process.
-F <filename>
    Read an option file (immediately). In option files, blank lines and
    anything on a line after a `#' are ignored. An implicit `-+' option
    (i.e. site/dir pair separator) is generated at the end of any line
    containing a site and/or directory name. Quotes (`'' and `"'), the
    `\' character, and the `~' character in option files are handled as
    per csh(1). Environment variables ($<name> or ${<name>}) are also
    expanded when not escaped by single quotes or backslash.
-P  Process in parallel by calling fork(2) before processing each site.
-N  Indicate that subsequent operations depend on their predecessor.
    That is, forks will not separate these operations, and failure in
    one operation will terminate all subsequent dependent operations.
    There may be multiple dependency groups.
-z <prog>
    Program to filter all ls files through when writing (default: gzip).
    Setting this to an empty string disables output filtering.
-Z <prog>
    Program to filter ls files or remote listings through if the first
    character of the file in question is nonprintable as per ANSI
    isprint(3) (default: gunzip). Setting this to an empty string resets
    to the default.
-v <mask>
    Set debug mask to <mask>. Masks greater than 0 will produce some
    lurkftp trace messages.
--  Next argument is literal. Note that this differs from getopt(3) in
    that it only literalizes the next option, not all remaining options.
-+  Separate multiple site/dir groups.
-h  Print help message and exit.
Reporting Options
-q  Suppress change report.
-R <command>
    If a report is generated, then pipe that report to the given
    command. Otherwise don't invoke the command.
-r <type><string>
    This option sets various report-related strings.

    Type  Function
    t     Sets the report's title string. Outside %-substitution is
          performed on this string. The default is `--%s %d ---'.
    d     Sets the report's directory-entry line. Inside %-substitution
          is performed on this string. The default is
          `%T %d %12b %r%f%[l? -> %L]' if mirroring is turned on, and is
          the same, but surrounded by the conditional `%[T<T>: ... ]'
          when mirroring is disabled so that moves are not reported.
    f     Sets the report's footer string. Outside %-substitution is
          performed on this string. The default is `%t'.
    s     Sets the report's sort string. The sort string is at most 8
          comparison specifiers, and sorting is ordered by performing
          each comparison in the order of the string until a mismatch is
          found. The default is `fdpnlst'. The following comparison
          specifiers can be used, as well as the reverse-order version
          (which is the same letter, but capitalized):

          f  Sort numerically by file type.
          m  Sort numerically by mode (other than file type).
          p  Sort alphabetically by file's path.
          n  Sort alphabetically by file's name.
          l  Sort alphabetically by link name, if present.
          d  Sort by date (ymd) if entry is a file.
          t  Sort by time (hm) if entry is a file.
          s  Sort by size (in bytes) if entry is a file.
    T     Sets the error report's title string. Outside %-substitution
          is performed on this string. The default is
          `0** ERRORS IN %S %P -> %s %p MIRROR ***'.
    D     Sets the error report's directory-entry line. Inside
          %-substitution is performed on this string.
    F     Sets the error report's footer string. Outside %-substitution
          is performed on this string.
    S     Sets the error report's sort string. The sort string is in the
          same format as that used by the standard report. The default
          is `PNLFDST'.
    e     Sets the format for general error messages. Outside
          %-substitution is performed on this string.
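
The sort string used by the s and S report types can be pictured as a
chain of comparisons. The Python sketch below is a simplified model, not
lurkftp's implementation: the entry field names are hypothetical, and the
"if entry is a file" restriction on d, t, and s is ignored.

```python
from functools import cmp_to_key

# Hypothetical mapping from comparison specifier to a field of an entry dict.
FIELDS = {
    "f": "type", "m": "mode", "p": "path", "n": "name",
    "l": "link", "d": "date", "t": "time", "s": "size",
}


def sort_key(sort_string):
    """Build a sort key that applies each specifier until a mismatch is found."""
    def compare(a, b):
        for spec in sort_string:
            field = FIELDS[spec.lower()]
            if a[field] != b[field]:
                result = -1 if a[field] < b[field] else 1
                # A capitalized specifier reverses the order of that comparison.
                return -result if spec.isupper() else result
        return 0
    return cmp_to_key(compare)


def entry(path, name, date):
    """Make a sample entry; the other fields are fixed for brevity."""
    return {"type": 0, "mode": 0o644, "path": path, "name": name,
            "link": "", "date": date, "time": "1200", "size": 10}


entries = [entry("/pub/b/", "x.tar", "19970522"),
           entry("/pub/a/", "y.tar", "19970522"),
           entry("/pub/a/", "z.tar", "19970101")]

print([e["name"] for e in sorted(entries, key=sort_key("fdpnlst"))])
print([e["name"] for e in sorted(entries, key=sort_key("Dn"))])
```

With the default string `fdpnlst', the entries come out oldest first and
then by path; with `Dn' they come out newest first and then by name.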
Site/Directory Specification Options
-o <dirspec>
    Set generic source directory.
-O <dirspec>
    Set generic destination directory.
-p <password>
    Set default FTP login password (default: <myusername>@)
-b <base>
    Change default name (formerly just base name) for placeholder files
    (default: .chkls.%s%d.gz). Outside %-substitution is performed on
    the name.
-L <rname>
    Use file <rname> on remote site instead of performing remote
    directory listing. Note: this option overrides the -f option below.
    This option only affects the next site/dir pair.
-U  Detect unchanged (i.e. moved) files. If two regular files have the
    same date, size, and name, but are located in different directories,
    then they are processed as moved. When no mirror directory or pipe
    is defined, moved files are not reported; otherwise they are
    reported with `<' and `>' for the original and new location,
    respectively, and the file is not retrieved from the remote site,
    but either ignored (if pipes are enabled) or moved as if by mv(1) if
    mirroring to a directory is enabled. An error in moving will be
    reported by the characters `(' and `)'.
-M  Force "manual" recursion when retrieving remote listing by using
    LIST -la or LIST (depending on which works) on each directory and
    issuing a CWD command to enter subdirectories. This mode is invoked
    automatically if the default LIST -lRa command fails for any reason
    (usually because the -lRa options aren't supported by the remote FTP
    daemon). This is especially useful if specific directories are to be
    filtered out, as the recursion routine will match the name of the
    directory to be entered (with a trailing `/') against the exclude
    filter before recursing.
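
The move-detection rule of the -U option above (same date, size, and
name, different directory) can be sketched as follows. This is a model
of the rule as documented, not lurkftp's code; the tuple layout
(dir, name, date, size) and the function name find_moves are
hypothetical.

```python
def find_moves(old, new):
    """Pair removed and added files that share name, date, and size."""
    removed = [e for e in old if e not in new]
    added = [e for e in new if e not in old]

    # Index the added files by (name, date, size).
    by_key = {}
    for directory, name, date, size in added:
        by_key.setdefault((name, date, size), []).append(directory)

    moves = []
    for directory, name, date, size in removed:
        candidates = by_key.get((name, date, size), [])
        if candidates:
            # Same name, date, and size in another directory: treat as a move.
            moves.append((directory + name, candidates.pop(0) + name))
    return moves


old = [("/pub/a/", "foo.tgz", "19970522", 1024),
       ("/pub/a/", "bar.tgz", "19970101", 10)]
new = [("/pub/b/", "foo.tgz", "19970522", 1024),
       ("/pub/a/", "bar.tgz", "19970101", 10)]

print(find_moves(old, new))
```

Each pair in the result gives the original and new location, which
correspond to the `<' and `>' report prefixes.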
Mirroring Options
-m  Perform mirroring when applicable; requires -d and/or -e and/or -t
    options. If this option is turned off, reports are still made, so
    this option can be used to test what the results of a mirroring
    operation would be. Beware: List files are also updated, however, so
    some pseudo-directory tricks to mirror-pipe specific files will
    pretend complete success (e.g. the sunsite .lsm trick used in the
    example can't be harmlessly tested before running). Any failure to
    download a file (and, in the case of the -e option, complete the
    pipe successfully) will be reported by an entry preceded by the `*'
    character, and any failure to remove a file will be reported by an
    entry preceded by the `#' character.
-d <ldir>
    Set local directory to <ldir> and read it instead of a placeholder
    file. This option only applies to the next site/dir pair.
-e <cmd>
    Don't update the local directory when mirroring; instead pipe each
    new file into <cmd>. This option only applies to the next site/dir
    pair. It would probably also be useful to use the -l and -f options
    with this. The local directory (-d) is only needed if the %l
    %-escape is used. Inside %-substitution is performed on <cmd>.
-l <file>
    Read and update placeholder <file> instead of using contents of
    local directory. This option only affects the next site/dir pair.
    The same %-substitutions are done as for the -b option.
-f <file>
    Read placeholder <file> instead of retrieving remote directory. This
    option only affects the next site/dir pair. The same %-substitutions
    are done as for the -b option. This option overrides the -L option
    above.
-E  Make "exact" comparison: fix modes to match remote site. The report
    shows changes which merely change modes by preceding them with a
    `M'. Failure to perform the change will be reported by preceding the
    entry with `$'.
-n  Make no file transfers, moves, or deletions; just update date stamps
    [and modes if the -E option is active].
-A  Attempt to append to files which increase in size instead of
    downloading the entire file. This is useful in cases where a
    directory of log files which always increase in size is to be
    mirrored.
-t <site>
    Mirror source files to remote directory.
-c  Force source files to be from local directory.
-g <pipe>
    Get source files by executing <pipe>. Inside %-substitution is done
    on <pipe>.
Filtering Options
Note: Only one include filter and/or one exclude filter can be
specified. The include filter is run first, and then the exclude filter.
Passing the null string to the -i or -x options removes the associated
filter.

-i <regex>
    Include only files that match the extended regular expression
    <regex>.
-I <file>
    Include only files that match the extended regular expression
    contained in <file>. Newlines in <file> are converted to `|'.
-x <regex>
    Exclude any files that match the extended regular expression
    <regex>.
-X <file>
    Exclude any files that match the extended regular expression
    contained in <file>. Newlines in <file> are converted to `|'.
-D  Filter out directories. Note that in order to handle automatic
    directory processing properly, mirrors that use -f to read
    placeholder files that were generated with this option should also
    have this option in effect.
-s  Don't filter out specials (device nodes, pipes, and sockets).
    Normally they are filtered out. Note that when mirroring, device
    nodes and pipes are created, but sockets aren't.
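
The include-then-exclude order and the newline-to-`|' conversion of
filter files can be modelled as below. This is a sketch under stated
assumptions: Python's re module stands in for the extended regular
expressions lurkftp uses (see regexec(3)), dropping empty lines of the
filter file is an assumption of this sketch, and the function names are
hypothetical.

```python
import re


def load_filter(text):
    """Turn filter-file contents into one regex by joining lines with '|'."""
    # Assumption: empty lines (such as after a trailing newline) are dropped
    # so that no empty alternative is created.
    return "|".join(line for line in text.splitlines() if line)


def apply_filters(paths, include=None, exclude=None):
    """Keep paths that match the include filter and not the exclude filter."""
    inc = re.compile(include) if include else None
    exc = re.compile(exclude) if exclude else None
    kept = []
    for path in paths:
        if inc and not inc.search(path):
            continue  # the include filter is run first
        if exc and exc.search(path):
            continue  # then the exclude filter
        kept.append(path)
    return kept


paths = ["/pub/Linux/apps/foo.lsm",
         "/pub/Linux/Incoming/bar.lsm",
         "/pub/Linux/apps/foo.tar.gz"]

print(load_filter("INDEX.whole\nINDEX.short\nls-lR\n"))
print(apply_filters(paths, include=r".*.lsm$", exclude=r"/Incoming/"))
```

The second call mirrors the `-i '.*.lsm$' -x /Incoming/' combination used
in the EXAMPLES section.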
Timeout Options
Note that all timeout options use the same base option, -T. All timeout
options can be specified with the same parameter string by concatenating
desired timeouts. Also, any timeout set to zero is disabled completely.

-T c<seconds>
    Initial connection and login timeout (default: 20)
-T t<seconds per K>
    Timeout for file and directory transfers (default: 10)
-T o<seconds>
    Timeout for simple commands (cd, pwd, etc.)
-T q<seconds>
    Timeout for quit command and logout (default: 5)
-T r<count>
    Number of times to retry list and/or file retrievals before giving
    up (default: 10)
-T d<seconds>
    Amount of time to wait between retries (default: 10)
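
A concatenated -T parameter string can be read as a run of letter-number
pairs. The Python sketch below assumes that "concatenating" means
writing the pairs back to back (for example `c30t20r5'); that reading and
the function name parse_timeouts are assumptions, not taken from the
lurkftp source.

```python
import re


def parse_timeouts(spec):
    """Split a concatenated -T argument such as 'c30t20r5' into a dict."""
    # Reject anything that is not a run of <letter><number> pairs.
    if not re.fullmatch(r"(?:[ctoqrd]\d+)+", spec):
        raise ValueError("bad timeout string: %r" % spec)
    settings = {}
    for letter, value in re.findall(r"([ctoqrd])(\d+)", spec):
        settings[letter] = int(value)  # per the text above, 0 disables a timeout
    return settings


print(parse_timeouts("c30t20r5"))
```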
EXAMPLES
Command lines
# Look for new versions of X for Linux & mail report to me
lurkftp -i Linux ftp.xfree86.org /pub/XFree86 -F .mailme
# .mailme is a file containing: -R 'mail -s "lurkftp output" dark'

# Mirror a single directory with reschedule;
# at will mail me the report.
atcron "2:00 tomorrow" lurkftp -m -d /net/ftp/rplay ftp.sdsu.edu /pub/rplay

# Mirror slackware disk set via sz into /usr/local/sw
# Not recommended if no auto-download in local comm program
lurkftp -d /usr/local/sw -l .sw.gz -e "ONAME=%l%f sz -" ftp.cdrom.com /pub/linux/slackware/slakware -F .mailme

# Do main lurking; see config files below
lurkftp -F .chksites

Contents of .chksites
# An extract from my command file
# no multiple entries from same site, so simplify name
-b .chkls.%s.gz
-R 'mail -s "LurkFTP Output" dark' # mail reports to me
-D # Don't care about changes in directories
-U # ignore moves
-P # fork away!
-X .chkfilt.sunsite # special filter for sunsite
sunsite.unc.edu /pub/Linux # fetch master list
-N # .lsm stuff depends on sunsite
# mail new .lsm's to me
-i '.*.lsm$' -x /Incoming/ # include lsm's not in Incoming dir
-f .chkls.%s.gz # Read remote site from previously generated listing
# Note: the following file was primed so that old .lsms wouldn't
# be sent. This was done by *not* using -m. It could've also
# been primed by using the command:
# zgrep .lsm .chkls.sunsite.gz | gzip >.chkls.lsms.gz
-l .chkls.lsms.gz # Keep track of sent .lsm's in this file
-m -e 'mail -s "lurkftp: %f" dark' # mirror through pipe
sunsite.unc.edu /pub/Linux # same site/dir as above
-i "" # reset include filter
+N # No more dependencies
-X .chkfilt # filter for everyone else
tsx-11.mit.edu /pub/linux/680x0 /pub/linux/packages/GCC
ftp.kernel.org /pub/linux/kernel
# etc.

Contents of .chkfilt
INDEX.whole
INDEX.short
ls-lR
/INDEX(|.html|.old)$
00-find-ls(|.gz)$

Contents of .chkfilt.sunsite
/README$
/distributions/
/!INDEX
/archive-current/
linux-announce.archive
INDEX.whole
INDEX.short
ls-lR
/INDEX(|.html|.old)$
00-find-ls(|.gz)$
SEE ALSO
regexec(3), gzip(1), ftp(1), mail(1), at(1), mirror(1L).
BUGS
[+: may want to fix; *: definitely want to fix; -: may never fix]
* Doesn't handle non-UNIX remote sites [I know of none anymore]
+ Some fixed-sized buffers may overflow
- Groups & user names aren't mirrored
- Sockets aren't mirrored
- Exact time isn't used for comparison (only accurate to what ls gives)
- All options in external program option group are obsolete
+ Few options are really range-checked
* Probably plenty of nasty hidden bugs
DIAGNOSTICS
Failed transfers are marked in the report. Specific errors are printed
to stderr. Debugging messages and some error messages are only printed
when the debug level (as set by the -v option) is greater than 0.
AUTHOR
Thomas J. Moore, dark@mama.indstate.edu
May 22, 1997