ubh(1)
NAME
ubh - Download and decode Usenet binaries.
SYNOPSIS
ubh [switches]
DESCRIPTION
ubh is the Usenet Binary Harvester, a Perl program which automatically
discovers, downloads, and decodes single and multi-part binary Usenet
postings.
ubh decodes single and multi-part uuencoded binaries.
ubh also decodes single part MIME base64-encoded image, audio, and
video attachments, and application/octet-stream attachments. It also
combines and decodes multi-part message/partial binaries.
You can specify search filters to select articles to download via Perl
regular expression syntax.
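As a rough illustration of how an inclusion filter and an exclusion filter could select subjects, the following Python sketch keeps a subject only if it matches the inclusion pattern (when one is given) and does not match the exclusion pattern (when one is given). The name select_subjects is hypothetical, Python's re module stands in for Perl's regular expression engine, and combining both filters this way is an assumption rather than a description of ubh's own code.

```python
import re

def select_subjects(subjects, include=None, exclude=None):
    """Keep subjects matching `include` (if given) and not matching `exclude` (if given).

    Hypothetical helper: mirrors the idea of ubh's -I and -X filters,
    not its actual implementation.
    """
    inc = re.compile(include) if include else None
    exc = re.compile(exclude) if exclude else None
    kept = []
    for subject in subjects:
        if inc and not inc.search(subject):
            continue  # fails the inclusion filter
        if exc and exc.search(subject):
            continue  # caught by the exclusion filter
        kept.append(subject)
    return kept
```

For example, select_subjects(subjects, include=r"\.jpg", exclude=r"spam") would keep only subjects that mention a .jpg file and do not contain "spam".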
ubh uses a standard .newsrc file to control which groups to process,
and uses the .newsrc to keep track of articles already processed. ubh
caches downloaded subject headers in the .ubhcache file.
ubh automatically eliminates crossposted articles by marking them as
read in your .newsrc.
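For readers unfamiliar with the .newsrc format, the sketch below parses one line of the conventional layout: a group name, then ":" (subscribed) or "!" (unsubscribed), then a comma-separated list of read article numbers and ranges. The function names parse_newsrc_line and is_read are hypothetical, and this is only an illustration of the file format, not ubh's own parser.

```python
def parse_newsrc_line(line):
    """Return (group, subscribed, ranges) for one .newsrc line.

    `ranges` is a list of inclusive (low, high) tuples of read article numbers.
    """
    line = line.strip()
    for i, ch in enumerate(line):
        if ch in ":!":
            break
    else:
        raise ValueError("no ':' or '!' separator found")
    group = line[:i]
    subscribed = line[i] == ":"
    ranges = []
    for chunk in line[i + 1:].split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        low, _, high = chunk.partition("-")
        # A single number such as "105" is treated as the range 105-105.
        ranges.append((int(low), int(high or low)))
    return group, subscribed, ranges

def is_read(ranges, article_number):
    """True if article_number falls inside any read range."""
    return any(low <= article_number <= high for low, high in ranges)
```

Marking a crossposted article as read then amounts to adding its article number to the ranges of each group it appears in.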
ubh can connect to multiple servers, combining the available single and
multi-part subjects from those servers, and can assemble multi-part
binaries from parts which are spread across those servers.
By default ubh will only consider articles with a well-formed Subject
header. A well-formed Subject header is one that contains a file name
with an extension which matches the extension filter. Multi-part
Subject headers will be recognized if they contain a part/total
designator of the form [m/n] or (m/n), where m is the part number and
n is the total number of parts. Part numbers begin with 1 and may or
may not contain leading zeroes. A part number of 0 is ignored.
ubh provides an interactive article preselection option to allow you to
preview the Subject: headers for multi-part binaries, and specify which
binaries you wish to download.
ubh runs equally well under Unix-based Perl, ActivePerl on Win32
platforms, and Mac OS X.
OPTIONS
These options apply to single-part article processing:
-S Process only single part articles.
-g This option ("g" for "greedy") will download and process each
unread article even if the subject does not contain a filename which
matches the single part extension filter.
These options apply to multi-part article processing:
-M Process only multiple part articles.
-i Interactive preselection of multipart articles.
These options apply to both single and multi-part processing:
-A Enables disk-based article assembly. This will download the
articles to disk (instead of RAM) prior to decoding.
-bfile Batch file processing. file should contain a list of article
subjects (from the .log files; see -s), one per line, to be
downloaded. Mutually exclusive with -I and -X.
-C Cleans up filenames. Without -C, ubh will perform certain mandatory
character substitutions. This option performs further cleanup of the
file names by replacing all non-alphanumeric characters in a filename
with the EVILCHAR. It also removes all leading non-alphanumeric
characters. This will eliminate spaces in filenames, as well as all
other undesirable (and possibly illegal) characters.
-d Diagnostic mode. Downloads and writes all unread articles in raw
form. This occurs prior to single and multi-part filtering. It is very
useful to look at the raw articles to see why they are failing to be
selected for downloading. Helpful for reverse-engineering new or
bizarre encoding formats.
You can also use this to perform your own post-processing directly on
the raw articles.
Articles are output using the article ID as the file name with the
extension .dump.
-D Dump mode. Downloads and writes all selected single and complete
multi-part articles to disk, instead of decoding them.
It is very useful to look at the raw articles to see why they are
failing to be decoded. Helpful for reverse-engineering new or bizarre
encoding formats.
You can also use this to perform your own post-processing directly on
the raw articles.
-cfile Use file as configuration file, instead of the default.
On Win32 platforms, the default is ubhrc. On Unix platforms, the
default is ${HOME}/.ubhrc.
-Ggroup
By default ubh will process every subscribed group in the .newsrc.
This option specifies the name of one group to be processed.
Note that group must exist in the .newsrc and must be subscribed.
-a Process all articles, but disregard the newsrc (i.e., consider all
articles even if they are marked as read in the newsrc, and do not
catch up the group at the end of processing of the group).
-fnum Process the first num articles. Updates newsrc.
-lnum Process the last num articles. Updates newsrc.
-s Log all subjects to subjects.log. Log multi-part subjects to
multiparts.log. Doesn't download anything. Disregards newsrc. If
FORCEDIR is in effect, the names of the log files will be prefixed
with the group name.
-Iregexp
Inclusion search filter (double quote on command line). regexp is any
valid Perl regular expression.
-Xregexp
Exclusion search filter (double quote on command line). regexp is any
valid Perl regular expression.
-L Long filenames. Uses the article subject as the filename. This
makes life easier because many folks encode their files with terribly
vague filenames.
-n Updates the .newsrc every time an article is processed, instead
of waiting until the entire group has been processed.
-Oopt Tells ubh what to do when it downloads a file and a file by that
name already exists (default no).
no tells ubh to create a unique filename (by prepending the article
number to the filename).
yes tells ubh to overwrite the existing file with the new file.
skip tells ubh to skip the incoming file and keep the existing file.
In the case of multipart uuencoded binaries, ubh will download the
first part to determine the file name; if a file with the same name
already exists, ubh will skip the rest of the parts for that binary.
-u Prints out a brief usage summary.
-w Prints out warranty information.
-y Performs chmod 0666 on all output files.
-Z Produces lots and lots of logs.
-z Marks articles that don't pass inclusion/exclusion as read.
This cleans up the .newsrc dramatically.
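The three -O behaviours described above (no, yes, skip) can be pictured with the following Python sketch. The name resolve_output_name and the exists parameter are hypothetical, and the "." placed between the article number and the filename is an assumption; this page says only that the article number is prepended.

```python
import os

def resolve_output_name(filename, article_number, policy="no", exists=os.path.exists):
    """Return the name to write to, or None if the incoming file should be skipped.

    Hypothetical helper illustrating the -O policies; not ubh's own code.
    """
    if policy not in ("no", "yes", "skip"):
        raise ValueError("policy must be 'no', 'yes', or 'skip'")
    if not exists(filename):
        return filename  # no collision, so the policy does not matter
    if policy == "yes":
        return filename  # overwrite the existing file
    if policy == "skip":
        return None  # keep the existing file, drop the incoming one
    # policy == "no": build a unique name by prepending the article number.
    # The "." separator is an assumption made for this sketch.
    return f"{article_number}.{filename}"
```

Passing a custom exists callable makes the collision check easy to exercise without touching the file system.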
SEE ALSO
More extensive (and complete) documentation can be found in /usr/share/doc/ubh/doc/ubh.html.
COPYRIGHT
Copyright © 2000, 2001 Gerard Lanois
This software and manpage are released under the GNU General Public
License.