WZIP(1)
NAME
wzip - lossy data compression and denoising
SYNOPSIS
wzip [ -c | -d | -dn | -hdn ] num sf
DESCRIPTION
This manual page documents the wzip command.
wzip is a program for LOSSY data compression and
denoising. It reads from STDIN and writes to STDOUT. In compression
mode the input is a sequence of ASCII floating-point values, and num is
the number of these values. The output is a sequence of small
integers, most of them zero in typical applications. This output is well
suited to further compression with a standard lossless compression program
such as gzip.
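The effect of a mostly-zero integer sequence on a lossless compressor can be illustrated with a short Python sketch. It does not call wzip. The sample data is a hypothetical stand-in for compression-mode output, and the one-integer-per-line layout is an assumption, not a documented format.

```python
import gzip
import random


def gzip_ratio(values):
    """Return (raw_bytes, compressed_bytes) for integers written one per line."""
    raw = "\n".join(str(v) for v in values).encode("ascii")
    return len(raw), len(gzip.compress(raw))


# Hypothetical stand-in for compression-mode output:
# 1000 small integers, roughly 5 percent of them nonzero.
rng = random.Random(0)
sample = [rng.randint(-3, 3) if rng.random() < 0.05 else 0 for _ in range(1000)]

raw_size, packed_size = gzip_ratio(sample)
print(raw_size, packed_size)
```

The more zeros the sequence contains, the smaller the compressed result tends to be.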
The program can also be used for denoising. In this case both input and
output are sequences of ASCII floating-point values.
The scale factor sf determines the strength of compression or denoising. A higher scale factor means heavier compression and stronger
denoising. Four times the standard deviation of the noise content is a
good starting value. If the noise level is not known, 5 percent of the overall signal amplitude
can be used as a first estimate of a suitable scale factor.
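The two starting values described above can be computed as in the following Python sketch. The function names are hypothetical. Taking "overall signal amplitude" to mean maximum minus minimum is an assumption, as is estimating the noise from a signal-free stretch of the data.

```python
import statistics


def sf_from_noise(noise_samples):
    """Four times the sample standard deviation of a noise-only stretch."""
    return 4.0 * statistics.stdev(noise_samples)


def sf_from_amplitude(values):
    """5 percent of the amplitude, taken here as max minus min (an assumption)."""
    return 0.05 * (max(values) - min(values))


if __name__ == "__main__":
    baseline = [0.1, -0.2, 0.05, -0.1, 0.15]   # hypothetical noise-only samples
    signal = [0.0, 10.0, 200.0, 35.0]          # hypothetical measurement
    print(sf_from_noise(baseline), sf_from_amplitude(signal))
```

Either result is only a first guess for sf and may need adjusting after inspecting the output.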
If the noise content of the input data is strongly non-Gaussian, as with Poisson noise, the input data should first be transformed so that
the noise is approximately Gaussian. Poisson-distributed input values, for example raw counts per channel in EDX
or XPD, can be transformed to approximately Gaussian-distributed
noise by applying y:=2.0*sqrt(x+0.25109) to each data point.
The back transformation is y:=(x/2)^2. The summand 0.25109 compensates for the bias caused by the asymmetry of the Poisson distribution.
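The two formulas above translate directly into Python. The function names are hypothetical; the constants are the ones given in the text. The forward function would be applied to each count before the data is passed to wzip, and the back function to each value of the output.

```python
import math

BIAS = 0.25109  # summand given in the text above


def to_gaussian(count):
    """Forward transformation: y := 2.0*sqrt(x+0.25109)."""
    return 2.0 * math.sqrt(count + BIAS)


def from_gaussian(value):
    """Back transformation: y := (x/2)^2."""
    return (value / 2.0) ** 2


if __name__ == "__main__":
    counts = [0, 3, 12, 150]  # hypothetical raw counts per channel
    print([to_gaussian(c) for c in counts])
```

Composing the two functions directly, with no processing in between, returns count + 0.25109, because the back transformation as given does not subtract the summand.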
Invoking the program without any options writes examples of the use of
the program to STDERR.
OPTIONS
Exactly one option must be given.
- -c Compression. Reads num ASCII floating-point values from STDIN and writes a highly redundant sequence of integers to STDOUT.
- -d Decompression. Reads from STDIN and writes a sequence of num ASCII floating-point values to STDOUT. These values approximate the original data but are not identical to it.
- -dn Denoising. Reads num ASCII floating-point values from STDIN and writes a sequence of num ASCII floating-point values to STDOUT. These values approximate the original data with the noise reduced.
- -hdn Denoising with hard thresholding instead of wavelet shrinkage. Isolated noise peaks may remain visible in this mode, but the slopes of the signal are affected much less.
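The difference between wavelet shrinkage and hard thresholding can be illustrated with the generic textbook rules for a single wavelet coefficient. This Python sketch is not taken from the wzip source. The function names and the threshold t are hypothetical, and how sf maps to a threshold is not stated in this page.

```python
def soft_threshold(coeff, t):
    """Shrinkage: zero small coefficients, pull the rest toward zero by t."""
    magnitude = abs(coeff) - t
    if magnitude <= 0.0:
        return 0.0
    return magnitude if coeff > 0 else -magnitude


def hard_threshold(coeff, t):
    """Hard thresholding: zero small coefficients, keep the rest unchanged."""
    return coeff if abs(coeff) > t else 0.0


if __name__ == "__main__":
    for c in (-5.0, -1.0, 0.5, 2.5, 8.0):  # hypothetical coefficients
        print(c, soft_threshold(c, 2.0), hard_threshold(c, 2.0))
```

Under hard thresholding a coefficient above the threshold passes through at full size, which is consistent with isolated noise peaks remaining visible. Under shrinkage every surviving coefficient is reduced, which is consistent with the stronger effect on steep parts of the signal.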
SEE ALSO
Donoho, D.L.; Johnstone, I.M.: Adapting to unknown smoothness via wavelet shrinkage, technical report 425, Department of Statistics, Stanford
University, Stanford, June 1993, ftp://playfair.stanford.edu/pub/donoho/ausws.ps.Z
Franzen, A.: Compression of process data with a wavelet method, steel
res. 69 (1998), No. 1, pp. 28/30
Franzen, A.: Non-linear denoising with wavelet transformation, Z. Metallkd. 89 (1998), No. 4, pp. 297/302
AUTHOR
This manual page was written by Andreas Franzen <anfra@debian.org>, for
the Debian GNU/Linux system (but may be used by others).
Copyright (C) 1997 Andreas Franzen, placed under the GNU General Public
License, see the file copyright for details.