tigr-glimmer(1)
NAME
- tigr-glimmer -- Find/Score potential genes in genome-file
- using the probability model in icm-file
SYNOPSIS
tigr-glimmer2 [genome-file] [icm-file] [options]
DESCRIPTION
- tigr-glimmer is a system for finding genes in microbial
- DNA, especially the genomes of bacteria and archaea. tigr-glimmer
- (Gene Locator and Interpolated Markov Modeler) uses interpolated
- Markov models (IMMs) to identify the coding regions and
- distinguish them from noncoding DNA. The IMM approach, described
- in our Nucleic Acids Research paper on tigr-glimmer 1.0 and in our
- subsequent paper on tigr-glimmer 2.0, uses a combination of Markov
- models from 1st through 8th order, weighting each model according
- to its predictive power. tigr-glimmer 1.0 and 2.0 use 3-periodic
- nonhomogeneous Markov models in their IMMs.
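The idea of blending Markov models of several orders can be sketched in a few lines of Python. This is a toy illustration only, not the tigr-glimmer algorithm: the class name ToyInterpolatedModel, the add-one smoothing, and the choice to weight each order by how often its context was seen in training are all assumptions made for the example, and the sketch ignores the 3-periodic (codon-position) structure that the real IMMs use.

```python
from collections import defaultdict
import math


class ToyInterpolatedModel:
    """Blend fixed-order Markov models of orders 0..max_order (illustrative only)."""

    def __init__(self, max_order=2, alphabet="acgt"):
        self.max_order = max_order
        self.alphabet = alphabet
        # counts[k][context][base] = times `base` followed the k-base `context`
        self.counts = [defaultdict(lambda: defaultdict(int))
                       for _ in range(max_order + 1)]

    def train(self, sequences):
        for seq in sequences:
            seq = seq.lower()
            for i, base in enumerate(seq):
                for k in range(min(i, self.max_order) + 1):
                    self.counts[k][seq[i - k:i]][base] += 1

    def prob(self, context, base):
        """Weighted average of per-order estimates.

        Assumption: the weight of order k is the number of times its
        context occurred in training, so well-supported contexts dominate.
        """
        context = context.lower()
        weighted_sum = 0.0
        total_weight = 0.0
        for k in range(min(len(context), self.max_order) + 1):
            ctx = context[len(context) - k:]
            seen = self.counts[k].get(ctx, {})
            n = sum(seen.values())
            # Add-one smoothing keeps unseen bases at nonzero probability.
            p = (seen.get(base, 0) + 1) / (n + len(self.alphabet))
            weighted_sum += n * p
            total_weight += n
        if total_weight == 0:
            return 1.0 / len(self.alphabet)  # untrained model: uniform guess
        return weighted_sum / total_weight

    def log_score(self, seq):
        """Sum of log-probabilities of each base given its preceding bases."""
        seq = seq.lower()
        return sum(
            math.log(self.prob(seq[max(0, i - self.max_order):i], seq[i]))
            for i in range(len(seq))
        )


model = ToyInterpolatedModel(max_order=2)
model.train(["acgacgacgacgacg"])
print(model.prob("ac", "g"), model.prob("ac", "t"))
```

After training on a repetitive "acg" sequence, the blended estimate for "g" following "ac" is higher than for "t", because the longer contexts that support it were observed several times.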
- tigr-glimmer is the primary microbial gene finder at TIGR,
- and has been used to annotate the complete genomes of
- B. burgdorferi (Fraser et al., Nature, Dec. 1997), T. pallidum
- (Fraser et al., Science, July 1998), T. maritima, D. radiodurans,
- M. tuberculosis, and non-TIGR projects including C. trachomatis,
- C. pneumoniae, and others. Its analyses of some of these genomes
- and others are available at the TIGR microbial database site.
- A special version of tigr-glimmer designed for small
- eukaryotes, GlimmerM, was used to find the genes in chromosome 2
- of the malaria parasite, P. falciparum. GlimmerM is described in
- S.L. Salzberg, M. Pertea, A.L. Delcher, M.J. Gardner, and
- H. Tettelin, "Interpolated Markov models for eukaryotic gene
- finding," Genomics 59 (1999), 24-31. See the GlimmerM site
- (http://www.tigr.org/software/glimmerm/), which includes
- information on how to download the GlimmerM system.
- The tigr-glimmer system consists of two main programs. The
- first of these is the training program, build-imm. This program
- takes an input set of sequences and builds and outputs the IMM
- for them. These sequences can be complete genes or just partial
- orfs. For a new genome, this training data can consist of those
- genes with strong database hits as well as very long open reading
- frames that are statistically almost certain to be genes. The
- second program is glimmer, which uses this IMM to identify
- putative genes in an entire genome. tigr-glimmer automatically
- resolves conflicts between most overlapping genes by choosing one
- of them. It also identifies genes that are suspected to truly
- overlap, and flags these for closer inspection by the user. These
- ``suspect'' gene candidates have been a very small percentage of
- the total for all the genomes analyzed thus far.
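To make the idea of "very long open reading frames" as training data concrete, here is a minimal Python sketch that lists long ORFs on the forward strand of a sequence. It is not the code used by the tigr-glimmer tools: the function name long_orfs, the default min_len of 300, the use of atg as the only start codon, and the restriction to the forward strand of a linear sequence are all assumptions chosen to keep the example short.

```python
STOP_CODONS = {"taa", "tag", "tga"}


def long_orfs(seq, min_len=300, start_codons=("atg",)):
    """Return (start, end) half-open intervals of forward-strand ORFs.

    An ORF here runs from the first start codon in a frame up to and
    including the next in-frame stop codon. Only ORFs of at least
    `min_len` bases are reported.
    """
    seq = seq.lower()
    found = []
    for frame in range(3):
        start = None
        # Step codon by codon through this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in start_codons:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len:
                    found.append((start, i + 3))
                start = None
    return found


# A 36-base ORF (atg + 10 codons + taa) embedded at offset 2.
demo = "cc" + "atg" + "aaa" * 10 + "taa" + "gg"
print(long_orfs(demo, min_len=30))
```

Sequences extracted from such intervals could then serve as input to a training step; a real pipeline would also need to scan the reverse complement and, for circular genomes, handle wraparound.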
OPTIONS
- -C n Use n as GC percentage of independent model
- Note: n should be a percentage, e.g., -C 45.2
- -f Use ribosome-binding energy to choose start
- codon
- +f Use first codon in orf as start codon
- -g n Set minimum gene length to n
- -i filename
- Use filename to select regions of bases that
- are off limits, so that no bases within that area will be
- examined
- -l Assume linear rather than circular genome, i.e.,
- no wraparound
- -L filename
- Use filename to specify a list of orfs that
- should be scored separately, with no overlap rules
- -M Input is a multifasta file of separate genes to
- be scored separately, with no overlap rules
- -o n Set minimum overlap length to n. Overlaps
- shorter than this are ignored.
- -p n Set minimum overlap percentage to n%. Overlaps
- shorter than this percentage of *both* strings are ignored.
- -q n Set the maximum length orf that can be rejected
- because of the independent probability score column to (n - 1)
- -r Don't use independent probability score column
- +r Use independent probability score column
- -s s Use string s as the ribosome binding pattern to
- find start codons.
- +S Do use stricter independent intergenic model
- that doesn't give probabilities to in-frame stop codons. (Option
- is obsolete since this is now the only behaviour.)
- -t n Set threshold score for calling as gene to n.
- If the in-frame score >= n, then the region is given a number and
- considered a potential gene.
- -w n Use "weak" scores on tentative genes n or
- longer. Weak scores ignore the independent probability score.
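The two overlap thresholds (-o and -p) can be read as simple tests on a pair of gene coordinates. The Python sketch below shows one such reading and is not taken from the tigr-glimmer source: the function names are hypothetical, genes are modelled as half-open (start, end) intervals on a linear genome, and "shorter than this percentage of both strings" is interpreted as the overlap being below that percentage of each gene's length.

```python
def overlap_length(a, b):
    """Number of bases shared by two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def ignored_by_min_length(a, b, min_overlap):
    """Reading of -o n: an overlap shorter than n bases is ignored."""
    return overlap_length(a, b) < min_overlap


def ignored_by_min_percent(a, b, min_percent):
    """Reading of -p n: ignored if the overlap is below n% of BOTH genes."""
    ov = overlap_length(a, b)
    return all(ov < (min_percent / 100.0) * (end - start)
               for start, end in (a, b))


gene_a = (0, 1000)     # 1000 bases
gene_b = (950, 2000)   # 1050 bases, sharing 50 bases with gene_a
print(overlap_length(gene_a, gene_b))
print(ignored_by_min_length(gene_a, gene_b, 60))
print(ignored_by_min_percent(gene_a, gene_b, 10))
```

How the program combines the two thresholds when both are given is not stated above, so the sketch keeps them as separate predicates.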
SEE ALSO
- tigr-adjust (1), tigr-anomaly (1), tigr-build-icm (1),
- tigr-check (1), tigr-codon-usage (1), tigr-compare-lists (1),
- tigr-extract (1), tigr-generate (1), tigr-get-len (1),
- tigr-get-putative (1), tigr-glimmer2 (1), tigr-long-orfs (1)
- http://www.tigr.org/software/glimmer/
- Please see the readme in /usr/share/doc/glimmer for a
- description of how to use Glimmer.
AUTHOR
- This manual page was quickly copied from the glimmer web
- site by Steffen Moeller moeller@pzr.uni-rostock.de for the Debian
- system.
TIGR