clustalw(1)
NAME
clustalw - Multiple alignment of nucleic acid and protein sequences
SYNOPSIS
clustalw [-infile] file.ext [OPTIONS]
clustalw [-help | -fullhelp]
DESCRIPTION
Clustal W is a general purpose multiple alignment program for DNA or
proteins.
The program performs simultaneous alignment of many nucleotide or amino
acid sequences. It is typically run interactively, providing a menu and
online help. If you prefer to use it in command-line (batch) mode,
you must supply options on the command line, the minimum being -infile.
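For scripted use, a batch-mode command line can be assembled before it is handed to the shell. The following is a minimal sketch in Python, not part of Clustal W itself: the helper name clustalw_command and the file names seqs.fasta and seqs.aln are hypothetical, and only options listed under OPTIONS are used. It builds the argument list and prints it; nothing is executed.

```python
import shlex


def clustalw_command(infile, *switches, **params):
    """Build an argv list for a batch-mode clustalw call (hypothetical helper)."""
    # -infile is the minimum required option in batch mode.
    argv = ["clustalw", f"-infile={infile}"]
    # Verbs and switches such as -align or -quiet take no value.
    argv += [f"-{name}" for name in switches]
    # Parameters use the -name=value form shown under OPTIONS.
    argv += [f"-{name}={value}" for name, value in params.items()]
    return argv


# Assumed file names; substitute your own input and output files.
cmd = clustalw_command("seqs.fasta", "align", "quiet",
                       outfile="seqs.aln", output="PHYLIP")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd) on a system where clustalw is installed.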
OPTIONS
- DATA (sequences)
- -infile=file.ext
Input sequences.
- -profile1=file.ext and -profile2=file.ext
Profiles (old alignment)
- VERBS (do things)
- -options
List the command line parameters.
- -help or -check
Outline the command line params.
- -fullhelp
Output full help content.
- -align
Do full multiple alignment.
- -tree
Calculate NJ tree.
- -bootstrap=n
Bootstrap an NJ tree (n = number of bootstraps; default = 1000).
- -convert
Output the input sequences in a different file format.
- PARAMETERS (set things)
- General settings:
- -interactive
Read command line, then enter normal interactive menus.
- -quicktree
Use FAST algorithm for the alignment guide tree.
- -type=
PROTEIN or DNA sequences.
- -negative
Protein alignment with negative values in matrix.
- -outfile=
Sequence alignment file name.
- -output=
GCG, GDE, PHYLIP, PIR or NEXUS.
- -outputorder=
INPUT or ALIGNED.
- -case
LOWER or UPPER (for GDE output only).
- -seqnos=
OFF or ON (for Clustal output only).
- -seqnos_range=
OFF or ON (NEW: for all output formats).
- -range=m,n
Sequence range to write starting m to m+n.
- -maxseqlen=n
Maximum allowed input sequence length.
- -quiet
Reduce console output to minimum.
- -stats=file
Log some alignments statistics to file.
- Fast Pairwise Alignments:
- -ktuple=n
Word size.
- -topdiags=n
Number of best diagonals.
- -window=n
Window around best diagonals.
- -pairgap=n
Gap penalty.
- -score
PERCENT or ABSOLUTE.
- Slow Pairwise Alignments:
- -pwmatrix=
Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename.
- -pwdnamatrix=
DNA weight matrix=IUB, CLUSTALW or filename.
- -pwgapopen=f
Gap opening penalty.
- -pwgapext=f
Gap extension penalty.
- Multiple Alignments:
- -newtree=
File for new guide tree.
- -usetree=
File for old guide tree.
- -matrix=
Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename.
- -dnamatrix=
DNA weight matrix=IUB, CLUSTALW or filename.
- -gapopen=f
Gap opening penalty.
- -gapext=f
Gap extension penalty.
- -endgaps
No end gap separation penalty.
- -gapdist=n
Gap separation penalty range.
- -nogap
Residue-specific gaps off.
- -nohgap
Hydrophilic gaps off.
- -hgapresidues=
List hydrophilic residues.
- -maxdiv=n
Percent identity for delay.
- -type=
PROTEIN or DNA
- -transweight=f
Transitions weighting.
- -iteration=
NONE or TREE or ALIGNMENT.
- -numiter=n
Maximum number of iterations to perform.
- Profile Alignments:
- -profile
Merge two alignments by profile alignment.
- -newtree1=
File for new guide tree for profile1.
- -newtree2=
File for new guide tree for profile2.
- -usetree1=
File for old guide tree for profile1.
- -usetree2=
File for old guide tree for profile2.
- Sequence to Profile Alignments:
- -sequences
Sequentially add profile2 sequences to profile1 alignment.
- -newtree=
File for new guide tree.
- -usetree=
File for old guide tree.
- Structure Alignments:
- -nosecstr1
Do not use secondary structure-gap penalty mask for profile 1.
- -nosecstr2
Do not use secondary structure-gap penalty mask for profile 2.
- -secstrout=STRUCTURE or MASK or BOTH or NONE
Output in alignment file.
- -helixgap=n
Gap penalty for helix core residues.
- -strandgap=n
Gap penalty for strand core residues.
- -loopgap=n
Gap penalty for loop regions.
- -terminalgap=n
Gap penalty for structure termini.
- -helixendin=n
Number of residues inside helix to be treated as terminal.
- -helixendout=n
Number of residues outside helix to be treated as terminal.
- -strandendin=n
Number of residues inside strand to be treated as terminal.
- -strandendout=n
Number of residues outside strand to be treated as terminal.
- Trees:
- -outputtree=nj OR phylip OR dist OR nexus
- -seed=n
Seed number for bootstraps.
- -kimura
Use Kimura's correction.
- -tossgaps
Ignore positions with gaps.
- -bootlabels=node
Position of bootstrap values in tree display.
- -clustering=
NJ or UPGMA.
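The tree options above can be combined in a single batch call. The following is a hedged sketch in Python, not an official interface: the helper name tree_command and the input file name seqs.aln are hypothetical, the accepted -outputtree values are copied from the entry above, and whether a given combination is accepted may depend on the installed clustalw version. It only builds the argument list.

```python
# Values copied from the -outputtree entry in the Trees group.
OUTPUTTREE_FORMATS = ("nj", "phylip", "dist", "nexus")


def tree_command(infile, outputtree="phylip", bootstrap=None, seed=None,
                 kimura=False, tossgaps=False):
    """Build an argv list for a clustalw tree run (hypothetical helper)."""
    if outputtree not in OUTPUTTREE_FORMATS:
        raise ValueError(f"unsupported -outputtree value: {outputtree!r}")
    argv = ["clustalw", f"-infile={infile}"]
    if bootstrap is None:
        # -tree calculates an NJ tree.
        argv.append("-tree")
    else:
        # -bootstrap=n bootstraps an NJ tree; -seed=n seeds the bootstraps.
        argv.append(f"-bootstrap={bootstrap}")
        if seed is not None:
            argv.append(f"-seed={seed}")
    argv.append(f"-outputtree={outputtree}")
    if kimura:
        argv.append("-kimura")    # use Kimura's correction
    if tossgaps:
        argv.append("-tossgaps")  # ignore positions with gaps
    return argv


# Assumed input file name; substitute your own alignment.
print(" ".join(tree_command("seqs.aln", bootstrap=1000, seed=42, kimura=True)))
```

Validating -outputtree before the call catches a mistyped format name in the script rather than in the clustalw run.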
BUGS
The Clustal bug tracking system can be found at
http://bioinf.ucd.ie/bugzilla/buglist.cgi?quicksearch=clustal.
SEE ALSO
REFERENCES
- Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. (2007). Clustal W and Clustal X version 2.0.[1] Bioinformatics, 23, 2947-2948.
- Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD. (2003). Multiple sequence alignment with the Clustal series of programs.[2] Nucleic Acids Res., 31, 3497-3500.
- Jeanmougin F, Thompson JD, Gouy M, Higgins DG, Gibson TJ. (1998). Multiple sequence alignment with Clustal X.[3] Trends Biochem Sci., 23, 403-405.
- Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. (1997). The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools.[4] Nucleic Acids Res., 25, 4876-4882.
- Higgins DG, Thompson JD, Gibson TJ. (1996). Using CLUSTAL for multiple sequence alignments.[5] Methods Enzymol., 266, 383-402.
- Thompson JD, Higgins DG, Gibson TJ. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.[6] Nucleic Acids Res., 22, 4673-4680.
- Higgins DG. (1994). CLUSTAL V: multiple alignment of DNA and protein sequences.[7] Methods Mol Biol., 25, 307-318.
- Higgins DG, Bleasby AJ, Fuchs R. (1992). CLUSTAL V: improved software for multiple sequence alignment.[8] Comput. Appl. Biosci., 8, 189-191.
- Higgins DG, Sharp PM. (1989). Fast and sensitive multiple sequence alignments on a microcomputer.[9] Comput. Appl. Biosci., 5, 151-153.
- Higgins DG, Sharp PM. (1988). CLUSTAL: a package for performing multiple sequence alignment on a microcomputer.[10] Gene, 73, 237-244.
AUTHORS
- Des Higgins
- Copyright holder for Clustal.
- Julie Thompson
- Copyright holder for Clustal.
- Toby Gibson
- Copyright holder for Clustal.
- Charles Plessy <plessy@debian.org>
- Prepared this manpage in DocBook XML for the Debian distribution.
COPYRIGHT
Copyright (C) 1988-2008 Des Higgins, Julie Thompson & Toby Gibson
(Clustal)
Copyright (C) 2008 Charles Plessy (This manpage)
The binaries and source code are made available and can be distributed
subject to the following conditions:
- Users are free to redistribute Clustal W or Clustal X in its unmodified form as long as it is not for commercial gain.
- Anyone wishing to redistribute Clustal commercially should contact Toby Gibson at <gibson@embl.de>.
- If users make changes or have ideas that they believe would be useful to the broader research community, they can send their suggestions to the Clustal development team at <clustalw@ucd.ie>, where they will be considered for inclusion in future releases.
This manual page and its XML source can be used, modified, and
redistributed as if they were in the public domain.
NOTES
- 1. Clustal W and Clustal X version 2.0.
http://www.ncbi.nlm.nih.gov/pubmed/17846036
- 2. Multiple sequence alignment with the Clustal series of programs.
http://www.ncbi.nlm.nih.gov/pubmed/12824352
- 3. Multiple sequence alignment with Clustal X.
http://www.ncbi.nlm.nih.gov/pubmed/9810230
- 4. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools.
http://www.ncbi.nlm.nih.gov/pubmed/9396791
- 5. Using CLUSTAL for multiple sequence alignments.
http://www.ncbi.nlm.nih.gov/pubmed/8743695
- 6. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.
http://www.ncbi.nlm.nih.gov/pubmed/7984417
- 7. CLUSTAL V: multiple alignment of DNA and protein sequences.
http://www.ncbi.nlm.nih.gov/pubmed/8004173
- 8. CLUSTAL V: improved software for multiple sequence alignment.
http://www.ncbi.nlm.nih.gov/pubmed/1591615
- 9. Fast and sensitive multiple sequence alignments on a microcomputer.
http://www.ncbi.nlm.nih.gov/pubmed/2720464
- 10. CLUSTAL: a package for performing multiple sequence alignment on a microcomputer.
http://www.ncbi.nlm.nih.gov/pubmed/3243435