Software Tools :: Multiple Sequence Alignments

Multiple sequence alignments can also be done globally or locally, but optimal alignment algorithms are not practical for more than three sequences because the amount of computation grows exponentially with the number of sequences. Therefore, multiple sequence alignment programs use heuristic algorithms that trade optimality for speed.
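To give a rough sense of why optimal alignment stops being practical, the sketch below counts the cells in the dynamic-programming lattice that a straightforward optimal algorithm would have to fill for k sequences. The function names and the example length of 200 residues are illustrative assumptions, not part of any program mentioned here.

```python
def lattice_cells(length, k):
    """Cells in the k-dimensional dynamic-programming lattice
    for k sequences that each have `length` residues."""
    return (length + 1) ** k


def predecessors_per_cell(k):
    """Each interior cell is computed from one predecessor for every
    non-empty subset of the k sequences that advances by one residue."""
    return 2 ** k - 1


if __name__ == "__main__":
    # Illustrative length only; real sequences vary widely.
    for k in (2, 3, 4, 6):
        print(k, lattice_cells(200, k), predecessors_per_cell(k))
```

Both the number of cells and the work per cell grow with every added sequence, which is why the heuristics described below are used instead.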

A method used by many global multiple alignment programs (Pileup, Clustal) is progressive alignment: scores are computed for each pair of sequences in the set, a guide tree is derived from the scores, and sequences are added to the alignment in the order indicated by the guide tree. Other programs (DIALIGN, ITERALIGN) use block-based methods. Blocks are highly conserved regions separated by nonconserved regions or gaps. Block-based programs are useful if the sequence set contains some highly divergent sequences, large gaps, or poorly conserved regions.
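The first two steps of progressive alignment (pairwise scores, then a guide tree) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used by Pileup or Clustal: the scoring values, the linear gap penalty, and the average-linkage clustering are all choices made here for brevity, and the final step of merging alignments along the tree is omitted.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global (Needleman-Wunsch) alignment score with a linear gap penalty.
    The scoring values are illustrative assumptions."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def guide_order(seqs):
    """Return the cluster merges in the order a guide tree would dictate,
    joining the most similar clusters first (average linkage on scores)."""
    clusters = {i: (i,) for i in range(len(seqs))}
    score = {
        frozenset(p): nw_score(seqs[p[0]], seqs[p[1]])
        for p in combinations(range(len(seqs)), 2)
    }

    def avg(x, y):
        # Mean pairwise score between the members of two clusters.
        pairs = [(i, j) for i in clusters[x] for j in clusters[y]]
        return sum(score[frozenset(p)] for p in pairs) / len(pairs)

    merges = []
    while len(clusters) > 1:
        x, y = max(combinations(sorted(clusters), 2), key=lambda p: avg(*p))
        merges.append((clusters[x], clusters[y]))
        clusters[x] = clusters[x] + clusters.pop(y)
    return merges


if __name__ == "__main__":
    print(guide_order(["ACGT", "ACGT", "TTTT"]))
```

Each entry in the returned list names the two groups of sequence indices that would be aligned to each other at that step.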

There are several approaches to creating a local multiple alignment. As with global multiple alignments, progressive alignment methods can be used, but based on scores from pairwise local alignments rather than pairwise global alignments. Word-based methods (PRALIGN) look for regions that share short matches that are either exact or "close" to exact. Template methods start with a set of template patterns to which all of the sequences are compared. Pairwise comparisons can also be used, although these methods (MACAW, Vingron and Argos) are slower than the previous methods. Lastly, there are statistical methods that use expectation maximization (MEME) or Gibbs sampling (GIBBS).
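As a small illustration of the word-based idea, the sketch below finds every exact word of length k that occurs in all of the sequences, along with its positions. It covers only the exact-match case and is not the algorithm used by PRALIGN; the function name and the default word length are assumptions made for the example.

```python
def shared_words(seqs, k=3):
    """Map each length-k word found in every sequence to its start
    positions, given as one list of positions per sequence."""
    index = []
    for s in seqs:
        positions = {}
        for i in range(len(s) - k + 1):
            positions.setdefault(s[i:i + k], []).append(i)
        index.append(positions)
    # Keep only the words that appear in all sequences.
    common = set(index[0]).intersection(*index[1:])
    return {w: [positions[w] for positions in index] for w in sorted(common)}


if __name__ == "__main__":
    print(shared_words(["TTACGTT", "GGACGAA", "ACGCC"]))
```

The shared words act as candidate anchors for a local alignment; a real word-based program would also need to tolerate near matches and to extend the anchors.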

Other types of alignments may be useful in phylogenetic work. An example is to align a set of DNA coding regions using a protein sequence as a guide (PROTAL2DNA). This makes it more likely that the DNA alignment makes biological sense; aligning the DNA without the protein context may result in gaps being inserted within codons. There is also a program that uses an iterative process to simultaneously create a multiple alignment and a phylogenetic tree (Jotun Hein's TreeAlign).
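The protein-guided approach can be sketched by threading codons onto an already aligned protein sequence, so that every protein gap becomes a three-base gap and no codon is split. This is a simplified illustration, not the implementation of PROTAL2DNA. It assumes the DNA is in frame, contains exactly one codon per residue (no stop codon or flanking sequence), and that "-" is the gap character.

```python
def thread_codons(aligned_protein, dna, gap="-"):
    """Lay the codons of `dna` onto the gapped protein sequence,
    writing three gap characters wherever the protein has a gap."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    residues = sum(1 for c in aligned_protein if c != gap)
    if len(dna) % 3 != 0 or len(codons) != residues:
        raise ValueError("DNA must contain exactly one codon per residue")
    remaining = iter(codons)
    return "".join(gap * 3 if c == gap else next(remaining)
                   for c in aligned_protein)


if __name__ == "__main__":
    print(thread_codons("M-K", "ATGAAA"))
```

Applying the function to each row of a protein alignment, with the matching coding sequence, gives a DNA alignment whose gaps fall only between codons.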

Hidden Markov models can be used to align very large sequence sets more rapidly than other methods can. A small "seed" alignment of representative sequences from the set is created using any other multiple alignment method. A hidden Markov model profile representative of the seed alignment is made (HmmerBuild), and this HMM profile is used as a guide to align the remaining sequences to the seed alignment (HmmerAlign).
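A full profile HMM has match, insert, and delete states and is too long to show here, so the sketch below covers only one ingredient: estimating per-column residue probabilities from a seed alignment and using them to score another sequence. It is a deliberately reduced stand-in (closer to a position-specific scoring matrix than to an HMM) and does not reflect how HmmerBuild or HmmerAlign work internally. The DNA alphabet, the pseudocount of 1, and the uniform background are assumptions.

```python
import math
from collections import Counter


def column_emissions(seed, alphabet="ACGT", pseudo=1.0):
    """Per-column residue probabilities from a seed alignment
    (rows of equal length), smoothed with a pseudocount."""
    profile = []
    for column in zip(*seed):
        counts = Counter(c for c in column if c in alphabet)
        total = sum(counts.values()) + pseudo * len(alphabet)
        profile.append({a: (counts[a] + pseudo) / total for a in alphabet})
    return profile


def log_odds(profile, seq, alphabet="ACGT"):
    """Score a sequence of the same length as the profile against a
    uniform background; gaps are not handled in this simplification."""
    background = 1.0 / len(alphabet)
    return sum(math.log(col[c] / background) for col, c in zip(profile, seq))


if __name__ == "__main__":
    seed = ["ACGT", "ACGT", "ACGA"]
    profile = column_emissions(seed)
    print(log_odds(profile, "ACGT"), log_odds(profile, "TTTT"))
```

A sequence resembling the seed scores higher than an unrelated one; a real profile HMM adds insert and delete states so that sequences of different lengths can be aligned to the seed.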
