fastDNAml: construction of phylogenetic trees of DNA sequences using maximum likelihood (Olsen, Matsuda, Hagstrom, Overbeek)
Some explanations about the options
Main parameters
- Sequence Alignment File
- The input to fastDNAml is similar to that used by DNAML (and the other PHYLIP programs).
- enter either the name of a file or the actual data
- if you are using Netscape 2.x or later, you can select a file by typing its name, or better, by selecting it with the Netscape file browser (Browse button)
- OR you can type your data in the next area, or cut and paste it from another application.
- (but not both)
Categories and Weights Options
- categories file
- The data must have the format specified for PHYLIP dnaml 3.3. The first line must be the letter C, followed by the number of categories (a number in the range 1 through 35), and then a blank-separated list of the rates for each category. (The list can take more than one line; the program reads until it finds the specified number of rate values.) The next line should be the word Categories followed by one rate category character per sequence position. The categories 1 - 35 are represented by the series 1, 2, 3, ..., 8, 9, A, B, C, ..., Y, Z. These latter data can be on one or more lines. For example:
- C 12 0.0625 0.125 0.25 0.5 1 2 4 8 16 32 64 128
- Categories 5111136343678975AAA8949995566778888889AAAAAA9239898629AAAAA9
- Category 'numbers' are ordered: 1, 2, 3, ..., 9, A, B, ..., Y, Z. Category zero (undefined rate) is permitted at sites with a zero in a user-supplied weighting mask.
- weights file (user-specified column weighting information)
- example:
- Weights 111111111111001100000100011111100000000000000110000110000000
- When bootstrapping is used, only positions with nonzero weights are used in computing the bootstrap sample.
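The categories and weights formats described above can be read with a few lines of code. The following is a minimal sketch, not the parser used by fastDNAml or PHYLIP: the function names (parse_categories, parse_weights, zero_category_conflicts) are hypothetical, and it assumes that weights are single digits, as in the example above.

```python
# Illustrative reader for the categories and weights formats described above.
# All names here are hypothetical; this is not fastDNAml's own code.

# Categories 1..35 are written as 1-9 followed by A-Z.
CATEGORY_SYMBOLS = "123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def parse_categories(text):
    """Return (rates, per_site_categories) from categories-file text."""
    tokens = text.split()
    if len(tokens) < 2 or tokens[0] != "C":
        raise ValueError("expected 'C' followed by the number of categories")
    n = int(tokens[1])
    if not 1 <= n <= 35:
        raise ValueError("number of categories must be in the range 1 through 35")
    # The rate list may span several lines; splitting on whitespace handles that.
    rates = [float(t) for t in tokens[2:2 + n]]
    if len(rates) != n:
        raise ValueError("fewer rate values than declared")
    rest = tokens[2 + n:]
    if not rest or rest[0] != "Categories":
        raise ValueError("expected the word 'Categories'")
    sites = []
    for ch in "".join(rest[1:]):  # the site data may also span several lines
        if ch == "0":
            sites.append(0)  # category zero: undefined rate
            continue
        index = CATEGORY_SYMBOLS.find(ch)
        if index < 0 or index + 1 > n:
            raise ValueError("invalid category character: %r" % ch)
        sites.append(index + 1)
    return rates, sites


def parse_weights(text):
    """Return one integer weight per position (assumes single-digit weights)."""
    tokens = text.split()
    if not tokens or tokens[0] != "Weights":
        raise ValueError("expected the word 'Weights'")
    return [int(ch) for ch in "".join(tokens[1:])]


def zero_category_conflicts(sites, weights):
    """Positions where category zero is used although the weight is nonzero."""
    return [i for i, (c, w) in enumerate(zip(sites, weights)) if c == 0 and w != 0]


example = """C 12 0.0625 0.125 0.25 0.5 1 2 4 8 16 32 64 128
Categories 5111136343678975AAA8949995566778888889AAAAAA9239898629AAAAA9"""
rates, sites = parse_categories(example)
print(len(rates), sites[:6])
```

A check such as zero_category_conflicts can be run before submitting the files, since category zero is only permitted at sites that the weighting mask sets to zero.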
Rearrangements Options
- Decreases the time needed to place a new sequence initially in the growing tree (quickadd)
- This option greatly decreases the time needed to place a new sequence initially in the growing tree (but does not change the time required to subsequently test rearrangements). The overall time savings seem to be about 30%, based on a very limited number of test cases. Its downside, if any, is unknown. This will probably become default program behavior in the near future.
- If the analysis is run with a global option of 'G 0 0', so that no rearrangements are permitted, the tree is built very approximately, but very quickly. This may be of greatest interest if the question is, 'Where does this one new sequence fit into this known tree?' The known tree is provided with the restart option, below.
- PHYLIP DNAML does not include anything comparable to the quickadd option.
- global rearrangements
- The G (global) option has been generalized to permit crossing any number of branches during tree rearrangements. In addition, it is possible to modify the extent of rearrangement explored during the sequential addition phase of tree building.
- The G U (global and user tree) option combination instructs the program to find the best of the user trees, and then look for rearrangements that are better still.
- If a rearrangement distance is specified, the input must contain a transition option.
- The Global option can also be used to force branch swapping on user trees (by combining the Global and User Tree(s) options).
User input Tree Options
- This option allows you to enter your own trees and instructs the program to evaluate them.
- User tree - tree(s) file
- The trees must be in Newick format, and terminated with a semicolon. (The program also accepts a pseudo_newick format, which is a valid prolog fact.)
- The tree reader in this program is more powerful than that in PHYLIP 3.3. In particular, material enclosed in square brackets, [ like this ], is ignored as comments; taxa names can be wrapped in single quotation marks to support the inclusion of characters that would otherwise end the name (i.e., '(', ')', ':', ';', '[', ']', ',' and ' '); names of internal nodes are properly ignored; and exponential notation (such as 1.0E-6) for branch lengths is supported.
- user trees to be read with branch lengths
- Causes user trees to be read with branch lengths (and it is an error to omit any of them). Without the L option, branch lengths in user trees are not required, and are ignored if present.
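To illustrate two of the tree-reading conventions described above (bracketed comments are ignored, and single-quoted names may contain characters that would otherwise end a name), here is a minimal sketch. The function strip_newick_comments is hypothetical and is not the program's own tree reader; it assumes comments are not nested.

```python
# Illustrative only: removes [ ... ] comments from a Newick string while
# leaving the contents of single-quoted taxon names untouched.
# Assumption: comments are not nested.

def strip_newick_comments(newick):
    out = []
    in_quote = False    # inside a 'quoted name'
    in_comment = False  # inside a [comment]
    for ch in newick:
        if in_comment:
            if ch == "]":
                in_comment = False  # comment ends; the bracket itself is dropped
        elif in_quote:
            out.append(ch)          # everything inside quotes is kept verbatim
            if ch == "'":
                in_quote = False
        elif ch == "[":
            in_comment = True       # start of a comment outside any quoted name
        else:
            out.append(ch)
            if ch == "'":
                in_quote = True
    return "".join(out)


tree = "(A:0.1[a comment],'B [x]':1.0E-6)root;"
print(strip_newick_comments(tree))
```

The brackets inside the quoted name 'B [x]' are kept because they are part of the name, whereas the unquoted "[a comment]" is removed. A branch length written as 1.0E-6 can be converted with Python's float().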
Bootstrap Options
- generates a re-sample of the input data (bootstrap)
- The tree files will be combined into a single '.tree' file, and the output files into a single '.out' file.
- random number seed for bootstrap
- Warning: For a given random number seed, the sample will always be the same.
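The effect of the seed and of the weighting mask on a bootstrap re-sample can be sketched as follows. This is an illustration under stated assumptions, not fastDNAml's sampling code: it uses Python's random number generator (so the samples will differ from the program's), the function name bootstrap_columns is hypothetical, and it assumes the sample contains as many draws as there are positions with nonzero weight.

```python
import random


def bootstrap_columns(weights, seed):
    """Draw alignment column indices with replacement.

    Only positions with a nonzero weight are eligible. The sample size
    (one draw per eligible position) is an assumption of this sketch.
    """
    rng = random.Random(seed)  # same seed -> same sequence of draws
    eligible = [i for i, w in enumerate(weights) if w != 0]
    return [rng.choice(eligible) for _ in eligible]


weights = [1, 1, 0, 0, 1, 1, 1, 0]
first = bootstrap_columns(weights, seed=137)
second = bootstrap_columns(weights, seed=137)
print(first == second)
```

Because the generator is fully determined by its seed, repeating a run with the same seed reproduces the same sample; a different seed is needed for each independent bootstrap replicate.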
Input Options
- ratio of transition to transversion type substitutions
- This option, set to 2.0 (the program's default value), can be placed before a global or treefile option that has auxiliary data.
- Randomize the sequence addition order (jumble)
- Note that fastDNAml explores a very small number of alternative tree topologies relative to a typical parsimony program. There is a very real chance that the search procedure will not find the tree topology with the highest likelihood. Altering the order of taxon addition and comparing the trees found is a fairly efficient method for testing convergence. Typically, it would be nice to find the same best tree at least twice (if not three times), as opposed to simply performing some fixed number of jumbles and hoping that at least one of them will be the optimum.
- Sequence format
- The sequence will be converted automatically to the format needed by the program,
- provided you enter it either:
- in plain (raw) sequence format, or in one of the following known formats:
- IG, GenBank, NBRF, EMBL, GCG, DNAStrider, Fitch, fasta, Phylip, PIR, MSF, ASN, PAUP, CLUSTALW
- In the text area you may also enter a database entry code or an accession number, in one of these forms:
database:entry_name
or:
database:accession.
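The advice given above for the jumble option (repeat the analysis with different addition orders until the same best tree has been found at least twice) can be expressed as a simple stopping rule. The sketch below is hypothetical throughout: run_once stands for whatever launches one jumbled analysis and returns a (topology, log-likelihood) pair, where the topology is assumed to be in some canonical form so that identical trees compare equal. Nothing here is part of fastDNAml itself.

```python
def jumble_until_repeated(run_once, seeds, required_hits=2, tol=1e-6):
    """Run jumbled searches until the best topology has been found
    required_hits times. Returns (topology, lnl, runs_used); runs_used
    is None if the seeds ran out before the tree was confirmed.
    """
    best_topology, best_lnl, hits = None, None, 0
    for n_runs, seed in enumerate(seeds, start=1):
        topology, lnl = run_once(seed)
        if best_lnl is None or lnl > best_lnl + tol:
            # A better tree: it becomes the new best and the count restarts.
            best_topology, best_lnl, hits = topology, lnl, 1
        elif topology == best_topology:
            hits += 1  # the current best tree was found again
        if hits >= required_hits:
            return best_topology, best_lnl, n_runs
    return best_topology, best_lnl, None


# Stand-in for real analyses: a fixed table of made-up results per seed.
fake_results = {1: ("T1", -1050.0), 2: ("T2", -1000.0),
                3: ("T1", -1050.0), 4: ("T2", -1000.0)}
result = jumble_until_repeated(lambda seed: fake_results[seed], [1, 2, 3, 4])
print(result)
```

Stopping on a repeated best tree, rather than after a fixed number of jumbles, matches the reasoning above: a fixed count gives no indication of whether any of the runs reached the optimum.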
References:
Olsen, G. J., Matsuda, H., Hagstrom, R., and Overbeek, R. 1994. fastDNAml: A tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Comput. Appl. Biosci. 10: 41-48.
Felsenstein, J. 1981. Evolutionary trees from DNA sequences: A maximum likelihood approach. J. Mol. Evol. 17: 368-376.
Pise form generator version: 5.a (19 Oct 2006 12:35)