Phylip : protdist - Program to compute distance matrix from protein sequences (Felsenstein)



your e-mail




Alignment file: please enter either:
  1. the name of a file:
  2. or the actual data here:

(sequence format)



Distance model (P)

Gamma distribution of rates among positions (G)? Choices: [default], No, Yes, Gamma+Invariant

Bootstrap options

Weight options

Output options

Categories model options


Bootstrap options

Perform a bootstrap before analysis

Resampling methods

Random number seed (must be odd)

How many replicates





Weight options

Use weights for sites (W)



Weights file: please enter either:
  1. the name of a file:
  2. or the actual data here:







Output options

Print out the data at start of run (1)





Categories model options

Genetic code (U)

Categorization of amino acids (A)? Choices: [default], G: George/Hunt/Barker, C: Chemical, H: Hall

Prob change category (1.0=easy) (E)

Transition/transversion ratio (T)

Base frequencies for A, C, G, T/U (separated by commas)







Some explanations about the options



Main parameters
Enter either the name of a file or the actual data, but not both.
If you are using Netscape 2.x or later, you can select a file by typing its name or, better, by choosing it with the browser's file dialog (Browse button).
Alternatively, type your data in the text area, or cut and paste it from another application.


Categories model options
Categorization of amino acids (A)
All have groups: (Glu Gln Asp Asn), (Lys Arg His), (Phe Tyr Trp) plus:
George/Hunt/Barker: (Cys), (Met Val Leu Ileu), (Gly Ala Ser Thr Pro)
Chemical: (Cys Met), (Val Leu Ileu Gly Ala Ser Thr), (Pro)
Hall: (Cys), (Met Val Leu Ileu), (Gly Ala Ser Thr), (Pro)
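
As an illustration only, the groupings listed above can be written as a lookup table. This is a minimal sketch, not code from protdist: the names COMMON, SPECIFIC, category_of and same_category are hypothetical, and the residue abbreviations are copied as they appear in the list (including "Ileu").

```python
# Groups shared by all three categorizations, as listed above.
COMMON = [
    {"Glu", "Gln", "Asp", "Asn"},
    {"Lys", "Arg", "His"},
    {"Phe", "Tyr", "Trp"},
]

# Scheme-specific groups, keyed by the option letter shown in the form.
SPECIFIC = {
    "G": [{"Cys"}, {"Met", "Val", "Leu", "Ileu"},
          {"Gly", "Ala", "Ser", "Thr", "Pro"}],
    "C": [{"Cys", "Met"},
          {"Val", "Leu", "Ileu", "Gly", "Ala", "Ser", "Thr"}, {"Pro"}],
    "H": [{"Cys"}, {"Met", "Val", "Leu", "Ileu"},
          {"Gly", "Ala", "Ser", "Thr"}, {"Pro"}],
}


def category_of(residue, scheme):
    """Return the index of the group that contains `residue` under `scheme`."""
    for index, group in enumerate(COMMON + SPECIFIC[scheme]):
        if residue in group:
            return index
    raise ValueError(f"unknown residue: {residue}")


def same_category(a, b, scheme):
    """True if residues `a` and `b` fall in the same group under `scheme`."""
    return category_of(a, scheme) == category_of(b, scheme)
```

For example, Met and Cys share a group only under the Chemical scheme, and Pro is grouped with Gly only under George/Hunt/Barker.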




Bootstrap options
Perform a bootstrap before analysis
Selecting this option performs the bootstrap on your sequence file, so you do not need to run seqboot separately beforehand.
Do not give the program a file that has already been bootstrapped; this will not work.
Resampling methods
1. The bootstrap. Bootstrapping was invented by Bradley Efron in 1979, and its use in phylogeny estimation was introduced by Felsenstein (1985b). It involves creating a new data set by sampling N characters randomly with replacement, so that the resulting data set has the same size as the original, but some characters have been left out and others are duplicated. The random variation of the results from analyzing these bootstrapped data sets can be shown statistically to be typical of the variation that you would get from collecting new data sets. The method assumes that the characters evolve independently, an assumption that may not be realistic for many kinds of data.
2. Delete-half-jackknifing. This alternative to the bootstrap involves sampling a random half of the characters, and including them in the data but dropping the others. The resulting data sets are half the size of the original, and no characters are duplicated. The random variation from doing this should be very similar to that obtained from the bootstrap. The method is advocated by Wu (1986).
3. Permuting species within characters. This method of resampling (well, OK, it may not be best to call it resampling) was introduced by Archie (1989) and Faith (1990; see also Faith and Cranston, 1991). It involves permuting the columns of the data matrix separately. This produces data matrices that have the same number and kinds of characters but no taxonomic structure. It is used for different purposes than the bootstrap, as it tests not the variation around an estimated tree but the hypothesis that there is no taxonomic structure in the data: if a statistic such as number of steps is significantly smaller in the actual data than it is in replicates that are permuted, then we can argue that there is some taxonomic structure in the data (though perhaps it might be just a pair of sibling species).
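
The three resampling methods described above can be sketched in a few lines. This is an illustrative sketch, not the seqboot implementation: it assumes the alignment is held as a list of columns (one list of character states per site, ordered by species), and the function names are hypothetical.

```python
import random


def bootstrap(columns, rng):
    """Sample N columns with replacement, so the result has the original size."""
    n = len(columns)
    return [columns[rng.randrange(n)] for _ in range(n)]


def delete_half_jackknife(columns, rng):
    """Keep a random half of the columns, without duplicates."""
    n = len(columns)
    kept = sorted(rng.sample(range(n), n // 2))
    return [columns[i] for i in kept]


def permute_species(columns, rng):
    """Shuffle the states within each column independently.

    Each column keeps the same states, but their assignment to species
    is randomized, which removes taxonomic structure.
    """
    permuted = []
    for column in columns:
        shuffled = list(column)
        rng.shuffle(shuffled)
        permuted.append(shuffled)
    return permuted
```

A caller would pass a seeded generator such as random.Random(seed) so that replicates are reproducible. The requirement in the form that the seed be odd belongs to the PHYLIP programs and is not something Python's generator needs.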
Sequence format
The sequence is automatically converted to the format the program needs,
provided you enter it either
in plain (raw) sequence format or in one of the following known formats:
IG, GenBank, NBRF, EMBL, GCG, DNAStrider, Fitch, fasta, Phylip, PIR, MSF, ASN, PAUP, CLUSTALW
You may also enter a database entry code or an accession number in the text area, in this form:

database:entry_name

or:

database:accession.

References:

Felsenstein, J. 1993. PHYLIP (Phylogeny Inference Package) version 3.5c. Distributed by the author. Department of Genetics, University of Washington, Seattle.

Felsenstein, J. 1989. PHYLIP -- Phylogeny Inference Package (Version 3.2). Cladistics 5: 164-166.

Pise form generator version: 5.a (20 Feb 2009 15:50)