PhyloGibbs

Algorithm development and code copyright (C) 2005 by
  Rahul Siddharthan (The Institute of Mathematical Sciences, Chennai, India)
  Eric D. Siggia (The Rockefeller University, New York, USA)
  Erik van Nimwegen (Biozentrum, University of Basel, and
                     Swiss Institute of Bioinformatics, Switzerland)

Redistribution permitted under the terms of the GNU General Public License
(see file COPYING in source distribution).

Please send all questions and bug reports to Erik van Nimwegen.

This is a quick summary of usage options. For details, see the phylogibbs(1)
and phylogibbs_algorithm(7) manpages (included with the source code as
phylogibbs.1 and phylogibbs_algorithm.7).

Quick start (phylogenetically unrelated sequences):
  phylogibbs -D 0 -m motifwidth [-z number of motifs]
             [-y total number of sites in all motifs] -f input_seqfile

Quick start (with phylogenetically related sequences):
  phylogibbs [-D 1] -L Newick formatted tree -m motifwidth
             [-z number of motifs] [-y total number of sites in all motifs]
             -f input_seqfile

Commonly used command-line options:
  -c, --ncolmoves n            : Do n colour-change moves per cycle
                                 (-1 = autoselect n). Default 0.
  -D, --dialign n              : (n=0) No alignment; (n=1) Loose align;
                                 (n=2) Strict align.
  -F, --bgfile filename        : Read background sequence from filename.
  -f, --inputfile filename     : Read input fasta sequence from filename.
  -L, --labeltree treestring   : Specifies the phylogenetic tree for the
                                 species from which the input sequences derive.
  -m, --motifwidth n           : Search for motifs of width n.
  -M, --motiffile filename     : File with external WMs to act as "seeds".
  -N, --ncorrel n              : Order of the Markov chain for the background
                                 model (n=-1: 0.25 each; n=0: no correlations
                                 with neighbors; n>0: n nearest neighbors).
  -o, --outputfile filename    : Write anneal snapshots to filename.
  -q, --quiet                  : Run quietly, no screen output.
  -R, --reverseprint           : Print (negative) position backwards from end
                                 of sequence.
                                 Useful when end of sequence corresponds to
                                 transcription start.
  -r, --norevcomp              : Search for motifs only on one strand, for
                                 instance when running on RNA sequences.
  -S, --ncycles n              : Do n cycles of the tracking phase. Sets
                                 running time.
  -t, --trackedoutput filename : Write tracking statistics to filename.
  -y, --nexpwin num            : Sets total number of sites. Either fixed (when
                                 no colour-moves) or the expected number
                                 (maximal entropy prior).
  -z, --nexpcol num            : Sets number of motifs. Upper bound when no
                                 colour-moves, and sets expected number
                                 otherwise (maximal entropy prior).

Other command-line options:
  -A, --trackfile filename     : Track labelled clusters specified in filename.
  -a, --nanneal n              : Do n cycles of the simulated anneal phase.
  -B, --blockedfile filename   : Read a list of blocked windows from filename.
  -b, --beta value             : Set initial inverse temperature to value.
  -C, --rcsymmetric            : Search for reverse-complement-symmetric motifs.
  -E, --trackingcutoff e       : Cut-off (float) for printing tracking
                                 statistics.
  -g, --ndeepquench n          : Do n cycles of the deep quench phase.
  -h, --help                   : Print this quick help summary and exit.
  -I, --initialocc list        : n initial windows per colour, for n in list
                                 (comma-separated).
  -i, --initfile filename      : Read initial window config from filename.
  -P, --bgpscount value        : Weigh background model with pseudocount of
                                 1-site freqs.
  -p, --chempot value          : Use "chemical potential" of given value for
                                 new sites.
  -s, --nshiftmoves n          : n global-shift moves per cycle.
  -T, --pseudocount value      : Use pseudocount for prior in weight-matrix
                                 integral.
  -u, --ntransient n           : Do n cycles of the transient equilibration
                                 phase.
  -v, --verbose                : Verbose output.
  -W, --write-each-cycle       : Write output file and tracking file every
                                 cycle.
  -w, --nwinmoves n            : n window-shift moves per cycle.
  -X, --noautotrack            : Stop after anneal (don't autotrack annealed
                                 clusters).
  -x, --betaincr value         : At each anneal cycle, add value to beta.
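
Example (illustrative only): the two quick-start forms above can be assembled
programmatically. The following is a minimal Python sketch, not part of the
PhyloGibbs distribution. The helper name build_phylogibbs_command, the file
name seqs.fa, and the tree string (A:0.5,B:0.5) are hypothetical placeholders;
only the flags -D, -L, -m, -z, -y and -f are taken from this summary. See
phylogibbs(1) for the tree format that -L actually expects.

```python
import shlex


def build_phylogibbs_command(seqfile, motif_width, tree=None,
                             n_motifs=None, n_sites=None,
                             executable="phylogibbs"):
    """Return an argv list for one of the two quick-start forms.

    Hypothetical helper: it only arranges flags listed in the help summary
    and does not validate the values passed in.
    """
    cmd = [executable]
    if tree is None:
        # Phylogenetically unrelated sequences: no alignment.
        cmd += ["-D", "0"]
    else:
        # Phylogenetically related sequences: loose align plus a tree string.
        cmd += ["-D", "1", "-L", tree]
    cmd += ["-m", str(motif_width)]
    if n_motifs is not None:
        cmd += ["-z", str(n_motifs)]   # number of motifs
    if n_sites is not None:
        cmd += ["-y", str(n_sites)]    # total number of sites in all motifs
    cmd += ["-f", seqfile]
    return cmd


if __name__ == "__main__":
    # Unrelated sequences, motif width 10, 2 motifs, 8 sites in total.
    print(shlex.join(build_phylogibbs_command("seqs.fa", 10,
                                              n_motifs=2, n_sites=8)))
    # Related sequences with a placeholder tree string (format is an assumption).
    print(shlex.join(build_phylogibbs_command("seqs.fa", 10,
                                              tree="(A:0.5,B:0.5)")))
```

The returned list can be passed as-is to subprocess.run on a system where
phylogibbs is installed; shlex.join is used here only to print a
shell-quoted command line for inspection.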