Software Tools :: Genefinding and Codon Analysis

Gene-finding programs have used three approaches to predict the locations of genes:

  • search by content -- locating open reading frames, regions that have G+C content and codon usage characteristic of coding regions, etc.
  • search by signal -- locating short sequence motifs associated with genes, such as promoter and transcription factor binding sites, or splice sites
  • search by homology -- using the sequence of a gene from one organism to identify homologs in other organisms, or aligning expressed sequence tag (EST) sequences to genomic sequences
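As a rough illustration of search by content, the sketch below scans the three forward reading frames for open reading frames and computes G+C content. The function names, the `min_codons` threshold, and the use of the standard start and stop codons are assumptions made for this example; it is not taken from any particular gene-finding program, and it ignores the reverse strand.

```python
# Illustrative sketch only: names and thresholds are assumptions, not a real tool's API.
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def gc_content(seq):
    """Return the fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for ORFs on the forward strand.

    An ORF here runs from an ATG to the first in-frame stop codon.
    `end` is exclusive and includes the stop codon; `min_codons`
    counts every codon in the ORF, start and stop included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through complete codons in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs
```

A content-based search might then rank each candidate by the G+C content of `seq[start:end]`, since coding regions in many genomes differ in base composition from their surroundings.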

Today the most accurate gene-finding methods use machine learning techniques such as neural networks, decision trees, and hidden Markov models to combine the information from all of these "traditional" approaches into a single prediction. These programs are trained on a database of known genes and may not yield accurate predictions for organisms that were not represented in the training set.
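To give a feel for how a hidden Markov model labels a sequence, here is a minimal two-state sketch (coding versus noncoding) decoded with the Viterbi algorithm. Every probability in it is invented for illustration; a real gene finder would estimate its parameters from known genes and would use a far richer model with codon positions, splice signals, and so on. The state names and the `viterbi` function are likewise assumptions for this example.

```python
import math

STATES = ("coding", "noncoding")

# Made-up parameters for illustration only; real values would be
# estimated from a training set of known genes.
START_P = {"coding": 0.5, "noncoding": 0.5}
TRANS_P = {
    "coding": {"coding": 0.9, "noncoding": 0.1},
    "noncoding": {"coding": 0.1, "noncoding": 0.9},
}
EMIT_P = {
    "coding": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},
    "noncoding": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
}


def viterbi(seq):
    """Return the most probable state path for seq under the toy model.

    Assumes seq contains only A, C, G, and T. Works in log space to
    avoid numerical underflow on long sequences.
    """
    seq = seq.upper()
    if not seq:
        return []
    # scores[s] is the log-probability of the best path ending in state s.
    scores = {s: math.log(START_P[s]) + math.log(EMIT_P[s][seq[0]]) for s in STATES}
    backpointers = []
    for base in seq[1:]:
        new_scores = {}
        pointers = {}
        for s in STATES:
            # Best predecessor state for arriving in s at this position.
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS_P[p][s]))
            new_scores[s] = (scores[prev] + math.log(TRANS_P[prev][s])
                             + math.log(EMIT_P[s][base]))
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Because the emission table above favours G and C in the coding state, this toy model will mislabel any genome whose coding regions are not G+C rich, which is a small-scale version of the training-set limitation described above.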

