Software Tools :: Motifs, Patterns, and Profiles

Background

Motifs, Patterns and Profiles

Programs

Searching sequences for simple user-specified patterns

  • GCG Programs
    • FindPatterns: Searches sequence(s) for short patterns
  • EMBOSS Programs
    • dreg: Regular expression search of a nucleotide sequence
    • preg: Regular expression search of a protein sequence
    • fuzznuc: Nucleic acid pattern search (PROSITE format)
    • fuzzpro: Protein pattern search (PROSITE format)
    • fuzztran: Protein pattern search after translation (PROSITE format)
    • patmatdb: Searches a set of proteins with a motif in PROSITE format
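
Several of the programs above take patterns in PROSITE format. As a rough illustration of what such a pattern means, the sketch below translates a subset of PROSITE syntax (x, [..], {..}, repeat counts, and the < and > anchors) into a Python regular expression and searches a sequence with it. The function names prosite_to_regex and find_pattern are hypothetical, and this is not how FindPatterns or the EMBOSS programs are implemented; re.finditer also reports non-overlapping matches only.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a subset of PROSITE pattern syntax into a Python regex."""
    pattern = pattern.strip().rstrip(".")  # PROSITE patterns may end with a period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchor to the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchor to the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (2) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":  # any residue
            regex = "."
        elif core.startswith("[") and core.endswith("]"):  # any of these residues
            regex = core
        elif core.startswith("{") and core.endswith("}"):  # none of these residues
            regex = "[^" + core[1:-1] + "]"
        elif len(core) == 1 and core.isalpha():  # one specific residue
            regex = core
        else:
            raise ValueError(f"unsupported pattern element: {element!r}")
        if low and high:
            regex += "{" + low + "," + high + "}"
        elif low:
            regex += "{" + low + "}"
        parts.append(prefix + regex + suffix)
    return "".join(parts)


def find_pattern(pattern: str, sequence: str):
    """Return (0-based start, matched text) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    # N-{P}-[ST]-{P} is the PROSITE N-glycosylation site pattern.
    print(prosite_to_regex("N-{P}-[ST]-{P}"))
    print(find_pattern("N-{P}-[ST]-{P}", "MNGSAANPSA"))
```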

Searching databases of known patterns and motifs with a sequence

  • GCG Programs
    • Motifs: Searches the PROSITE Dictionary of Protein Sites and Patterns with a protein sequence
    • Map: Locates transcription factors using the tfsites.dat file
    • MapPlot: Plots locations of transcription factors using the tfsites.dat file
  • EMBOSS Programs
    • patmatmotifs: Searches a PROSITE motif database with a protein sequence
    • pscan: Searches the PRINTS database of protein motif fingerprints with a protein sequence
    • tfscan: Scans DNA sequences for transcription factors

Finding potential functional patterns and motifs

  • GCG Programs
    • Terminator: Finds prokaryotic factor-independent RNA polymerase terminators
    • HTHScan: Finds helix-turn-helix motifs in protein sequences
    • SPScan: Finds possible signal sequence cleavage sites in protein sequences
  • EMBOSS Programs
    • isochore: Plots isochores in large DNA sequences
    • cpgplot: Plots CpG-rich areas
    • cpgreport: Reports all CpG-rich regions
    • newcpgreport: Reports all CpG-rich areas
    • newcpgseek: Reports CpG-rich regions
    • marscan: Finds MAR/SAR sites in nucleic acid sequences
    • helixturnhelix: Reports nucleic acid binding motifs in a protein sequence
    • sigcleave: Reports protein signal cleavage sites
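
The CpG programs listed above look for regions that are rich in CpG dinucleotides. One commonly used measure is the observed/expected CpG ratio, (number of CG × window length) / (number of C × number of G), computed together with the G+C fraction in a sliding window. The sketch below shows that calculation only; the function names, window size, and step are assumptions chosen for illustration, and the EMBOSS programs apply their own thresholds and options.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0  # ratio is undefined without both bases; report 0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def scan_cpg(seq: str, window: int = 100, step: int = 1):
    """Return (start, G+C fraction, obs/exp ratio) for each full window."""
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        results.append((start, gc_fraction(chunk), cpg_obs_exp(chunk)))
    return results


if __name__ == "__main__":
    # Toy sequence: a CG-rich window followed by an AT-only window.
    for row in scan_cpg("CGCGAAAA", window=4, step=4):
        print(row)
```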

Creating and using profiles

  • GCG Programs
    • ProfileMake: Creates a Gribskov profile from a set of aligned sequences
    • ProfileGap: Aligns a profile to one or more sequences
    • ProfileScan: Searches a database of protein profiles with a protein sequence
    • ProfileSearch: Searches a set of sequences or a sequence database with a profile
    • ProfileSegments: Creates optimal profile-sequence alignments from ProfileSearch results
  • EMBOSS Programs
    • prophecy: Creates a frequency matrix or a profile from a multiple sequence alignment
    • prophet: Aligns a profile created by prophecy to one or more sequences
    • profit: Scans one or more sequences with a frequency matrix created by prophecy
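
As a simplified picture of the "frequency matrix" idea behind these programs, the sketch below counts residues in each column of a gap-free toy alignment and then slides the resulting matrix along a sequence, scoring each window by summing the column counts of its residues. This is a deliberately minimal scheme with hypothetical function names; it is not the Gribskov profile method, which also uses substitution scores and position-specific gap penalties, and it is not the algorithm used by prophecy or profit.

```python
from collections import Counter


def build_count_matrix(alignment):
    """Return one Counter of residue counts per alignment column."""
    if not alignment:
        raise ValueError("alignment is empty")
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("aligned sequences must all have the same length")
    matrix = []
    for col in range(width):
        # Gap characters are skipped so they never contribute to a score.
        residues = [row[col].upper() for row in alignment if row[col] != "-"]
        matrix.append(Counter(residues))
    return matrix


def score_window(matrix, window):
    """Sum, over columns, the count of the residue seen at that position."""
    return sum(column[residue] for column, residue in zip(matrix, window.upper()))


def best_hit(matrix, sequence):
    """Return (0-based start, score) of the highest-scoring window."""
    width = len(matrix)
    hits = [
        (score_window(matrix, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    if not hits:
        raise ValueError("sequence is shorter than the matrix")
    score, start = max(hits, key=lambda hit: hit[0])  # first window wins ties
    return start, score


if __name__ == "__main__":
    matrix = build_count_matrix(["ACGT", "ACGA", "ACCT"])
    print(best_hit(matrix, "TTACGTTT"))
```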

Creating and using hidden Markov model (HMM) profiles

  • GCG Programs
    • HmmerBuild: Creates a profile HMM from aligned sequences
    • HmmerCalibrate: Calibrates existing profile HMMs for more sensitive searches
    • HmmerPfam: Searches a database of profile HMMs (such as Pfam) with a sequence
    • HmmerSearch: Searches a sequence database with a profile HMM
    • HmmerAlign: Aligns one or more sequences to a profile HMM
    • HmmerEmit: Randomly generates sequences that match a given profile HMM
    • HmmerConvert: Converts between HMM and Gribskov profile formats
    • HmmerIndex: Indexes a database of profile HMMs for use by HmmerFetch
    • HmmerFetch: Retrieves a profile HMM by name from an indexed profile HMM database

Discovering new motifs in your sequences

  • GCG Programs
    • MEME: Examines a set of unaligned sequences to find shared motifs
    • MotifSearch: Searches a set of sequences with a motif found by MEME

Related Tools

  • GCG Programs
    • Window: Creates a frequency table of short sequence patterns within a sliding window
    • StatPlot: Plots the frequency table data created by Window
  • EMBOSS Programs
    • freak: Residue/base frequency table or plot
