PMID- 21330290
OWN - NLM
STAT- MEDLINE
DCOM- 20110829
LR  - 20240104
IS  - 1367-4811 (Electronic)
IS  - 1367-4803 (Print)
IS  - 1367-4803 (Linking)
VI  - 27
IP  - 7
DP  - 2011 Apr 1
TI  - FIMO: scanning for occurrences of a given motif.
PG  - 1017-8
LID - 10.1093/bioinformatics/btr064 [doi]
AB  - A motif is a short DNA or protein sequence that contributes to the biological function of the sequence in which it resides. Over the past several decades, many computational methods have been described for identifying, characterizing and searching with sequence motifs. Critical to nearly any motif-based sequence analysis pipeline is the ability to scan a sequence database for occurrences of a given motif described by a position-specific frequency matrix. RESULTS: We describe Find Individual Motif Occurrences (FIMO), a software tool for scanning DNA or protein sequences with motifs described as position-specific scoring matrices. The program computes a log-likelihood ratio score for each position in a given sequence database, uses established dynamic programming methods to convert this score to a P-value and then applies false discovery rate analysis to estimate a q-value for each position in the given sequence. FIMO provides output in a variety of formats, including HTML, XML and several Santa Cruz Genome Browser formats. The program is efficient, allowing for the scanning of DNA sequences at a rate of 3.5 Mb/s on a single CPU. AVAILABILITY AND IMPLEMENTATION: FIMO is part of the MEME Suite software toolkit. A web server and source code are available at http://meme.sdsc.edu.
FAU - Grant, Charles E
AU  - Grant CE
AD  - Department of Genome Sciences, University of Washington, Seattle, WA, USA.
FAU - Bailey, Timothy L
AU  - Bailey TL
FAU - Noble, William Stafford
AU  - Noble WS
LA  - eng
GR  - R01 RR021692/RR/NCRR NIH HHS/United States
GR  - 2 R01 RR021692/RR/NCRR NIH HHS/United States
PT  - Journal Article
PT  - Research Support, N.I.H., Extramural
DEP - 20110216
PL  - England
TA  - Bioinformatics
JT  - Bioinformatics (Oxford, England)
JID - 9808944
RN  - 0 (CCCTC-Binding Factor)
RN  - 0 (CTCF protein, human)
RN  - 0 (Repressor Proteins)
RN  - 9007-49-2 (DNA)
SB  - IM
MH  - *Amino Acid Motifs
MH  - Base Sequence
MH  - Binding Sites
MH  - CCCTC-Binding Factor
MH  - Conserved Sequence
MH  - DNA/*chemistry
MH  - Databases, Genetic
MH  - Genome, Human
MH  - Humans
MH  - Position-Specific Scoring Matrices
MH  - Repressor Proteins/metabolism
MH  - Sequence Analysis, DNA/*methods
MH  - Sequence Analysis, Protein/*methods
MH  - *Software
PMC - PMC3065696
EDAT- 2011/02/19 06:00
MHDA- 2011/08/30 06:00
PMCR- 2011/02/16
CRDT- 2011/02/19 06:00
PHST- 2011/02/19 06:00 [entrez]
PHST- 2011/02/19 06:00 [pubmed]
PHST- 2011/08/30 06:00 [medline]
PHST- 2011/02/16 00:00 [pmc-release]
AID - btr064 [pii]
AID - 10.1093/bioinformatics/btr064 [doi]
PST - ppublish
SO  - Bioinformatics. 2011 Apr 1;27(7):1017-8. doi: 10.1093/bioinformatics/btr064. Epub 2011 Feb 16.
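
The scoring pipeline summarized in the abstract (a log-likelihood ratio score per position, dynamic programming to convert the score to a P-value, then a q-value) can be illustrated with a minimal Python sketch. This is not FIMO's source code. The function names, the pseudocount scheme, the integer scaling factor, and the use of Benjamini-Hochberg adjustment as a stand-in for the false discovery rate step are all assumptions made for illustration, and only one strand of a DNA sequence is scanned.

```python
import math
from typing import Dict, List, Tuple


def to_int_pssm(freqs: List[Dict[str, float]], background: Dict[str, float],
                pseudocount: float = 0.01, scale: int = 100) -> List[Dict[str, int]]:
    """Turn per-column letter frequencies into scaled integer log2-odds scores."""
    pssm = []
    for column in freqs:
        scores = {}
        for base, bg in background.items():
            # Mix in a little background so zero frequencies stay finite
            # (assumed smoothing scheme, chosen for this sketch).
            freq = (column[base] + pseudocount * bg) / (1.0 + pseudocount)
            scores[base] = round(scale * math.log2(freq / bg))
        pssm.append(scores)
    return pssm


def score_distribution(pssm: List[Dict[str, int]],
                       background: Dict[str, float]) -> Dict[int, float]:
    """Dynamic programming over motif columns: P(total integer score) under the background."""
    dist = {0: 1.0}
    for column in pssm:
        nxt: Dict[int, float] = {}
        for total, prob in dist.items():
            for base, score in column.items():
                key = total + score
                nxt[key] = nxt.get(key, 0.0) + prob * background[base]
        dist = nxt
    return dist


def p_value(score: int, dist: Dict[int, float]) -> float:
    """Probability that a random background site scores at least `score`."""
    return sum(prob for s, prob in dist.items() if s >= score)


def bh_q_values(pvals: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted values, used here as a simple q-value stand-in."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    qvals = [0.0] * n
    running = 1.0
    # Walk from the largest P-value down so each q-value is monotone in rank.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running = min(running, pvals[i] * n / rank)
        qvals[i] = running
    return qvals


def scan(sequence: str, freqs: List[Dict[str, float]], background: Dict[str, float],
         scale: int = 100) -> List[Tuple[int, float, float, float]]:
    """Return (start, score, p_value, q_value) for every window of the sequence."""
    pssm = to_int_pssm(freqs, background, scale=scale)
    dist = score_distribution(pssm, background)
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col[base] for col, base in zip(pssm, window))
        hits.append((start, score / scale, p_value(score, dist)))
    qvals = bh_q_values([h[2] for h in hits])
    return [(s, sc, p, q) for (s, sc, p), q in zip(hits, qvals)]


if __name__ == "__main__":
    # Hypothetical two-column motif that prefers "AC", with a uniform background.
    bg = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    motif = [{"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
             {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1}]
    for hit in scan("GGACGG", motif, bg):
        print(hit)
```

Rounding the scores to integers keeps the dynamic programming table small, at the cost of P-values that are exact only for the rounded matrix. A real scan would also cover the reverse strand and report only hits below a chosen threshold.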