********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.3.2 (Release date: Wed Dec 23 18:09:18 EST 2009)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= tests/crp0.s
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
ce1cg                    1.0000    105  ara                      1.0000    105
bglr1                    1.0000    105  crp                      1.0000    105
cya                      1.0000    105  deop2                    1.0000    105
gale                     1.0000    105  ilv                      1.0000    105
lac                      1.0000    105  male                     1.0000    105
malk                     1.0000    105  malt                     1.0000    105
ompa                     1.0000    105  tnaa                     1.0000    105
uxu1                     1.0000    105  pbr322                   1.0000    105
trn9cat                  1.0000    105  tdc                      1.0000    105
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme tests/crp0.s -dna -mod zoops -nmotifs 3 -revcomp

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=            8    maxw=           50    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       18    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            1890    N=              18
strands: + -
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.304 C 0.196 G 0.196 T 0.304
Background letter frequencies (from dataset with add-one prior applied):
A 0.304 C 0.196 G 0.196 T 0.304
********************************************************************************


********************************************************************************
MOTIF  1 width =   18   sites =  18   llr = 180   E-value = 1.1e-006
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  662::1181313141183
pos.-specific     C  ::211::1412212:8:6
probability       G  112:9:913512711:21
matrix            T  3349:91:21631482:1

         bits    2.4
                 2.1     *
                 1.9     *
                 1.6     * *
Relative         1.4     * *
Entropy          1.2    ****        **
(14.4 bits)      0.9    *****    * ***
                 0.7    *****    * ****
                 0.5 ** ***** *  * ****
                 0.2 ** ******** ******
                 0.0 ------------------

Multilevel           AATTGTGACGTAGATCAC
consensus            TTA     GACT T  GA
sequence               G     T  C
                                G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value                    Site
-------------            ------  ----- ---------            ------------------
ara                          -     59  2.51e-07 TGGCATAGCA AAGTGTGACGCCGTGCAA ATAATCAATG
lac                          +      9  5.35e-07   AACGCAAT TAATGTGAGTTAGCTCAC TCATTAGGCA
malt                         +     41  8.61e-07 AAAGATTTGG AATTGTGACACAGTGCAA ATTCAGACAC
ilv                          -     43  1.69e-06 GCAAAGGGAA AATTGAGGGGTTGATCAC GTTTTGTACT
pbr322                       -     57  2.85e-06 CTCCTTACGC ATCTGTGCGGTATTTCAC ACCGCATATG
deop2                        +     60  2.85e-06 AGATTTCCTT AATTGTGATGTGTATCGA AGTGTGTTGC
uxu1                         +     17  5.17e-06 AGAGTGAAAT TGTTGTGATGTGGTTAAC CCAATTAGAA
trn9cat                      +     84  5.69e-06 CTTTTGGCGA AAATGAGACGTTGATCGG CACG
ce1cg                        -     65  7.54e-06 GGACTTCCAT TTTTGTGAAAACGATCAA AAAAACAGTC
ompa                         +     48  9.04e-06 TTTTTTTCAT ATGCCTGACGGAGTTCAC ACTTGTAAGT
crp                          -     67  9.89e-06 TACTGCACGG TAATGTGACGTCCTTTGC ATACATGCAG
male                         +     14  1.29e-05 TTACCGCCAA TTCTGTAACAGAGATCAC ACAAAGCGAC
gale                         -     46  1.41e-05 AAGATGCGAA AAGTGTGACATGGAATAA ATTAGTGGAA
tdc                          -     82  1.53e-05     AACAGG ATATGTGCGACCACTCAC AAATTAACTT
malk                         +     61  1.67e-05 ATGTAAGGAA TTTCGTGATGTTGCTTGC AAAAATCGTG
cya                          +     50  1.81e-05 TCAATCAGCA AGGTGTTAAATTGATCAC GTTTTAGACC
tnaa                         +     71  2.73e-05 CTCCCCGAAC GATTGTGATTCGATTCAC ATTTAAACAA
bglr1                        +     76  8.30e-05 CAAAGTTAAT AACTGTGAGCATGGTCAT ATTTTTATCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
ara                               2.5e-07  58_[-1]_29
lac                               5.4e-07  8_[+1]_79
malt                              8.6e-07  40_[+1]_47
ilv                               1.7e-06  42_[-1]_45
pbr322                            2.8e-06  56_[-1]_31
deop2                             2.8e-06  59_[+1]_28
uxu1                              5.2e-06  16_[+1]_71
trn9cat                           5.7e-06  83_[+1]_4
ce1cg                             7.5e-06  64_[-1]_23
ompa                                9e-06  47_[+1]_40
crp                               9.9e-06  66_[-1]_21
male                              1.3e-05  13_[+1]_74
gale                              1.4e-05  45_[-1]_42
tdc                               1.5e-05  81_[-1]_6
malk                              1.7e-05  60_[+1]_27
cya                               1.8e-05  49_[+1]_38
tnaa                              2.7e-05  70_[+1]_17
bglr1                             8.3e-05  75_[+1]_12
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=18 seqs=18
ara                      (   59) AAGTGTGACGCCGTGCAA  1
lac                      (    9) TAATGTGAGTTAGCTCAC  1
malt                     (   41) AATTGTGACACAGTGCAA  1
ilv                      (   43) AATTGAGGGGTTGATCAC  1
pbr322                   (   57) ATCTGTGCGGTATTTCAC  1
deop2                    (   60) AATTGTGATGTGTATCGA  1
uxu1                     (   17) TGTTGTGATGTGGTTAAC  1
trn9cat                  (   84) AAATGAGACGTTGATCGG  1
ce1cg                    (   65) TTTTGTGAAAACGATCAA  1
ompa                     (   48) ATGCCTGACGGAGTTCAC  1
crp                      (   67) TAATGTGACGTCCTTTGC  1
male                     (   14) TTCTGTAACAGAGATCAC  1
gale                     (   46) AAGTGTGACATGGAATAA  1
tdc                      (   82) ATATGTGCGACCACTCAC  1
malk                     (   61) TTTCGTGATGTTGCTTGC  1
cya                      (   50) AGGTGTTAAATTGATCAC  1
tnaa                     (   71) GATTGTGATTCGATTCAC  1
bglr1                    (   76) AACTGTGAGCATGGTCAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 18 n= 1584 bayes= 6.57872 E= 1.1e-006
   101  -1081   -182     13
    87  -1081    -82     13
   -45    -23     18     35
 -1081    -82  -1081    155
 -1081   -182    227  -1081
  -145  -1081  -1081    155
  -245  -1081    218   -245
   145    -82   -182  -1081
  -145     99     50    -45
    13   -182    135   -145
  -145     18    -82     87
   -13     18     18    -13
  -145   -182    188   -145
    35    -23   -182     35
  -245  -1081    -82    145
  -245    199  -1081    -87
   135  -1081     18  -1081
   -13    164   -182   -245
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 18 nsites= 18 E= 1.1e-006
 0.611111  0.000000  0.055556  0.333333
 0.555556  0.000000  0.111111  0.333333
 0.222222  0.166667  0.222222  0.388889
 0.000000  0.111111  0.000000  0.888889
 0.000000  0.055556  0.944444  0.000000
 0.111111  0.000000  0.000000  0.888889
 0.055556  0.000000  0.888889  0.055556
 0.833333  0.111111  0.055556  0.000000
 0.111111  0.388889  0.277778  0.222222
 0.333333  0.055556  0.500000  0.111111
 0.111111  0.222222  0.111111  0.555556
 0.277778  0.222222  0.222222  0.277778
 0.111111  0.055556  0.722222  0.111111
 0.388889  0.166667  0.055556  0.388889
 0.055556  0.000000  0.111111  0.833333
 0.055556  0.777778  0.000000  0.166667
 0.777778  0.000000  0.222222  0.000000
 0.277778  0.611111  0.055556  0.055556
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[AT][AT][TAG]TGTGA[CGT][GA][TC][ATCG]G[AT]TC[AG][CA]
--------------------------------------------------------------------------------


Time  2.59 secs.

********************************************************************************


********************************************************************************
MOTIF  2 width =    8   sites =   2   llr = 24   E-value = 1.5e+004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::::::
pos.-specific     C  a::5::::
probability       G  :aa:aaaa
matrix            T  :::5::::

         bits    2.4 *** ****
                 2.1 *** ****
                 1.9 *** ****
                 1.6 *** ****
Relative         1.4 *** ****
Entropy          1.2 *** ****
(17.5 bits)      0.9 ********
                 0.7 ********
                 0.5 ********
                 0.2 ********
                 0.0 --------

Multilevel           CGGCGGGG
consensus               T
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value               Site
-------------            ------  ----- ---------            --------
ilv                          +      5  2.18e-06       GCTC CGGCGGGG TTTTTTGTTA
male                         +     41  5.56e-06 CACAAAGCGA CGGTGGGG CGTAGGGGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
ilv                               2.2e-06  4_[+2]_93
male                              5.6e-06  40_[+2]_57
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=8 seqs=2
ilv                      (    5) CGGCGGGG  1
male                     (   41) CGGTGGGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 1764 bayes= 9.783 E= 1.5e+004
  -765    235   -765   -765
  -765   -765    235   -765
  -765   -765    235   -765
  -765    135   -765     71
  -765   -765    235   -765
  -765   -765    235   -765
  -765   -765    235   -765
  -765   -765    235   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 2 E= 1.5e+004
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
CGG[CT]GGGG
--------------------------------------------------------------------------------


Time  4.17 secs.

********************************************************************************


********************************************************************************
MOTIF  3 width =   11   sites =   2   llr = 31   E-value = 2.0e+004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::::::::::
pos.-specific     C  aa::::::5aa
probability       G  :::aa5a:5::
matrix            T  ::a::5:a:::

         bits    2.4 ** ** *  **
                 2.1 ** ** *  **
                 1.9 ** ** *  **
                 1.6 ***** ** **
Relative         1.4 ***** *****
Entropy          1.2 ***** *****
(22.3 bits)      0.9 ***********
                 0.7 ***********
                 0.5 ***********
                 0.2 ***********
                 0.0 -----------

Multilevel           CCTGGGGTCCC
consensus                 T  G
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value                Site
-------------            ------  ----- ---------            -----------
lac                          -     33  7.89e-08 AAGTGTAAAG CCTGGGGTGCC TAATGAGTGA
trn9cat                      +     36  2.01e-07 ATAAATAAAT CCTGGTGTCCC TGTTGATACC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
lac                               7.9e-08  32_[-3]_62
trn9cat                             2e-07  35_[+3]_59
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=11 seqs=2
lac                      (   33) CCTGGGGTGCC  1
trn9cat                  (   36) CCTGGTGTCCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 11 n= 1710 bayes= 9.73809 E= 2.0e+004
  -765    235   -765   -765
  -765    235   -765   -765
  -765   -765   -765    171
  -765   -765    235   -765
  -765   -765    235   -765
  -765   -765    135     71
  -765   -765    235   -765
  -765   -765   -765    171
  -765    135    135   -765
  -765    235   -765   -765
  -765    235   -765   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 11 nsites= 2 E= 2.0e+004
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
CCTGG[GT]GT[CG]CC
--------------------------------------------------------------------------------


Time  5.75 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
ce1cg                            7.53e-03  64_[-1(7.54e-06)]_23
ara                              9.02e-04  58_[-1(2.51e-07)]_29
bglr1                            2.01e-01  75_[+1(8.30e-05)]_12
crp                              3.77e-02  66_[-1(9.89e-06)]_21
cya                              6.49e-02  49_[+1(1.81e-05)]_38
deop2                            1.19e-02  59_[+1(2.85e-06)]_28
gale                             6.15e-02  45_[-1(1.41e-05)]_42
ilv                              4.50e-06  4_[+2(2.18e-06)]_30_[-1(1.69e-06)]_45
lac                              1.93e-07  8_[+1(5.35e-07)]_6_[-3(7.89e-08)]_62
male                             1.22e-04  13_[+1(1.29e-05)]_9_[+2(5.56e-06)]_57
malk                             1.39e-02  60_[+1(1.67e-05)]_27
malt                             5.01e-03  40_[+1(8.61e-07)]_47
ompa                             3.94e-02  47_[+1(9.04e-06)]_40
tnaa                             2.80e-02  74_[-1(1.41e-05)]_13
uxu1                             1.70e-02  16_[+1(5.17e-06)]_71
pbr322                           6.62e-03  56_[-1(2.85e-06)]_31
trn9cat                          5.41e-06  35_[+3(2.01e-07)]_37_[+1(5.69e-06)]_4
tdc                              6.53e-02  81_[-1(1.53e-05)]_6
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: tlb-squirrel

********************************************************************************