The spatial and temporal patterns of gene transcription are determined by regulatory networks in which groups of transcription factors (TFs) interact with clusters of DNA binding sites known as cis-regulatory modules (CRMs). Computational analysis of evolutionarily conserved TF DNA binding sites is commonly used to predict and analyze CRMs within genomes. These approaches have been limited by the relatively small number of TFs for which high-quality data describing DNA binding specificity are available.
The FlyFactorSurvey database summarizes a project that uses the bacterial one-hybrid method to systematically describe the binding site preferences of transcription factors in Drosophila melanogaster. This effort is a collaboration between the laboratories of Michael Brodsky and Scot Wolfe at the University of Massachusetts Medical School and has been funded by the National Human Genome Research Institute. We have previously described DNA binding specificities for Drosophila homeodomain proteins (Noyes et al., Cell 2008) and for TFs that regulate anterior-posterior patterning in the embryo (Noyes et al., Nucleic Acids Research 2008). We have recently extended this work with a large analysis of Cys2His2 zinc finger proteins (Enuameh et al., Genome Research 2013), which includes a number of alternative splice isoforms that produce TFs with different specificities from the same locus. FlyFactorSurvey provides a searchable database of the DNA binding specificity data from our studies and from other groups using DNase I or SELEX methods. The bacterial one-hybrid binding site data can also be downloaded here.
The database currently contains:
Tools to search genomic sequences for occurrences of these TF binding sites have been developed by our collaborator, Saurabh Sinha at the University of Illinois, Urbana-Champaign. Genome Surveyor displays tracks of DNA binding site frequencies along any region of the Drosophila genome using the GBrowse viewer. The Windowfit analysis program displays the distribution of individual TF binding sites on a DNA segment entered by the user. Tools to predict the specificity of homeodomains in other species based on our dataset have been developed in collaboration with Gary Stormo at Washington University in St. Louis.
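To give a rough sense of what searching a sequence for occurrences of a TF binding site involves, the following is a minimal sketch of a generic position weight matrix scan. It is not the algorithm used by Genome Surveyor or Windowfit. The count matrix, the uniform background frequency, the pseudocount, and the score threshold are all hypothetical values chosen for illustration and are not taken from FlyFactorSurvey. Only the forward strand is scanned.

```python
import math

# Hypothetical count matrix for a 4-bp motif (illustrative numbers only,
# not FlyFactorSurvey data). Each list holds one count per motif position.
COUNTS = {
    "A": [18, 1, 1, 17],
    "C": [1, 1, 1, 1],
    "G": [0, 1, 17, 1],
    "T": [1, 17, 1, 1],
}


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert base counts into log2 odds scores against a uniform background.

    The pseudocount and background values are assumptions of this sketch.
    """
    width = len(counts["A"])
    matrix = []
    for i in range(width):
        total = sum(counts[base][i] for base in "ACGT") + 4 * pseudocount
        column = {}
        for base in "ACGT":
            frequency = (counts[base][i] + pseudocount) / total
            column[base] = math.log2(frequency / background)
        matrix.append(column)
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, site, score) for each forward-strand window scoring >= threshold."""
    sequence = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Skip windows containing ambiguous bases such as N.
        if any(base not in "ACGT" for base in window):
            continue
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


if __name__ == "__main__":
    pwm = log_odds_matrix(COUNTS)
    for start, site, score in scan("CCATGACCTTGACC", pwm, threshold=5.0):
        print(start, site, round(score, 2))
```

A production tool would typically also scan the reverse complement, use genome-specific background frequencies, and aggregate scores over windows rather than reporting individual hits.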
We will continue to characterize the specificities of Drosophila factors and to work with our collaborators to develop and improve tools that allow the research community to predict the regulatory function of fly TFs and to build predictive models for TF binding specificity in other species.