A Gateway-Compatible Yeast One-Hybrid System
Abstract
Since the advent of microarrays, vast amounts of gene expression data have been generated. However, these microarray data fail to reveal the transcription regulatory mechanisms that underlie differential gene expression, because the identity of the responsible transcription factors (TFs) often cannot be directly inferred from such data sets. Regulatory TFs activate or repress transcription of their target genes by binding to cis-regulatory elements that are frequently located in a gene's promoter. To understand the mechanisms underlying differential gene expression, it is necessary to identify physical interactions between regulatory TFs and their target genes. We developed a Gateway-compatible yeast one-hybrid (Y1H) system that enables the rapid, large-scale identification of protein-DNA interactions using both small (i.e., DNA elements of interest) and large (i.e., gene promoters) DNA fragments. We used four well-characterized Caenorhabditis elegans promoters as DNA baits to test the functionality of this Y1H system. We could detect ∼40% of previously reported TF-promoter interactions. By performing screens using two complementary libraries, we found novel potentially interacting TFs for each promoter. We recapitulated several of the Y1H-based protein-DNA interactions using luciferase reporter assays in mammalian cells. Taken together, the Gateway-compatible Y1H system will allow the high-throughput identification of protein-DNA interactions and may be a valuable tool to decipher transcription regulatory networks.
Differential gene expression is a major determinant in development. Each gene exhibits a specific temporal and spatial expression pattern and level, and as a result, each cell/tissue expresses a unique subset of proteins. Differential gene expression can be regulated at different steps, including protein synthesis as well as protein and mRNA degradation. However, it is widely appreciated that developmental gene expression patterns are predominantly established at the level of transcription regulation (Lee and Young 2000). Specifically, differential gene expression is controlled by regulatory transcription factors (TFs) that bind to cis-regulatory DNA elements (binding sites) that are often located in or near a gene's promoter. Between 5% and 10% of metazoan genes encode putative TFs (Levine and Tjian 2003). With ∼14,000-30,000 genes predicted in metazoan genomes (The C. elegans Sequencing Consortium 1998; Adams et al. 2000; Lander et al. 2001; Venter et al. 2001), this translates to hundreds or thousands of predicted TFs.
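The estimate of hundreds to thousands of predicted TFs follows directly from the figures quoted above. The sketch below is only a back-of-the-envelope check that multiplies the lowest gene count by the lowest TF fraction and the highest gene count by the highest TF fraction. The function name and its parameters are illustrative and do not come from the original work.

```python
def predicted_tf_range(min_genes, max_genes, min_percent, max_percent):
    """Return (low, high) bounds on the number of predicted TFs.

    Percentages are passed as whole numbers so that the arithmetic
    stays in integers.
    """
    low = min_genes * min_percent // 100
    high = max_genes * max_percent // 100
    return low, high


# Figures quoted in the text: 14,000-30,000 genes, 5%-10% encoding TFs.
low, high = predicted_tf_range(14_000, 30_000, 5, 10)
print(f"Predicted TFs per genome: {low} to {high}")
```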
The availability of complete genome sequences and the development of functional genomic technologies enable the assessment of questions relating to differential gene expression on a genome-wide scale. For example, the use of microarrays allows the profiling of genome-wide gene expression under different experimental conditions (Schena et al. 1995). The subsequent use of clustering algorithms enables the identification of genes with similar expression profiles that may be involved in similar biological processes (Eisen et al. 1998). After aligning the promoter sequences of such genes, cis-regulatory elements can be found that may be responsible for their coexpression (Tavazoie et al. 1999). Although microarray data provide informative gene expression “signatures” for tissues and/or organisms under various experimental conditions, these data fail to reveal the transcription regulatory mechanisms that control differential gene expression. There are several reasons for this. First, not all functional elements can be detected by promoter alignments. Second, the TFs that bind to the elements that can be identified may not yet have been experimentally characterized. Third, many TFs belong to larger TF families in which members share similar DNA-binding domains and, potentially, overlapping DNA recognition elements. As a result, even though a family of TFs can sometimes be inferred from aligned promoter sequences, the exact family member responsible for the coexpression may remain elusive. Fourth, expression profiling data do not discriminate between direct and indirect transcriptional effects that lead to differences in mRNA levels. Finally, mRNA levels measured by microarrays do not reflect transcription alone, but rather the balance between mRNA synthesis and degradation.
Thus, to gain insight into the transcriptional mechanisms that lead to differential gene expression, genes need to be experimentally “matched” with the TFs that regulate their expression. One method of doing this is by identifying protein-DNA interactions between regulatory TFs and regulatory DNA elements (e.g., cis-regulatory elements or gene promoters).
Several biochemical methods have been developed to identify protein-DNA interactions, including gel shift, DNase I footprinting, and chromatin immunoprecipitation (ChIP) assays (Latchman 1998; Orlando 2000). ChIP assays are particularly powerful because they allow the identification of protein-DNA interactions that occur in vivo. This approach involves the precipitation of a DNA-binding protein together with its associated DNA, and has been used to verify target gene binding by several TFs (Shang et al. 2000; Wells et al. 2002; Conkright et al. 2003). Initially, this method was used to characterize the interactions of individual TFs with one or a few of their target genes. To increase the throughput of the assay and to allow the unbiased identification of TF-DNA interactions throughout the genome, ChIP experiments have recently been combined with microarray technologies (Ren et al. 2000; Iyer et al. 2001). This combined method has been used to map protein-DNA interactions for 106 of the 141 predicted regulatory TFs in Saccharomyces cerevisiae (Lee et al. 2002). The data obtained were used to model yeast transcription regulatory networks. However, it is challenging to perform ChIP assays on a large scale using intact metazoan model organisms. This is because antibodies have to be generated for each TF, or, alternatively, transgenic strains expressing epitope-tagged TFs need to be produced. Both are time-consuming and therefore only feasible for a single TF or a handful of TFs. Also, it is difficult to detect interactions with metazoan TFs that are expressed at low levels, in a small number of cells, or during a narrow developmental time interval. In addition, ChIP usually cannot discriminate between different TF isoforms. Further, the analysis of ChIP samples with microarrays requires the generation of comprehensive arrays containing regulatory genomic sequences (e.g., promoters).
Finally, although ChIP experiments are valuable to answer the question, “Which DNA fragments does a TF of interest bind to?”, they are less suitable to address the converse question, “Which TFs bind to a DNA fragment (e.g., promoter) of interest?”
The yeast one-hybrid (Y1H) system is a suitable method to answer the second question because it allows the identification of proteins that can bind to DNA elements of interest, including cis-regulatory elements, origins of DNA replication, and telomeres (Li and Herskowitz 1993; Lehming et al. 1994; Kim et al. 2003). The Y1H system is conceptually similar to the yeast two-hybrid (Y2H) system that is used for the detection of protein-protein interactions (Fields and Song 1989). In the Y2H system, two hybrid proteins are used. The bait protein (X) is fused to a DNA-binding domain (DB), and the prey protein (Y) is fused to a transcription activation domain (AD). When X and Y physically interact with each other, a functional TF is reconstituted and reporter gene expression is activated. In the Y1H system, a single hybrid protein, AD-Y, is used, and reporter gene expression is activated when Y interacts with the DNA bait. Although many predicted regulatory TFs contain an intrinsic AD, several TFs have a repressor domain or no activation/repressor domain at all. In addition, DNA-binding proteins that do not function in transcription (e.g., replication and DNA repair proteins) do not contain an AD. To enable the identification of a variety of DNA-binding proteins, a strong, heterologous AD is added to the prey protein. So far, the Y1H system has been mainly used with multiple copies of short DNA elements (up to 30 bp) as DNA baits. To facilitate the high-throughput, unbiased identification of protein-DNA interactions, we developed a Gateway-compatible Y1H system. This system can be used with small DNA fragments (e.g., cis-regulatory elements) as well as with single copies of large DNA fragments (e.g., gene promoters). This is important because the use of promoters circumvents the need to identify functional cis-regulatory elements for a gene of interest.
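The reporter logic of the two systems described above can be summarized as a simple Boolean model. The sketch below is a deliberate simplification for illustration only: the `Prey` class, the function names, and the example proteins are hypothetical, and the model treats DNA binding and the presence of an AD as yes/no properties, ignoring expression levels, binding affinities, and any effect of a repressor domain on the fused AD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prey:
    """Hypothetical prey protein Y, described by two yes/no properties."""
    name: str
    binds_dna_bait: bool     # does Y bind the DNA bait?
    has_intrinsic_ad: bool   # does Y carry its own activation domain?


def y2h_reporter_on(x_interacts_with_y: bool) -> bool:
    # Y2H: DB-X and AD-Y reconstitute a functional TF only when X and Y
    # physically interact.
    return x_interacts_with_y


def y1h_reporter_on(prey: Prey, fused_to_ad: bool = True) -> bool:
    # Y1H: the prey must bind the DNA bait, and an activation domain must
    # be present, either the heterologous AD or an intrinsic one.
    has_ad = fused_to_ad or prey.has_intrinsic_ad
    return prey.binds_dna_bait and has_ad


# Illustrative prey proteins (assumed examples, not data from the paper).
preys = [
    Prey("activator", binds_dna_bait=True, has_intrinsic_ad=True),
    Prey("repressor", binds_dna_bait=True, has_intrinsic_ad=False),
    Prey("replication protein", binds_dna_bait=True, has_intrinsic_ad=False),
    Prey("non-binder", binds_dna_bait=False, has_intrinsic_ad=True),
]

for prey in preys:
    with_ad = y1h_reporter_on(prey, fused_to_ad=True)
    without_ad = y1h_reporter_on(prey, fused_to_ad=False)
    print(f"{prey.name}: AD fusion={with_ad}, no AD fusion={without_ad}")
```

In this model, the repressor and the replication protein activate the reporter only when fused to the heterologous AD, which is the rationale for adding a strong AD to every prey.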
The Gateway cloning system facilitates accurate, high-efficiency cloning of multiple DNA fragments into various vectors in vitro because it is based on recombination and does not require the use of restriction enzymes (Hartley et al. 2000; Walhout et al. 2000a). Previously, we developed a Y2H system that is compatible with the Gateway cloning system (Walhout and Vidal 2001). Combining Gateway with Y2H greatly enhanced the throughput of protein-protein interaction identification and analysis. Indeed, >5500 protein-protein interactions have already been identified using this system (Walhout et al. 2000a, 2002; Davy et al. 2001; Boulton et al. 2002; Li et al. 2004). Therefore, we expect that the simultaneous cloning of multiple DNA baits into Y1H vectors by Gateway cloning will greatly enhance the throughput of the Y1H method as well.
Acknowledgments
The authors thank Nele Gheldof, Job Dekker, Claude Gazin, and members of the Walhout lab for helpful discussions and critical reading of the manuscript. We thank Glenn Maston for providing the Gateway-compatible luciferase reporter vector, and Michael Brasch for helpful discussions. This work was supported by the NCI 4 R33 CA097516-02 grant awarded to M.V. and A.J.M.W. as part of a promoterome consortium that also includes Ian Hope (University of Leeds, UK). B.D. was supported by a Henri Benedictus-BAEF fellowship in Biomedical Engineering of the King Baudouin Foundation and the Belgian American Educational Foundation.
Footnotes
[The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper: G. Maston.]
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.2445504.






