A high-resolution map of the three-dimensional chromatin interactome in human cells
A large number of cis-regulatory sequences have been annotated in the human genome1,2, but defining their target genes remains a challenge3. One strategy is to identify the long-range looping interactions at these elements with the use of chromosome conformation capture (3C) based techniques4. However, previous studies lack either the resolution or coverage to permit a whole-genome, unbiased view of chromatin interactions. Here, we report a comprehensive chromatin interaction map generated in human fibroblasts using a genome-wide 3C analysis method (Hi-C)5. We determined over one million long-range chromatin interactions at 5–10kb resolution, and uncovered general principles of chromatin organization at different types of genomic features. We also characterized the dynamics of promoter-enhancer contacts upon TNF-α signaling in these cells. Unexpectedly, we found that TNF-α responsive enhancers are already in contact with their target promoters prior to signaling. Such pre-existing chromatin looping, which also exists in other cell types with different extra-cellular signaling, is a strong predictor of gene induction. Our observations suggest that the three-dimensional chromatin landscape, once established in a particular cell type, is rather stable and could influence the selection or activation of target genes by a ubiquitous transcription activator in a cell-specific manner.
We carried out Hi-C experiments to study the dynamic chromatin interactions in primary human fibroblast cells (IMR90) in response to transient TNF-α signaling. Combining the Hi-C data from IMR90 cells before and after 1 hr of TNF-α treatment, we obtained a total of ~3.4 billion uniquely mapped paired-end reads from 6 biological replicates in each condition, among which ~1.4 billion are intra-chromosomal reads (Supplementary Table S1–S2). In order to accurately identify chromatin looping interactions with high sensitivity and resolution, we devised an improved data filtering strategy6 based on the strand orientation of Hi-C paired-end reads (Supplementary Fig. 1–6, Supplementary Methods), which yielded over 500 million high-confidence read pairs (Supplementary Table S1–S2), each representing a legitimate ligation event between two restriction fragments on the same chromosome. Recognizing that some reads may be due to random collision events between restriction fragments4,7, we also estimated the expected contact frequency between any two restriction fragments, and then fitted a negative binomial model to assess the significance of the observed contact frequency (Supplementary Methods, Supplementary Fig. 7–9). Compared to previous methods4, our data analysis method permits detection of chromatin interactions at short distances. For example, we observed an asymmetric distribution of cis-contacts from highly expressed promoters to the immediate downstream gene bodies (Supplementary Fig. 10). This observation is reminiscent of a recent study showing interactions between a subset of exons and their promoters8. Interestingly, while such bias at promoters is correlated with elongation of RNA polymerase II, it remains when transcription elongation is blocked by the P-TEFb inhibitor flavopiridol (Supplementary Fig. 11), suggesting that the maintenance of promoter/gene body contacts is independent of active transcription.
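The significance test described above can be illustrated with a minimal sketch. It assumes a negative binomial parameterized by an expected contact count (`mean`) and a dispersion parameter; the function names, the parameterization and the example values are illustrative assumptions, not the fitted model from the Supplementary Methods.

```python
import math

def nb_log_pmf(k, mean, dispersion):
    """Log-probability of k counts under a negative binomial with the
    given mean and dispersion r (variance = mean + mean**2 / r).
    Both parameters are hypothetical inputs for illustration."""
    r = dispersion
    p = r / (r + mean)  # success probability in the (r, p) parameterization
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log1p(-p))

def nb_upper_tail(observed, mean, dispersion):
    """P(X >= observed): probability of at least this many contacts
    arising from the background model alone."""
    below = sum(math.exp(nb_log_pmf(k, mean, dispersion))
                for k in range(observed))
    return max(0.0, 1.0 - below)

# Toy example: 20 observed contacts where about 2 are expected by chance.
p_value = nb_upper_tail(20, mean=2.0, dispersion=5.0)
```

Computing the tail as one minus a partial sum loses precision for very small p-values; a production implementation would more likely rely on a library survival function.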
To accurately map chromatin interactions genome-wide at high resolution, we devised an algorithm (Supplementary Fig. 12) to identify statistically significant looping interactions centered on a given genomic region from the Hi-C contact matrix (Fig. 1a). Applying this method to the CCL2 locus, we were able to determine the distal enhancers and CTCF binding sites interacting with the CCL2 promoter (Fig. 1a–b). Our algorithm also identified a number of previously reported long-range chromatin interactions at the HoxA gene cluster9 and the SHH locus10, which were not readily observable from lower-resolution analysis (Supplementary Fig. 13–14). We further performed conventional 3C experiments to validate 6 pairs of long-range interactions identified at 5 different genes, and the results confirmed the reliability of our method (Fig. 1c, Supplementary Fig. 15).
We next applied the above algorithm to the 518,032 anchor regions in the human genome, each containing one or a few HindIII restriction fragments (fragments shorter than 2kb are merged) (Fig. 2a), and uncovered a total of 1,116,312 chromatin interactions with a false discovery rate (FDR) of 0.1 (Supplementary Data). We found that strong interactions supported by lower p-values and higher contact frequencies are more reproducible between biological replicates (Supplementary Fig. 16). Since interactions between loci separated by more than 2Mb are very rare (Fig. 2c), we limited our search to this genomic span. The sizes of the identified interacting DNA loci range from several hundred base pairs to over 50kb, with a median of 10.5kb (Fig. 2b). We were able to identify chromatin interactions that span genomic distances from several hundred base pairs to over 1 million base pairs (Fig. 2c). Consistent with previous reports that the genome is partitioned into megabase-sized topological domains11–13, we found that a majority of the identified chromatin interactions in the IMR90 cells are located within the same topological domains (Fig. 2d, Supplementary Fig. 18).
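As an illustration of calling interactions at an FDR of 0.1, the sketch below applies the Benjamini-Hochberg step-up procedure to a list of p-values. The text does not state which FDR procedure was used, so the choice of Benjamini-Hochberg, the function name and the toy p-values are all assumptions.

```python
def benjamini_hochberg(p_values, fdr=0.1):
    """Return one boolean per p-value: True where the candidate
    interaction is called significant at the given FDR.
    Assumed procedure; the original method is in Supplementary Methods."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant

# Toy p-values for four candidate interactions.
calls = benjamini_hochberg([0.001, 0.04, 0.03, 0.5], fdr=0.1)
```

With these toy values the first three candidates pass and the fourth does not, because every p-value up to rank 3 falls under its rank-scaled threshold.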
We next characterized the chromatin interactions centered on the cis-elements annotated in the IMR90 cell genome14 (Supplementary Data). Chromatin looping interactions are significantly enriched at cis-regulatory elements, especially active promoters, enhancers and CTCF binding sites, and are rare at inactive TSSs or regions with repressive chromatin domains marked by H3K27me3 (Fig. 2e–f)15; notably, both active and poised enhancers (distinguished by the status of H3K27ac)16–18 are found equally likely to engage in looping interactions (Fig. 2e–f), raising the possibility that DNA looping could take place after priming of enhancers by H3K4me1 but before further activation19. Interestingly, the chromatin interactions centered on CTCF binding sites tend to occur over a longer range than those at other types of cis-elements (Fig. 2g), confirming a recent result obtained from selected loci20. We also explored the spatial organization of cis-elements by examining preferential interaction between different classes of elements. The strongest enrichment was observed between H3K27me3-marked regions (Fig. 2h), consistent with the known compact 3D structure at this type of repressive chromatin domain21 (e.g. Supplementary Fig. 14a). The inactive promoters tend to interact with regions depleted of enhancers but enriched for the repressive mark H3K27me3 (Fig. 2h), while CTCF binding sites loop to both active and inactive promoters with no preference, as also reported previously15. It is also interesting to observe that CTCF binding sites seem to prefer promoters over enhancers (Fig. 2h), suggesting a specific role for CTCF in organizing long-range chromatin interactions to promoters.
Looping interactions between cis-regulatory elements and gene promoters have been shown to be important for transcription regulation at a number of loci3. The genome-wide identification of chromatin interactions in the IMR90 cells allowed us to examine this concept systematically. We first focused on the looping interactions anchored to gene promoters, and denote the identified interacting sequences as “promoter tethered regions” (PTRs) (Fig. 3a). In IMR90 cells, we found 57,585 PTRs identifying 29,132 enhancer-promoter (EP) pairs involving 6,133 active promoters and 15,432 distal active enhancers (Supplementary Data). Only around 25% of EP pairs are within a 50kb range, and ~57% span a genomic distance of 100kb or more, with a median distance of 124kb (Fig. 3b). We assigned 55% of distal enhancers to at least one active promoter, and 25% of enhancers to 2 or more active promoters (Fig. 3c, left panel). This result confirms the previous observation that promoters and enhancers often form complex networks to regulate transcription15. We further hypothesized that genes sharing common enhancers (denoted hub enhancers) are likely to have coordinated gene expression patterns. Indeed, genes sharing the same NF-κB responsive enhancers are more frequently induced together by TNF-α than expected by chance (Fig. 3d). As an example, CKAP2L and IL1A are induced simultaneously by TNF-α although they lack promoter-bound p65 peaks, and they share overlapping distal PTR regions containing multiple NF-κB binding sites (Fig. 3e). Similar examples can be found in other gene clusters co-induced by TNF-α treatment (Supplementary Fig. 19). These results therefore provide a molecular mechanism for coordinated gene expression of neighboring genes.
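The enhancer-promoter summary statistics above (fraction of enhancers linked to at least one promoter, fraction linked to two or more, and median pair distance) can be sketched as follows. The input layout, the function name and the toy identifiers are hypothetical; the real pairs are in the Supplementary Data.

```python
from collections import defaultdict
from statistics import median

def summarize_ep_pairs(pairs, all_enhancers):
    """pairs: iterable of (enhancer_id, promoter_id, distance_bp).
    all_enhancers: every distal enhancer considered, linked or not."""
    promoters_by_enhancer = defaultdict(set)
    distances = []
    for enhancer, promoter, distance in pairs:
        promoters_by_enhancer[enhancer].add(promoter)
        distances.append(distance)
    n = len(all_enhancers)
    n_targets = [len(promoters_by_enhancer.get(e, ())) for e in all_enhancers]
    # Hub enhancers: enhancers contacting two or more promoters.
    hubs = {e: sorted(ps) for e, ps in promoters_by_enhancer.items()
            if len(ps) >= 2}
    return {
        "frac_linked": sum(t >= 1 for t in n_targets) / n,
        "frac_hub": sum(t >= 2 for t in n_targets) / n,
        "median_distance": median(distances),
        "hubs": hubs,
    }

# Toy data: E1 is a hub enhancer shared by promoters P1 and P2.
toy_pairs = [("E1", "P1", 30_000), ("E1", "P2", 150_000), ("E2", "P2", 200_000)]
summary = summarize_ep_pairs(toy_pairs, ["E1", "E2", "E3", "E4"])
```

Genes listed under the same hub enhancer are the candidates for the coordinated induction discussed in the text.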
Interestingly, 46% of the active genes do not interact with any distal enhancer (Fig. 3c, right panel). Gene ontology analysis showed that these genes are enriched for housekeeping genes (Supplementary Fig. 20a). On the other hand, 54% of the active promoters engage in extensive looping interactions with enhancers (average 4.75 enhancers per gene, Fig. 3c, right panel), and they are enriched for genes related to biological pathways such as signal transduction (Supplementary Fig. 20b). This analysis suggests that housekeeping genes, despite being highly transcribed, do not engage many distal regulatory elements, whereas genes involved in cell-specific functions are under extensive control of distal regulatory sequences.
We next examined long-range looping interactions at transcriptional enhancers, focusing on those bound by the p65 subunit of the NF-κB transcription factor. Using ChIP-seq, we identified 15,621 p65 binding sites in the genome after TNF-α treatment, 2,315 (14.8%) of which can be classified as “active p65 binding sites” because they exhibit increased H3K27ac levels and eRNA expression upon TNF-α signaling (Supplementary Fig. 21–22). Consistent with their putative role in mediating transcriptional induction, these “active p65 binding sites” are enriched near TNF-α dependent genes (Supplementary Fig. 22c). We next tested if the long-range interactions between these p65 binding sites and their target promoters are correlated with transcriptional induction. Indeed, at the promoters that exhibit interactions with one or more active p65 binding sites, significantly higher levels of transcriptional induction were observed than at the promoters that do not interact with distal p65 binding sites (Fig. 3f), suggesting that the identified long-range chromatin interactions may play a key role in transcriptional regulation of the TNF-α inducible genes.
The high-resolution map of chromatin interactions may also improve the prediction of target genes of distal enhancers. Currently, a common practice is to assign distal enhancers to their nearest promoters, assuming one enhancer is linked to just one target gene (proximity approach). This approach, however, cannot explain all of the 828 TNF-α responsive genes. We found that 331 (40%) of these genes have one or more p65 binding sites within 2.5kb of their promoters, and 362 of the remaining genes can be assigned to one or more NF-κB binding sites by the proximity approach, still leaving 135 TNF-α induced genes unexplained. Using a recently published enhancer-promoter connection map22 based on correlated chromatin features across diverse tissues or cell types, we were able to link 10 of the 135 unexplained TNF-α inducible genes to distal NF-κB binding sites. Using the chromatin interactome map, 74 (55%) of the unexplained genes can be assigned to an NF-κB binding site (Supplementary Fig. 23). This result illustrates that the chromatin interactome map described here could be valuable for the study of long-range regulation of gene expression by transcription factors.
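The gene accounting in this paragraph can be checked with a few lines of arithmetic. All counts are taken from the text; only the variable names are introduced here for illustration.

```python
# Counts quoted in the text.
responsive_genes = 828      # TNF-alpha responsive genes
promoter_bound = 331        # p65 site within 2.5kb of the promoter
proximity_assigned = 362    # assigned to a distal site by nearest promoter
by_correlation_map = 10     # linked via the published correlation map
by_interactome = 74         # linked via the Hi-C chromatin interactome

# Genes left unexplained by promoter binding and the proximity approach.
unexplained = responsive_genes - promoter_bound - proximity_assigned

pct_promoter_bound = 100 * promoter_bound / responsive_genes
pct_by_correlation_map = 100 * by_correlation_map / unexplained
pct_by_interactome = 100 * by_interactome / unexplained
```

The subtraction reproduces the 135 unexplained genes, and the two rounded percentages match the 40% and 55% quoted above; the correlation-based map accounts for roughly 7% of the unexplained set.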
We found no obvious alterations of megabase-sized topological domains12 in IMR90 cells upon TNF-α treatment (for example, Supplementary Fig. 13). Since previous studies have shown that gene activation by enhancers is accompanied by alteration of chromatin interactions3,23,24, we expected that at shorter distances, binding of NF-κB to enhancers would induce looping interactions that bring the distal enhancers into proximity with target genes. To our surprise, we found that at the vast majority of TNF-α responsive enhancers, there is little change in DNA looping after treatment (Fig. 4a). These results suggest that in general, enhancer-promoter interactions already form in untreated cells, and these pre-existing DNA structures are not significantly altered by transient activation or repression of enhancers. 3C assays confirm that DNA looping exists at several loci prior to TNF-α treatment (Fig. 1e, Supplementary Fig. 15a–c). We further compared the normalized read counts anchored to the TNF-α activated enhancers at different ranges of genomic distances (Fig. 4b, Supplementary Methods). Consistent with the results in Fig. 4a, transient activation of p65-bound enhancers does not lead to significant changes in chromatin interactions (Fig. 4b). By contrast, the chromatin interactions (especially within a short genomic distance) at cell type specific enhancers are highly variable between cell types (Fig. 4c–d), suggesting the existence of specific chromatin interaction structures at cell type specific enhancers. Interestingly, the discrepancy between signal-dependent and cell-type specific enhancers is well correlated with the levels of H3K4me1 at those dynamically regulated enhancers: despite the quick induction of the H3K27ac mark at TNF-α responsive enhancers, the strength of the H3K4me1 signal is largely unchanged (Fig. 4e); on the other hand, cell type specific enhancers have highly cell-specific H3K4me1 occupancy (Fig. 4f).
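One way to picture the comparison of normalized read counts across distance ranges is the sketch below, which bins enhancer-anchored contacts by genomic distance and reports a log2 ratio of treated to untreated signal. It assumes simple sequencing-depth normalization and a pseudocount; the actual normalization is described in the Supplementary Methods, and the function name, bin edges and toy counts are hypothetical.

```python
import math
from bisect import bisect_right

def log2_change_by_distance(contacts, depth_before, depth_after,
                            bin_edges, pseudocount=1.0):
    """contacts: iterable of (distance_bp, reads_before, reads_after)
    for read pairs anchored at the enhancers of interest.
    Returns one log2(after / before) value per distance bin, using
    reads-per-million scaling (an assumed normalization)."""
    n_bins = len(bin_edges) - 1
    before = [0.0] * n_bins
    after = [0.0] * n_bins
    for distance, reads_before, reads_after in contacts:
        i = bisect_right(bin_edges, distance) - 1
        if 0 <= i < n_bins:  # ignore contacts outside the binned range
            before[i] += reads_before
            after[i] += reads_after
    ratios = []
    for b, a in zip(before, after):
        norm_b = b / depth_before * 1e6
        norm_a = a / depth_after * 1e6
        ratios.append(math.log2((norm_a + pseudocount) / (norm_b + pseudocount)))
    return ratios

# Toy input: the treated library is sequenced twice as deeply.
toy_contacts = [(5_000, 10, 20), (50_000, 8, 8)]
ratios = log2_change_by_distance(toy_contacts, 1e6, 2e6,
                                 bin_edges=[0, 20_000, 100_000, 2_000_000])
```

A ratio near zero in every bin corresponds to the "little change" scenario described for TNF-α responsive enhancers, while large deviations would indicate altered looping.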
Recently, pre-existing promoter-enhancer looping was reported at several loci induced by p53, FOXO3 and glucocorticoid receptor using the 4C approach20,25,26. Our genome-wide analysis of chromatin interactome maps in IMR90 cells suggests that this is likely a common rule rather than a special case. To further demonstrate its generality, we examined 6 additional promoter-enhancer pairs by 3C assays in four different cell types (IMR90, HUVEC, MCF7 and LNCaP cells) under different stimuli (IFN-γ, TNF-α, β-estradiol and 5α-dihydrotestosterone, respectively). In all of these examples, we found evident pre-existing promoter-enhancer contacts, and the looping interactions are largely unchanged after enhancer activation and target gene induction (Supplementary Fig. 15d, 24). Our results predict that pre-existing chromatin looping interactions could dictate the spectrum of the target genes for a transcription factor even before it is activated. Indeed, p65 binding sites looping to the promoters prior to induction are much more likely to result in transcriptional activation of the linked gene than otherwise (Fig. 4g). This trend is especially obvious when the p65 binding sites are located far from the linked promoters. Based on this observation, we conclude that pre-existing DNA-looping interactions between enhancers and promoters allow a ubiquitous, signal-dependent transcription factor to affect a selected set of genes in a cell type specific manner.
In summary, we demonstrated that enhancer/promoter interactions already form in each cell type prior to the binding of signal-dependent transcription factors, and they undergo little change during transient transcriptional activation. Several recent genome-wide studies have revealed that in different cell types, the repertoire of specific enhancers provides a unique context for the activation of different transcriptional programs in response to signal-dependent transcription factors including NF-κB19,27–30. Here, our results further suggest that targets of cell-specific enhancers are already hardwired into the chromatin architecture. We therefore propose that cell type-specific looping structure, by controlling the accessibility of the enhancers to their specific targets, may form an additional layer of regulation in determining the distinct transcription programs in different cell types.
Hi-C experiments were performed in human primary IMR90 fibroblasts. ChIP-Seq, GRO-Seq, or RNA-seq libraries were also generated from IMR90, HUVEC, MCF7 or LNCaP cells and sequenced on the Illumina Hi-Seq2000 platform. All the reads were mapped to the reference human genome (hg18). More information about the experiments and detailed descriptions of the Hi-C data analysis pipeline, including data filtering, normalization, statistical modeling and interaction calling, can be found in Supplementary Methods.
Supplementary Information is linked to the online version of the paper at www.nature.com/nature
Author Contributions: YL, FJ and BR designed the studies. YL conducted most of the experiments; FJ carried out the data analysis; JRD, ZY, AYL, CY, ADS and CE contributed to the experiments; SS contributed to the data analysis; FJ, YL and BR prepared the manuscript.
Author Information: All sequencing data described in this study have been deposited to GEO under the accession number GSE43070. Some sequencing data used in this study were previously published and accession numbers can be found in Supplementary Methods. All chromatin interactions called in IMR90 cells can be found in Supplementary Data.
We thank Christopher K Glass for sharing the GRO-Seq protocol, Samantha Kuan and Lee Edsall for assistance with high-throughput DNA sequencing and the initial processing. This work is supported by funds from the Ludwig Institute for Cancer Research, the California Institute of Regenerative Medicine (RN2-00905) and US National Institutes of Health (P50 GM085764-03).
- 1. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
- 2. Systematic localization of common disease-associated variation in regulatory DNA. Science (New York, N.Y.) 337, 1190–1195 (2012).
- 3. Genome organization and long-range regulation of gene expression by enhancers. Current opinion in cell biology 25, 387–394 (2013).
- 4. Exploring the three-dimensional organization of genomes: interpreting chromatin interaction data. Nature reviews 14, 390–403 (2013).
- 5. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science (New York, N.Y.) 326, 289–293 (2009).
- 6. Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nature methods 9, 999–1003 (2012).
- 7. A decade of 3C technologies: insights into nuclear organization. Genes & development 26, 11–24 (2012).
- 8. DNase I-hypersensitive exons colocalize with promoters and distal regulatory elements. Nature genetics (2013).
- 9. The dynamic architecture of Hox gene clusters. Science (New York, N.Y.) 334, 222–225 (2011).
- 10. Disruption of a long-range cis-acting regulator for Shh causes preaxial polydactyly. Proceedings of the National Academy of Sciences of the United States of America 99, 7548–7553 (2002).
- 11. Spatial organization of the mouse genome and its role in recurrent chromosomal translocations. Cell 148, 908–921 (2012).
- 12. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
- 13. Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148, 458–472 (2012).
- 14. Distinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell stem cell 6, 479–491 (2010).
- 15. The long-range interaction landscape of gene promoters. Nature 489, 109–113 (2012).
- 16. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proceedings of the National Academy of Sciences of the United States of America 107, 21931–21936 (2010).
- 17. Dynamic chromatin states in human ES cells reveal potential regulatory sequences and genes involved in pluripotency. Cell research 21, 1393–1409 (2011).
- 18. A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279–283 (2010).
- 19. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Molecular cell 38, 576–589 (2010).
- 20. Architectural Protein Subclasses Shape 3D Organization of Genomes during Lineage Commitment. Cell 153, 1281–1295 (2013).
- 21. Chromatin compaction by a polycomb group protein complex. Science (New York, N.Y.) 306, 1574–1577 (2004).
- 22. The accessible chromatin landscape of the human genome. Nature 489, 75–82 (2012).
- 23. Enhancer function: new insights into the regulation of tissue-specific gene expression. Nature reviews 12, 283–293 (2011).
- 24. The transcriptional interactome: gene expression in 3D. Current opinion in genetics & development 20, 127–133 (2010).
- 25. eRNAs are required for p53-dependent enhancer activity and gene transcription. Molecular cell 49, 524–535 (2013).
- 26. Integration of regulatory networks by NKX3-1 promotes androgen-dependent prostate cancer survival. Molecular and cellular biology 32, 399–414 (2012).
- 27. PU.1 and C/EBP(alpha) synergistically program distinct response to NF-kappaB activation through establishing monocyte specific enhancers. Proceedings of the National Academy of Sciences of the United States of America 108, 5290–5295 (2011).
- 28. Chromatin accessibility pre-determines glucocorticoid receptor binding patterns. Nature genetics 43, 264–268 (2011).
- 29. Master transcription factors determine cell-type-specific responses to TGF-beta signaling. Cell 147, 565–576 (2011).
- 30. Enhancers: multi-dimensional signal integrators. Transcription 2, 226–230 (2011).