Modified MNase-seq: Methods that have been used to prepare chromatin for profiling by chromatin immunoprecipitation (ChIP) typically have involved sonication, limiting resolution to a few hundred base pairs. Micrococcal nuclease (MNase) cleaves double stranded DNA, and then “nibbles” on the exposed ends until it reaches an impediment such as a nucleosome. The DNA fragments protected from MNase by the nucleosome can be recovered, sequenced, and aligned to known sequence to give a precise map of nucleosome positions. By substituting the typical mononucleosome-sized gel-purification step of MNase-seq with an Ampure bead clean-up, and preparing paired-end sequencing libraries, we adapted a method for nucleosome mapping into one that is suitable for all DNA-binding proteins1. We applied this simple protocol to budding yeast and Drosophila, and mapped both nucleosomes and subnucleosome-sized particles at base-pair resolution throughout the genome. To characterize DNA-binding, we introduced the “V-plot”, a dot-plot of the length of each fragment versus its position in the genome. This procedure provides base-pair-resolution mapping of protected and exposed regions at and around binding sites, and also determination of the degree to which they are flanked by phased nucleosomes and subnucleosome-sized particles. Separation of nucleosomes from other chromatin proteins based on fragment size provides a general strategy for complete epigenome characterization from a single sample.
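As a rough illustration of the V-plot idea described above, the following Python sketch tabulates paired-end fragments by their position relative to a binding site and by their length. It is a minimal sketch under stated assumptions, not the published implementation: the function name `vplot_counts`, the use of the fragment midpoint as the "position", and the `window` and `max_len` defaults are all hypothetical choices made for illustration.

```python
from collections import Counter

def vplot_counts(fragments, site_center, window=100, max_len=300):
    """Tabulate fragments as (midpoint offset from site, fragment length).

    fragments   : iterable of (start, end) half-open coordinates on one chromosome
    site_center : coordinate of the binding site of interest (assumed known)
    window      : hypothetical half-width of the plotted region, in bp
    max_len     : hypothetical upper limit on fragment length, in bp
    """
    counts = Counter()
    for start, end in fragments:
        length = end - start
        # Assumption: a fragment's position is taken to be its midpoint.
        midpoint = start + length // 2
        offset = midpoint - site_center
        if abs(offset) <= window and length <= max_len:
            counts[(offset, length)] += 1
    return counts

# Toy data: one short fragment centered on the site, one nucleosome-sized
# fragment nearby, and two fragments outside the plotted window.
fragments = [(980, 1020), (930, 1080), (1100, 1250), (5000, 5150)]
counts = vplot_counts(fragments, site_center=1000)
for (offset, length), n in sorted(counts.items()):
    print(offset, length, n)
```

Each key of the returned table corresponds to one dot position in the plot (x = offset, y = length), so short fragments centered on a site and longer flanking fragments fall in different regions of the plot.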
ORGANIC profiling: We next added chromatin affinity-purification to our MNase-seq protocol for a general native ChIP method, which we later termed “ORGANIC” (for Occupied Regions of Genomes from Affinity-purified Naturally Isolated Chromatin), and applied this method to the budding yeast centromere2,3. Relative to the Cse4 (yeast CenH3) X-ChIP-seq dataset from the Snyder group published in 2009, we observed a 2-order-of-magnitude improvement in both resolution (~300 bp to 1-2 bp) and dynamic range (200x to 25,000x). Importantly, we were able to delineate the known tripartite organization of the ~120-bp yeast centromere in terms of the epigenome profile, with the centromeric nucleosome entirely confined within the ~80-bp CDEII (Centromere DNA Element II). In contrast, the X-ChIP data placed Cse4 maxima just outside of the functional centromere, perhaps an example of the “hyperChIPpable” artifact recently described for this and other standard X-ChIP budding yeast datasets by Teytelman et al. (2013)4.
We next turned our attention to the SWI/SNF family of ATP-dependent DNA translocases, which are abundant but dynamic components of the epigenome and are often difficult to map by ChIP. Using the ORGANIC protocol, postdoc Gabe Zentner mapped the genomic distributions of the yeast Isw1, Isw2, and Chd1 remodelers5. Although these remodelers act in gene bodies, we found that they are also highly enriched at nucleosome-depleted regions (NDRs), where they bind to extended regions of DNA adjacent to particular transcription factors. Surprisingly, catalytically inactive remodelers showed similar binding patterns. We found that remodeler occupancy at NDRs and gene bodies is associated with nucleosome turnover and transcriptional elongation rate, suggesting that remodelers act on regions of transient nucleosome unwrapping or depletion within gene bodies subsequent to transcriptional elongation. Further evidence for the applicability of ORGANIC profiling to SWI/SNF family members came from our demonstration that Mot1, which mobilizes TATA-binding protein (TBP), approaches TBP from upstream6, thus providing in vivo confirmation of the Mot1/TBP structural model7. Work on remodelers and TBP will be continued by Gabe Zentner, who started his own lab at Indiana University as Assistant Professor of Biology.
The ultimate demonstration of the power of our ORGANIC approach came from using it to map transcription factors far more accurately than any previous method. It has been widely assumed that the binding of a transcription factor (TF) to DNA requires covalent fixation in order to avoid detachment during affinity purification. However, the classical 1985 study of Solomon and Varshavsky that is often cited as a precursor to ChIP showed that formaldehyde was unable to fix the lac repressor to the lac operator in vitro even after long incubation8. Indeed, the extent to which formaldehyde fixation covalently fixes TFs to DNA as opposed to the nearest nucleosome is not clear: for example, it has been estimated that ~2/3 of the TF sites mapped by the ENCODE project are bound indirectly by the TF, not directly at the motif that the TF has evolved to bind9. We reasoned that high-affinity binding of a TF should persist under the low salt conditions that we use for MNase digestion of intact nuclei, and that in any case, diffusion to exposed DNA during or after MNase treatment is unlikely because of rapid removal of exposed DNA by the nuclease. Indeed, when graduate student Siva Kasinathan applied ORGANIC profiling to budding yeast TFs, he found that direct TF-chromatin interactions were mapped at high resolution and with high specificity and sensitivity10. De novo motif discovery revealed that the large majority of ORGANIC binding site calls have the expected consensus motif and correspond to DNase I footprints, suggestive of direct in vivo binding. Our study also demonstrated the flexibility of ORGANIC profiling in mapping genomic binding sites of proteins with structurally distinct DNA-binding domains from different species and showed that the specificity of ORGANIC maps can be modulated by varying salt concentration.
In a direct comparison between ORGANIC and ChIP-exo11, the only previous base-pair resolution method, we found that almost twice as many ORGANIC yeast Reb1 sites contained the known Reb1 motif, comprising 99.3% of the total, whereas the Reb1 motif was found in only ~60% of total ChIP-exo sites, and 98% of primary sites. The simplicity and high efficiency of ORGANIC profiling make it well-suited for large-scale profiling of TF occupancy.
High Resolution X-ChIP: Although ORGANIC profiling provides exceptionally high dynamic range, it is less suitable for proteins that are difficult to solubilize, such as RNAPII. Accordingly, we introduced a simple modification of X-ChIP-seq that proved superior to standard X-ChIP and ChIP-exo in resolution, specificity and efficiency12. Using a dominant negative approach, we have applied this protocol to delineate the different functions of the mouse Chd1 remodeler at promoters and within gene bodies. We found that RNAPII requires Chd1 for promoter escape13. Thus, positive torsion, H2A.Z and Chd1 contribute to overcoming nucleosome barriers. Our recent finding that the yeast RSC nucleosome remodeler partially unwraps the +1 nucleosome14 further focuses our attention on how transcription-induced torsion, H2A.Z, and nucleosome remodeling interact at the +1 nucleosome to disrupt nucleosomes and modulate gene expression. The simplicity and high efficiency of our new ChIP-seq methods should enable small labs to obtain high-resolution profiles far superior to those available from ENCODE and other epigenomics mega-projects.
ChEC-seq: Postdoc Gabe Zentner, now at Indiana University, modified Chromatin Endogenous Cleavage (ChEC) for a DNA sequencing read-out (ChEC-seq)15. ChEC uses fusion of a protein of interest to MNase to target calcium-inducible cleavage in intact cells. Acquisition of ChEC-seq data on a seconds-to-minutes time-scale revealed two classes of sites for yeast TFs, one displaying rapid cleavage close to one side of consensus motifs and the second showing slow cleavage at non-motif sites. Remarkably, fast and slow sites showed nearly identical DNA shape (minor-groove width, helical twist, propeller twist and roll) profiles, which implies that time-resolved ChEC-seq detects both high-affinity interactions of TFs with consensus motifs and low-affinity sites preferentially sampled by TFs during scanning for DNA shape features. In our initial study, we had relied on binding-motif alignment of fast and slow sites, and slow sites were enriched in sites near fast sites16. A re-analysis by graduate student Siva Kasinathan17 found that fast and slow sites aligned solely by shape rather than motif and selected to be at least 100-500 bp away from the nearest TF ChEC-seq maximum also showed strong correlation of shape between fast and slow sites, but not with random sites or free MNase sites.
With collaborators Sebastian Grünberg and Steven Hahn18, we tethered MNase to the Med8 and Med17 subunits of the head module of Mediator, a complex 25-subunit transcription co-activator whose genomic binding sites have been difficult to resolve by conventional ChIP. In the resultant ChEC-seq data, we found that Mediator associates with the majority of upstream activating sequences in the yeast genome, rather than with core promoters or gene bodies, and that occupancy is only partially correlated with transcription levels. The distance from Mediator to the transcription start site differed for SAGA-dependent and TFIID-dependent promoters. Mediator recruitment was partially dependent on the Taf1 subunit of TFIID at both TFIID and SAGA promoters, and conversely, ChEC-seq peaks with MNase-tethered Taf1 at the core promoter were partially dependent on Mediator. ChEC-seq is a simple, efficient method with high spatio-temporal resolution and orientation sensitivity that we anticipate will be broadly applicable for genome-wide profiling of protein-DNA dynamics.
CUT&RUN: We have recently developed a widely applicable chromatin profiling method called CUT&RUN (cleavage under target and release using nuclease)19. Here intact permeabilized cells are incubated with an antibody against a transcription factor or histone modification, after which a protein A-MNase fusion directs cleavage of the chromatin footprint. In contrast to chromatin immunoprecipitation (ChIP), CUT&RUN is performed in situ in the absence of formaldehyde crosslinking and without total genome fragmentation and solubilization. This yields high-resolution footprints of protein-DNA interactions with minimal background, so much so that we can profile abundant histone modifications with as few as 100 cells.
We first used CUT&RUN to map the budding yeast transcription factors Abf1 and Reb1 and compared the results with ORGANIC profiles10. We found close correspondence of CUT&RUN sites to ORGANIC sites with only 10% of the number of paired-end reads. CUT&RUN mapped Abf1 and Reb1 sites to a 20 bp footprint with near-base pair resolution. In addition to releasing small transcription-factor-bound fragments, CUT&RUN also releases larger fragments protected by adjacent nucleosomes, allowing limited probing of chromatin structure around the target site. CUT&RUN targeted to larger chromatin complexes, such as the Sth1 component of the RSC complex, gave results similar to ORGANIC profiling, but with much larger yields. By comparing soluble and insoluble fractions we were able to map the insoluble kinetochore using either the centromere-specific histone H3 (Cse4) or even the insoluble fraction of H2A, which is maximal at the centromeres.
In human K562 cells, Pete Skene applied CUT&RUN to transcription factors CCCTC-binding factor (CTCF), Myc and Max. Binding sites were highly concordant with previous X-ChIP from the ENCODE project or our own data12, but gave higher dynamic range. Similar results were obtained targeting domains of the repressive chromatin mark H3K27me3, which can serve as a convenient positive control in CUT&RUN experiments. Concordance with cross-linking ChIP suggests that, like X-ChIP, CUT&RUN may identify indirect sites brought into contact with direct sites through protein-protein contacts. CUT&RUN detected all CTCF sites detected by native ChIP, which detects only direct sites, but also about ten times as many CTCF sites also present in X-ChIP that are presumably from indirect contacts. To confirm this, we compared CUT&RUN sites to CTCF ChIA-PET contact sites, finding that all high-scoring ChIA-PET fragments overlap direct and indirect CUT&RUN sites. Thus, comparing CUT&RUN with native ChIP can distinguish direct and indirect contacts at near base pair resolution.
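The comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, assuming that sites are given as simple (chromosome, start, end) intervals and that any overlap with a native ChIP site marks a CUT&RUN site as direct; the function names `overlaps` and `classify_sites` and the toy coordinates are hypothetical, and a real analysis would depend on the peak caller and thresholds used.

```python
def overlaps(a, b):
    """Return True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def classify_sites(cutrun_sites, native_sites):
    """Label each CUT&RUN site 'direct' if it overlaps a native ChIP site.

    Assumption: presence in native ChIP is used as a proxy for direct
    binding; CUT&RUN sites without native ChIP support are labeled
    'indirect'.
    """
    labels = {}
    for site in cutrun_sites:
        is_direct = any(overlaps(site, native) for native in native_sites)
        labels[site] = "direct" if is_direct else "indirect"
    return labels

# Toy coordinates, invented for illustration only.
cutrun = [("chr1", 100, 120), ("chr1", 500, 520), ("chr2", 100, 120)]
native = [("chr1", 110, 130)]
for site, label in classify_sites(cutrun, native).items():
    print(site, label)
```

The linear scan over native sites is adequate for a toy example; genome-scale site lists would normally be sorted or indexed first.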
We have continued to improve and adapt our protocol. Derek Janssens automated CUT&RUN (AutoCUT&RUN) so that a 96-well format can be used to profile chromatin for high-throughput samples such as in a clinical setting20. AutoCUT&RUN was used for cell-type specific gene activity and enhancer profiling of the human embryonic stem cell line H1 and the leukemia cell line K562, based on histone modifications and transcription factors that closely correlate with transcriptional activity. AutoCUT&RUN was also applied to frozen tissue samples of tumor xenografts, which were highly correlated with the diffuse midline glioma cell lines from which they derived, while different tumor subtypes showed differential activity at 5000 promoters, suggesting that frozen samples can be used to distinguish tumor subtypes20.
Terri Bryson constructed a 6xHis and HA-tagged protein A-protein G-MNase fusion (pAG-MNase) that allows direct binding of mouse antibodies that bind poorly to protein A, eliminating the need for a secondary antibody21. The His tag allows purification of pAG-MNase with a commercial kit, while the HA tag can be used for pulling out pAG-MNase chromatin complexes for CUT&RUN.ChIP. Mike Meers developed low salt, high calcium conditions that prevent diffusion of released complexes into the supernatant, allowing for longer digestion times and increased yields without increased cleavage at non-specific accessible sites. Mike also developed Sparse Enrichment Analysis for CUT&RUN (SEACR), a peak-calling algorithm that takes advantage of position and fragment spanning information in CUT&RUN data to improve peak-calling and domain detection.
CUT&Tag: Hatice Kaya-Okur developed a similar enzyme tethering strategy for chromatin profiling of histone modifications, transcription factors, and RNA polymerase II, using a fusion of protein A with Tn5 transposase pre-loaded with sequencing adaptors22. Activation of the transposase within K562 cells ligates adaptors to chromatin fragments around the antibody to which the transposase is tethered, and amplification of fragments generates sequencing-ready libraries with high resolution and very low background. CUT&Tag using antibody to engaged RNA polymerase II was validated by close correspondence of CUT&Tag peaks to PRO-seq data, which map the 5′ end of RNA polymerase II transcripts by an independent methodology. Using antibodies to the transcription factor NPAT, which binds to the histone genes, or to the abundant factor CTCF, transcription factor binding could be easily distinguished from non-specific cutting at accessible chromatin, as determined by ATAC-seq, by the vast difference in read coverage between known binding sites and ATAC-seq sites. Hatice adapted CUT&Tag for single cell profiling of H3K27me3 by using gentle centrifugation to collect bulk cells between steps rather than Concanavalin A magnetic beads, followed by dispensing single cells into nanowells of a 5184-well chip using the ICELL8 nano-dispensing system. Despite sparse data, the individual cell profiles fall within the H3K27me3 domains defined by bulk profiling.
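One simple way to quantify the last observation is to compute, for each cell, the fraction of its fragments that fall within bulk-defined domains. The Python sketch below is a hypothetical illustration rather than the analysis used in the study: the function name `fraction_in_domains`, the interval representation, and the toy coordinates are all assumptions.

```python
def fraction_in_domains(cell_fragments, bulk_domains):
    """Fraction of one cell's fragments that overlap any bulk domain.

    cell_fragments : list of (chrom, start, end) half-open intervals
    bulk_domains   : list of (chrom, start, end) domains from bulk profiling
    Assumption: a fragment counts as inside if it overlaps a domain at all.
    """
    if not cell_fragments:
        return 0.0
    inside = 0
    for chrom, start, end in cell_fragments:
        if any(chrom == d_chrom and start < d_end and d_start < end
               for d_chrom, d_start, d_end in bulk_domains):
            inside += 1
    return inside / len(cell_fragments)

# Toy data, invented for illustration: two of three fragments lie in a domain.
domains = [("chr1", 1000, 5000)]
cell = [("chr1", 1200, 1400), ("chr1", 4900, 5100), ("chr1", 9000, 9200)]
print(fraction_in_domains(cell, domains))
```

A value near 1 for most cells would be consistent with sparse single-cell profiles falling within the bulk domains; in practice the value would need to be compared against the fraction expected for randomly placed fragments.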