Recent Research

In recent years, we have developed new chromatin profiling tools both to enable our continuing studies of chromatin dynamics and to lower the barrier to entry into the increasingly high-tech field of epigenomics. My lab has pioneered enzyme-tethering strategies beginning with DamID1 in 2000, re-emerging with ChEC-seq2 in 2015, followed by CUT&RUN3 in 2017, CUT&Tag4 in 2019, CUTAC5 in 2020 and CUT&Tag2for1 and MulTI-Tag6 in 2021, including adaptation of these technologies to full automation7,8 and scalable single-cell chromatin profiling9. Our technological advances have led us to a deeper understanding of the competition between transcription factors and nucleosomes behind the replication fork10,11, of nucleosomal unwrapping intermediates driven by RNA Polymerase II (RNAP2)12, of transcription factor “pioneering”13, of nucleosome depletion by SWI/SNF remodelers14 and RNAP215, and of chromatin deregulation in cancer7,16,17,18,19. In the meantime, our CUT&RUN/Tag technologies have already been adopted by several hundred laboratories and have fueled development of several commercial products and kits. My laboratory's outreach activities include our COVID-responsive CUT&Tag@home project20, chosen by The Scientist magazine as one of the top technical advances of 202021.

Work in the lab continues to be sharply focused on protein-DNA interaction dynamics, “where the rubber hits the road”. Our technology development efforts remain focused on better ways of addressing dynamics at the interfaces between DNA and the proteins and protein complexes involved in these fundamental genetic processes, with the expectation that our methods will continue to be adopted by researchers working on diverse biological problems22.

ChEC-seq -> CUT&RUN -> CUT&Tag -> CUTAC -> CUT&Tag2for1 and MulTI-Tag

Figure 1: Differences between ChIP-seq, CUT&RUN and CUT&Tag3.

ChEC-seq: The dominant technology for mapping protein-DNA interactions has been Chromatin Immunoprecipitation (ChIP) since the mid-1980s23, with successive advances in read-out platforms including PCR, microarrays and ‘Next-generation’ DNA sequencing. ChIP begins with solubilization of chromatin, typically by sonication of cross-linked cells, followed by immune precipitation with an antibody (Figure 1a). In contrast, enzyme tethering methods are performed in intact cells or nuclei. For example, to perform Laemmli’s ChEC (Chromatin Endogenous Cleavage) method24, targeted chromatin proteins are fused to Micrococcal Nuclease (MNase), which is activated in situ by addition of Ca++ to permeabilized cells, taking advantage of the base-pair resolution possible with MNase. In 2015 we adapted ChEC for genome-wide application (ChEC-seq)2. ChEC-seq has since gained popularity in the yeast chromatin field, where strains are available with epitope tags on chromatin proteins and TFs. ChEC-seq projects moved to the laboratory of former postdoc Gabriel Zentner, Assistant Professor at Indiana University25.

CUT&RUN: The success of ChEC-seq encouraged us to adapt Laemmli’s ChIC (Chromatin Immuno-Cleavage) method24. ChIC uses a protocol similar to that of ChEC, except that instead of tethering MNase via a fusion to the target protein, the enzyme is purified as a fusion with Protein A, which binds avidly to most antibodies (Figure 1b3). In addition, our simple workflow using magnetic beads makes the method suitable for full automation. Post-doc Peter Skene, who helped develop CUT&RUN, left the lab in 2017 and is currently Director of Molecular Biology and Biochemistry at the Allen Institute of Immunology.

CUT&Tag: Post-doc Hatice Kaya-Okur modified the CUT&RUN protocol to utilize a Protein A-Tn5 (pA-Tn5) fusion protein, where Tn5 is the cut-and-paste transposase used in Illumina’s Nextera system and in ATAC-seq26 (Figure 1c). CUT&Tag (Cleavage Under Targets & Tagmentation)4 eliminates the library preparation step, while providing even lower backgrounds than CUT&RUN. Our outreach efforts over the past two years have helped to gain rapid acceptance for CUT&Tag in both academia and industry. We later simplified CUT&Tag29 so that all steps from nuclei to sequencing-ready libraries are performed in single PCR tubes in a day or on a general-purpose robot7. In 2019, Hatice Kaya-Okur joined the Altius Institute for Biomedical Sciences as an Altius Scholar.

Figure 2: a) CUTAC tethers Tn5 to either H3K4me25 or RNA Pol2 Ser5 phosphate15, where the active site of Pol2 is ~130 bp from accessible “open” chromatin sites genome-wide. b) CUT&Tag2for1 deconvolves CUTAC signals from Pol2S5p and H3K27me3 mixtures based on fragment size and density. c) MulTI-Tag sequentially binds and tagments barcoded-antibody/Tn513.

CUTAC: ChEC-seq, CUT&RUN and CUT&Tag were expressly intended to replace ChIP-seq. In contrast, our novel chromatin accessibility profiling method (CUTAC, for Cleavage Under Targeted Accessible Chromatin) began with a serendipitous observation during my CUT&Tag@home project. When using an antibody to H3K4me2, which marks nucleosomes flanking both promoters and enhancers, and performing the Mg++-catalyzed tagmentation step in low salt, I noticed that antibody-tethered Tn5 integrated into accessible chromatin sites genome-wide5. Chromatin accessibility mapping using CUTAC is specific for H3K4me2 and H3K4me3, but not for any other tested histone modification, and for RNA Polymerase II Serine-5 phosphate (Pol2S5p)15 (Figure 2a). The distribution of CUTAC peaks closely resembles that of ATAC-seq peaks, with especially high signal-to-noise attributable to tethering of Tn5 to nearby epitopes, with Pol2S5p-CUTAC providing the best accessible chromatin data quality. The close correspondence between Pol2S5p-CUTAC and chromatin accessibility mapping implies that accessibility is coupled to Pol2 pausing, and that promoters and enhancers share the same basic chromatin configuration30.

CUT&Tag2for1: Technological progress in single-cell read-out technologies has fueled interest in “Multi-OMICs”, where two different modalities, such as RNA-seq and ATAC-seq, are performed in the same cells. However, multimodal methods require complicated workflows and deconvolution methods to take advantage of multiple modalities for cell-state identification. Pol2S5p-CUTAC releases mostly subnucleosome-sized fragments corresponding to peaks of TF binding at promoters and enhancers, whereas H3K27me3-marked nucleosomes are in broad domains. We have taken advantage of these differences to use two antibodies for CUT&Tag simultaneously and then use a Bayesian deconvolution strategy to computationally separate active regulatory sites from developmentally silenced Polycomb domains based on fragment size and feature width (Figure 2b). The high efficiency of CUTAC using these two antibodies in a mixture has allowed us to profile the active and repressive regulomes in the same single cells (Janssens, D.H., Wu, S.J., Meers, M., Ahmad, K., Henikoff, J.G. & Henikoff, S. Simple multi-factorial chromatin profiling for active and repressive regulomes, Genome Biology, 2022).
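
To make the idea of separating the two signals by fragment size concrete, the following is a minimal Python sketch. It is not the published CUT&Tag2for1 implementation: it assumes a simple two-component model in which fragment lengths from each target follow a Gaussian distribution, and it ignores feature width entirely. The component means, standard deviations and priors, and the function names `log_normal_pdf` and `region_posterior`, are hypothetical values and names chosen only for illustration.

```python
import math

# Hypothetical fragment-length models (bp) for the two targets.
# These numbers are illustrative assumptions, not fitted or published values.
COMPONENTS = {
    "Pol2S5p": {"mean": 80.0, "sd": 25.0, "prior": 0.5},    # mostly subnucleosomal
    "H3K27me3": {"mean": 170.0, "sd": 30.0, "prior": 0.5},  # mostly nucleosomal
}


def log_normal_pdf(x, mean, sd):
    """Log density of a Gaussian, computed in log space to avoid underflow."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd * math.sqrt(2.0 * math.pi))


def region_posterior(fragment_lengths):
    """Posterior probability of each target for one region, via Bayes' rule.

    Fragments in the region are treated as independent draws from a single
    component, so their log-likelihoods add.
    """
    log_joint = {}
    for name, params in COMPONENTS.items():
        total = math.log(params["prior"])
        for length in fragment_lengths:
            total += log_normal_pdf(length, params["mean"], params["sd"])
        log_joint[name] = total

    # Normalize with the log-sum-exp trick for numerical stability.
    peak = max(log_joint.values())
    norm = sum(math.exp(v - peak) for v in log_joint.values())
    return {name: math.exp(v - peak) / norm for name, v in log_joint.items()}


if __name__ == "__main__":
    print(region_posterior([60, 75, 90]))     # short fragments
    print(region_posterior([160, 180, 200]))  # nucleosome-sized fragments
```

Under these assumed parameters, a region covered mostly by short fragments is assigned to the Pol2S5p component and a region of nucleosome-sized fragments to the H3K27me3 component. A realistic deconvolution would also need to model feature width and estimate the component parameters from the data.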

MulTI-Tag: Post-doc Michael Meers has developed Multiple Targets Identified via Tagmentation (MulTI-Tag), a CUT&Tag-based approach that uses identifying barcodes to profile multiple chromatin-associated proteins in the same individual cells6 (Figure 2c). MulTI-Tag is as efficient as single-antibody CUT&Tag both in bulk and in single cells and represents a landmark advance in single cell chromatin profiling. Mike will be taking the MulTI-Tag project with him to continue his pioneer factor project13 (described below) when he starts his own lab, while we will continue using CUTAC and CUT&Tag2for1 in our research.

Figure 3: Metabolic labeling with EdU followed by ‘click’ chemistry to attach biotin, MNase digestion, streptavidin pulldown and MINCE-seq library preparation maps newly replicated chromatin11.

Nucleosome dynamics behind the replication fork

Every nucleosome across the genome must be disrupted and reformed when the replication fork passes, but how chromatin organization is re-established following replication was unknown. To address this problem, post-doc Srinivas Ramachandran developed a metabolic labeling method using 5-Ethynyl-2'-deoxyuridine (EdU) uptake followed by MNase-seq and ‘click’ chemistry to characterize the genome-wide location of nucleosomes and other chromatin proteins behind replication forks at high temporal and spatial resolution11 (Figure 3). We found that the characteristic chromatin landscape at Drosophila promoters and enhancers is lost upon replication. The most conspicuous changes are at promoters that have high levels of RNAP2 stalling and DNA accessibility and show specific enrichment for the BAF (Brahma-associated factor) remodeler complex. Enhancer chromatin is also disrupted during replication, suggesting a role for transcription factor (TF) competition in nucleosome re-establishment. Thus, the characteristic nucleosome landscape emerges from a uniformly packaged genome by the action of TFs, RNAP2, and remodelers minutes after replication fork passage.

Figure 4: Transcription produces asymmetrically unwrapped nucleosomal intermediates12.

Nucleosome disruption during transcription

Nucleosomes are disrupted during transcription, but the structural intermediates during nucleosome disruption in vivo had been unknown. To identify transcriptional intermediates, Srinivas Ramachandran mapped subnucleosomal protections in Drosophila cells using MNase-seq and CUT&RUN. At the first nucleosome position downstream of the transcription start site, we identified unwrapped intermediates, including hexasomes that lack either proximal or distal contacts. Inhibiting topoisomerases or depleting histone chaperones increased unwrapping, whereas inhibiting release of paused RNAP2 or reducing RNAP2 elongation decreased unwrapping (Figure 4). Our results indicated that positive torsion generated by elongating RNAP2 causes transient loss of histone-DNA contacts. Using this “structural epigenomics” approach, we found that nucleosomes flanking human CTCF insulation sites are similarly disrupted.

We also identified diagnostic subnucleosomal particle remnants in cell-free human DNA data as a relic of transcribed genes from apoptosing cells. Thus, identification of subnucleosomal fragments from nuclease protection data represents a general strategy for structural epigenomics. Cell-free DNA and structural epigenomics projects have moved to the lab of former post-doc Srinivas Ramachandran, who is currently Assistant Professor at U. Colorado HSC.

Figure 5: Both H3K4me2 and Pol2Ser5p CUTAC robustly correspond to ENCODE ATAC-seq sites relative to the best available ATAC-seq data (Omni-ATAC).

Our discovery that RNAP2S5p-CUTAC maps chromatin accessibility provides direct evidence that paused RNAP2 is engaged immediately adjacent to ATAC-seq and DNase-seq peaks at enhancers and promoters genome-wide (Figure 5). CUTAC replaces the “open chromatin” metaphor for gene regulatory elements based on unrelated enzymatic and physical accessibility assays with a rigorous definition based on the well-established role of Pol2 pausing in gene regulation5,15.


Figure 6: Model for chromatin dynamics at yeast promoters14.

Dynamic maintenance of nucleosome-depleted regions by SWI/SNF remodelers

The classic view of nucleosome organization at active promoters is that two well-positioned nucleosomes flank a nucleosome-depleted region (NDR). However, this view has been disputed by contradictory reports as to whether wider (>150 bp) NDRs instead contain unstable, micrococcal nuclease-sensitive (‘fragile’) nucleosomal particles. To determine the composition of fragile particles, post-doc Sandipan Brahma applied CUT&RUN.ChIP, in which targeted nuclease cleavage and release is followed by chromatin immunoprecipitation. He found that fragile particles represent the occupancy of the RSC (Remodeling the Structure of Chromatin) complex and RSC-bound, partially unwrapped nucleosomal intermediates. Sandipan also found that general regulatory factors (GRFs) bind to partially unwrapped nucleosomes at these promoters. We proposed that RSC binding and its action cause nucleosomes to unravel, facilitate subsequent binding of GRFs, and constitute a dynamic cycle of nucleosome deposition and clearance at the subset of wide Pol2 promoter NDRs (Figure 614). Remodeler-specific projects will move to the lab of Sandipan Brahma, currently a K99/R00 post-doctoral fellow.

Figure 7: Using CUT&RUN to directly test genome-wide nucleosome binding predicted by the pioneer factor hypothesis13. Top: Experimental scheme; Bottom: During differentiation 29% of FoxA2 sites show pioneering, but only a handful of sites for the CTCF control.

Pioneer factor binding in vivo

Although the in vitro structural and in vivo spatial characteristics of TF binding are well defined, TF interactions with chromatin and other companion TFs during development were poorly understood. To analyze such interactions in vivo, post-doc Michael Meers used CUT&RUN to profile several TFs across a time course of human embryonic stem cell differentiation, and studied their interactions with nucleosomes and co-occurring TFs by Enhanced Chromatin Occupancy (EChO), a computational strategy for classifying TF interactions with chromatin (Figure 713). EChO showed that multiple individual TFs can employ either direct DNA binding or “pioneer” nucleosome binding at different enhancer targets. Nucleosome binding is not exclusively confined to inaccessible chromatin, but rather is correlated with local binding of other TFs, and with degeneracy at key bases in the pioneer factor target motif responsible for direct DNA binding. Our strategy revealed a dynamic exchange of TFs at enhancers across developmental time that is aided by pioneer nucleosome binding. Pioneer factor projects will move to the lab of Michael Meers, currently a K99/R00 post-doctoral fellow.

Figure 8: a) A Drosophila retina development model for H3.3K27M-driven pediatric glioblastoma shows inhibition (red cells) behind the morphogenetic furrow (yellow arrow), but not when the cell cycle is inhibited by p2116. b) Based on our Drosophila and glioma cell line evidence, we explain the differences between replication-independent (RI, H3.3) and replication-coupled (RC, H3.2) inhibition of PRC2 in terms of the different histone deposition pathways for these two histone variants.

Deregulation of nucleosome dynamics in cancer

In 2002, then-post-doc Kami Ahmad discovered that the three histone fold domain amino acids that distinguish the conserved histone variant, H3.3, from canonical H3 (H3.1/H3.2 in humans) specify replication-independent (H3.3) versus replication-coupled (H3.1/H3.2) nucleosome assembly33. His Drosophila cytological study using GFP-labeled histones also revealed that H3.3 is incorporated genome-wide at active chromatin, including the active but not the inactive rDNA loci. Subsequent work from many groups built on Kami’s findings by molecular characterization of dedicated chaperones and other features of the two pathways34. In 2012, the first “oncohistones” were discovered in pediatric diffuse midline gliomas (DMGs) characterized by lysine 27-to-methionine (K27M) mutations in either H3.3 or H3.135,36. These oncohistone mutations dominantly inhibit histone H3K27 trimethylation and silencing, but it was unknown how oncohistone type affected gliomagenesis. Again using Drosophila as a model, Kami, now a Principal Investigator in the adjacent laboratory, demonstrated that inhibition of H3K27 trimethylation occurs only when H3K27M oncohistones are deposited into chromatin and only when expressed in cycling cells (Figure 8a). Using CUT&RUN on human DMG cell lines, post-doc Jay Sarthy showed that the genomic distributions of H3.1 and H3.3 oncohistones in human patient-derived DMG cells are consistent with the DNA replication-coupled deposition of histone H3.1 and the predominant replication-independent deposition of histone H3.3. Although H3K27 trimethylation is reduced for both oncohistone types, H3.3K27M-bearing cells retain some domains, and only H3.1K27M-bearing cells lack H3K27 trimethylation. We proposed that oncohistones inhibit the H3K27 methyltransferase as chromatin patterns are being duplicated in proliferating cells, predisposing them to tumorigenesis (Figure 8b).

In a follow-up study, Kami Ahmad used H3K27M inhibition of PRC2 in fly imaginal discs to show that simultaneous misexpression of a master regulatory TF and H3K27M results in an overexpression phenotype reminiscent of oncogenesis37.

In other collaborative work, Jay Sarthy found that the testes-specific histone variant, H2A.B, is significantly over-expressed in a variety of tumors, including about half of all diffuse large B-cell lymphomas17. These first examples of “ready-made” oncohistones, which are potentially oncogenic without a mutation, wrap less DNA than canonical H2A and destabilize nucleosomes in vivo38. Oncohistone cancer projects will move to the lab of Jay Sarthy, a pediatric oncologist who will soon move on to a tenure-track faculty position.

Figure 9: Two non-exclusive models for how non-B DNA specifies centromeres throughout the eukaryotic kingdom46.

Understanding centromere evolution

In a series of four publications from 2015-2018, post-doc Jitendra Thakur developed a research program based on application of our chromatin profiling advances to centromeric chromatin. We first applied our native MNase ChIP-seq method39 to human centromeric (CENP-A) nucleosomes, and showed that specific dimeric α-satellite units shared by multiple individuals dominate functional CENP-bound human centromeres40, recently confirmed by the Telomere-to-Telomere Consortium41. We also used native MNase ChIP-seq to delineate the centromeric complexes at fission yeast ‘regional’ centromeres42 and human ‘satellite’ centromeres43. Our evidence suggested that fission yeast centromeres are wrapped by CENP-A nucleosomes and CENP-T nucleosome-like particles in a dispersed non-sequence-specific manner42. In contrast, human CENP-A and CENP-T orthologs are part of a coherent CENP-A/CENP-B/CENP-C/CENP-T (“CCAN”) complex at α-satellite dimers that comprise the fundamental unit of centromeric chromatin43. Salt-fractionation applied to native MNase ChIP-seq and CUT&RUN revealed that the occupancy of the CCAN complex is highly variable, even for α-satellite dimers that are targeted by the sequence-specific CENP-B protein44. Centromere projects begun by Jitendra are being continued in her own lab at Emory University where she is currently an Assistant Professor. Jitendra’s lab will also continue the CUT&RUN.RNase45,46 projects that she developed as a postdoc.

Meanwhile, MSTP student Siva Kasinathan discovered that there are two determinants of centromere sequence specificity: CENP-B and predicted short non-B DNA foldback elements. Siva found that non-B DNA is a characteristic of centromeres and neocentromeres throughout the eukaryotic kingdom in organisms that lack CENP-B47 (Figure 9). Siva’s evidence that non-B DNA is a general property of centromeres challenged the dogma of “epigenetic” centromeres while also rekindling our interest in what Harmit Malik, Kami Ahmad and I had termed the “centromere paradox”: stable inheritance despite rapid evolution of DNA and centromere proteins. At the time we explained the centromere paradox by invoking centromere drive during female meiosis, which has gained general acceptance over the past 20 years48,49. However, the molecular basis for rapid evolution of centromere satellites remains speculative. For instance, using native MNase ChIP-seq we found that the functional centromeres of D. simulans include complex satellite families that are entirely absent from the genome of the sibling species Drosophila melanogaster50. New light on the molecular mechanism of satellite DNA evolution derives from evidence that break-induced repair (BIR) replication underlies centromere drive51 (Talbert, P.B. & Henikoff, S. The genetics and epigenetics of satellite centromeres. Genome Res. in review, 2021).