Featured Technologies

Single Cell Transcriptomics

Single cell RNA-sequencing (scRNA-seq) can be used to dissect transcriptomic heterogeneity that is masked in population-averaged measurements. We validated a fully integrated and robust droplet-based system that enables 3’ mRNA digital profiling of thousands of single cells in a highly multiplexed fashion. We demonstrate the clinical utility of our technology for characterizing both immune cell subtypes and genotypes by integrating single cell digital RNA profiling with de novo single nucleotide variant (SNV) calling.

Random Mutation Detection

To permit the measurement of spontaneous and induced nuclear and mitochondrial mutations, we developed the digital Random Mutation Capture (dRMC) assay. The dRMC assay permits the analysis of millions of nucleotides and can identify one mutant base pair among 10^9 wild-type base pairs. In our approach, enrichment for mutant mtDNA with restriction endonucleases precedes single molecule amplification, effectively eliminating issues with polymerase fidelity.

Single Molecule Sequencing

Next-generation sequencing (NGS) technologies have transformed genomic research and have the potential to revolutionize clinical medicine. However, the background error rates of sequencing instruments and limitations in targeted read coverage have precluded the detection of rare DNA sequence variants by NGS. We developed a method, termed CypherSeq, which combines double-stranded barcoding error correction and rolling circle amplification (RCA)-based target enrichment to vastly improve NGS-based rare variant detection.

Digital T-Lymphocyte Counting

Multiple independent studies have documented that the presence and quantity of tumor-infiltrating lymphocytes (TILs) are strongly correlated with increased survival. However, because of methodological factors, the exact effect of TILs on prognosis has remained enigmatic, and inclusion of TILs in standard prognostic panels has been limited. To address this limitation, we introduced a robust digital DNA-based assay, termed QuanTILfy, to count TILs and assess T cell clonality in tissue samples, including tumors.


Single Cell Transcriptomics

Characterizing the transcriptome of individual cells is fundamental to understanding complex biological systems. We validated a fully integrated and robust droplet-based system that enables 3’ mRNA digital profiling of thousands of single cells in a highly multiplexed fashion. Cells from up to 8 samples at a time are encapsulated in ∼6 min, with ∼50% cell capture efficiency.

Figure 1. (a) scRNA-seq workflow on the GemCode technology platform. Cells are combined with reagents in one channel of a microfluidic chip and with gel beads from another channel to form GEMs. RT takes place inside each GEM, after which cDNAs are pooled for amplification and library construction in bulk. (b) Gel beads loaded with primers and barcoded oligonucleotides are first mixed with cells and reagents, and subsequently mixed with oil-surfactant solution at a microfluidic junction. Single-cell GEMs are collected in the GEM outlet. (c) Percentage of GEMs containing 0 gel beads (N=0), 1 gel bead (N=1) and >1 gel bead (N>1). Data include five independent runs from multiple chip and gel bead lots over >70k GEMs for each run, n=5, mean±s.e.m. (d) Gel beads contain barcoded oligonucleotides consisting of Illumina adapters, 10x barcodes, UMIs and oligo dTs, which prime RT of polyadenylated RNAs. (e) Finished library molecules consist of Illumina adapters and sample indices, allowing pooling and sequencing of multiple libraries on a next-generation short read sequencer. (f) Cell Ranger pipeline workflow. The gene-barcode matrix (highlighted in green) is an output of the pipeline.

Single-cell RNA-sequencing (scRNA-seq) can be used to dissect transcriptomic heterogeneity that is masked in population-averaged measurements. scRNA-seq studies have led to the discovery of novel cell types and provided insights into regulatory networks during development. However, previously described scRNA-seq methods face practical challenges when scaling to tens of thousands of cells or when it is necessary to capture as many cells as possible from a limited sample. Commercially available, microfluidic-based approaches have limited throughput. Plate-based methods often require time-consuming fluorescence-activated cell sorting (FACS) into many plates that must be processed separately. Droplet-based techniques have enabled processing of tens of thousands of cells in a single experiment, but current approaches require generation of custom microfluidic devices and reagents.

To overcome these challenges, we developed a droplet-based system that enables 3′ messenger RNA (mRNA) digital counting of thousands of single cells (Figure 1). Approximately 50% of cells loaded into the system can be captured, and up to eight samples can be processed in parallel per run. Reverse transcription takes place inside each droplet, and barcoded complementary DNAs (cDNAs) are amplified in bulk. The resulting libraries then undergo Illumina short-read sequencing. An analysis pipeline, Cell Ranger, processes the sequencing data and enables automated cell clustering. Here we first demonstrated comparable sensitivity of the system to existing droplet-based methods by performing scRNA-seq on cell lines and synthetic RNAs. Next, we profiled 68k fresh peripheral blood mononuclear cells (PBMCs) (Figure 2) and demonstrated the scRNA-seq platform’s ability to dissect large immune populations. Last, we developed a computational method to distinguish donor from host cells in bone marrow transplant samples by genotype. We combined this method with clustering analysis to compare subpopulation changes in acute myeloid leukemia (AML) patients. This analysis enables transplant monitoring of the complex interplay between donor and host cells.
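The idea behind digital counting with UMIs can be illustrated with a minimal Python sketch. It is not the Cell Ranger implementation: the input format (tuples of cell barcode, UMI and gene for reads that are already aligned) and the function name count_umis are assumptions made for illustration, and real pipelines also correct barcode and UMI sequencing errors, which is omitted here.

```python
from collections import defaultdict


def count_umis(reads):
    """Collapse reads into UMI counts per gene and cell barcode.

    `reads` is an iterable of (cell_barcode, umi, gene) tuples, a
    hypothetical pre-aligned input format chosen for this sketch.
    """
    umis_seen = defaultdict(set)
    for cell_barcode, umi, gene in reads:
        # Reads sharing barcode, gene and UMI are treated as PCR
        # duplicates of one original mRNA molecule.
        umis_seen[(cell_barcode, gene)].add(umi)

    # Build a nested dict: gene -> cell barcode -> number of distinct UMIs.
    matrix = defaultdict(dict)
    for (cell_barcode, gene), umis in umis_seen.items():
        matrix[gene][cell_barcode] = len(umis)
    return dict(matrix)


reads = [
    ("AAAC", "GTT", "CD3E"),
    ("AAAC", "GTT", "CD3E"),  # duplicate read of the same molecule
    ("AAAC", "CCA", "CD3E"),
    ("TTTG", "GTT", "CD3E"),
    ("TTTG", "ACG", "MS4A1"),
]
matrix = count_umis(reads)
print(matrix)
```

The nested dictionary plays the role of a small gene-barcode matrix: each entry is a count of distinct molecules rather than a count of reads.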

Figure 2. tSNE projection of 68k PBMCs, with each cell coloured according to its correlation-based assignment to a purified subpopulation of PBMCs. Subclusters within T cells are marked by dashed polygons. NK, natural killer cells; reg T, regulatory T cells.

We demonstrated the scalability and robustness of the system through transcriptome analysis of ∼250k single cells across 29 samples. scRNA-seq of cell lines and synthetic RNAs showed the system’s comparable sensitivity to other droplet-based methods.

The GemCode technology platform enables high-throughput scRNA-seq with rapid cell encapsulation and a high cell capture rate, addressing the challenges associated with existing scRNA-seq platforms. Single gel beads are encapsulated into GEMs at ∼80% fill rate. This fill rate, combined with Poisson loading of cells, results in a ∼50% cell capture rate, enabling the processing of samples with limited cell input material. We demonstrate the ability to load from 1,000 to 23,000 cells per well, from four different cell lines and two primary cell types (PBMCs and BMMCs), illustrating the applicability of the GemCode platform to a wide variety of cell types. The GEM-based encapsulation of single cells within the microfluidics platform reduces the need for expensive sorting equipment and complicated workflows involving large numbers of plates. The scalability and high-throughput nature of the GemCode platform are achieved in two ways: hundreds to thousands of cells can be encapsulated per channel, and each chip has eight channels. Therefore, a large number of cells can be processed within a very short period of time, minimizing the perturbation of the cellular transcriptome. In addition, multiple samples can be processed simultaneously, a key advantage for experimental setups that involve a time course or multiple treatments.
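Poisson loading can be made concrete with a short Python sketch. The mean number of cells per GEM used below (0.1) is a hypothetical input, not a value from our data, and the sketch assumes that gel bead occupancy is independent of cell occupancy; only the ∼80% single-bead fill rate is taken from the text. The printed values depend entirely on these assumptions.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k cells in a GEM when occupancy is Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def loading_summary(mean_cells_per_gem, bead_fill_rate=0.8):
    """Summarize GEM occupancy under Poisson cell loading (illustrative only)."""
    lam = mean_cells_per_gem
    if lam <= 0:
        raise ValueError("mean_cells_per_gem must be positive")
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    multiple = 1.0 - empty - single
    return {
        "empty": empty,
        "single": single,
        "multiple": multiple,
        # Among GEMs that contain cells, the share holding more than one.
        "multiplet_rate": multiple / (1.0 - empty),
        # Share of loaded cells that sit alone in a GEM that also holds
        # a single gel bead, assuming beads and cells load independently.
        "cells_single_with_bead": bead_fill_rate * math.exp(-lam),
    }


summary = loading_summary(mean_cells_per_gem=0.1)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Lowering the assumed mean occupancy reduces the multiplet rate at the cost of more empty GEMs, which is the trade-off that Poisson loading imposes.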


Random Mutation Detection

Figure 1. Illustrated overview of the 3D and dRMC assays for the quantification of mitochondrial mutations. (1) Whole cell DNA is extracted. (2) mtDNA is incubated with TaqI restriction endonuclease, which recognizes 5’-TCGA-3’ sites. mtDNA molecules that are wild-type at TaqI sites (WT, blue) will be cleaved, whereas mtDNA molecules with a mutation in the mutation target site (red) will be resistant to cleavage. A control region devoid of TaqI sites (purple) is used to quantify the total mtDNA copies interrogated. (3) Digested DNA is added to a PCR master mix with TaqMan probes and site-specific primers that flank the mutational target, and is then partitioned into thousands of 1 nl droplets in an oil immersion. The control region and mtDNA with mutations in the target site act as substrates for amplification, whereas mtDNA that is WT at the mutational target does not. (4) Droplets are thermal cycled to amplify target DNA and to release the TaqMan probe fluorophore from its quencher through Taq polymerase’s inherent exonuclease activity. The ongoing rounds of amplification displace and cleave more probe, accumulating fluorescence. (5) After amplification, droplets are detected and their fluorescence is quantified. Mutation frequency is calculated by dividing the mutant concentration by the concentration of the control region.
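The final calculation in step (5) can be sketched in Python as follows. The Poisson correction (mean copies per droplet = -ln(1 - fraction of positive droplets)) is the standard digital PCR estimate; the function names and droplet counts are hypothetical, and the sketch assumes the mutant and control reactions use the same droplet volume and input dilution, so that the ratio of copies per droplet equals the ratio of concentrations.

```python
import math


def copies_per_droplet(positive, total):
    """Poisson-corrected mean number of target copies per droplet."""
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    return -math.log(1.0 - positive / total)


def mutation_frequency(mut_positive, mut_total, ctrl_positive, ctrl_total):
    """Mutant concentration divided by control-region concentration.

    Assumes equal droplet volume and equal input dilution for both
    reactions (an assumption of this sketch, not a statement of protocol).
    """
    mutant = copies_per_droplet(mut_positive, mut_total)
    control = copies_per_droplet(ctrl_positive, ctrl_total)
    return mutant / control


# Hypothetical droplet counts, for illustration only.
freq = mutation_frequency(mut_positive=10, mut_total=20000,
                          ctrl_positive=10000, ctrl_total=20000)
print(f"mutation frequency per control copy: {freq:.3e}")
```

If the control reaction is run on a diluted aliquot, the control estimate would need to be scaled by that dilution factor before taking the ratio.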

Previous mutational assays able to identify rare random spontaneous mutations have ultimately been restricted to model systems. Although tissue culture and transgenic animal systems are powerful tools for identifying potential mutagens, they cannot accurately predict mutagenesis in humans. To permit the measurement of rare random mutations in human tissues, we developed the Random Mutation Capture (RMC) assay (Figure 1). The RMC assay is >100-fold more sensitive than previous methods that employ genomic selection, permits analysis of a large number of nucleotides, and can identify one mutant base pair among 10^9 wild-type base pairs.

With this technology, we were first able to provide the most convincing evidence to date for the existence of a mutator phenotype in human cancers, a hypothesis proposed more than 30 years earlier.

Although this assay was initially developed to study point mutation accumulation in the nuclear genome, we have since adapted it to resolve mitochondrial mutations and increased its resolution and throughput by “digitizing” the assay to more sensitively monitor base substitution and deletion mutations (Figure 1). This has allowed us to redefine the relationship among mitochondrial mutagenesis, cancer and aging.

For example, we recently demonstrated two surprising phenomena: 1) far fewer mitochondrial mutations arise in tumors than in normal healthy tissue, and 2) mitochondrial DNA exhibits mutagenic resistance to DNA-damaging agents.


Single Molecule Sequencing

Next-generation sequencing (NGS) technologies have transformed genomic research and have the potential to revolutionize clinical medicine. However, the background error rates of sequencing instruments and limitations in targeted read coverage have precluded the detection of rare DNA sequence variants by NGS. We have developed a method, termed CypherSeq, that combines double-stranded barcoding error correction and rolling circle amplification (RCA)-based target enrichment to vastly improve NGS-based rare variant detection.

Figure 1. Diagram of the CypherSeq construct and sequencing workflow. (A) The vector consists of a pUC19 plasmid backbone into which the sequencing cassette has been ligated. The sequencing cassette consists of the Nextera adaptors (flow cell sequences, indexes and read primer sequences) and two 7-nt double-stranded, random barcodes flanking a blunt-ended restriction site (SmaI). (B) Sheared DNA fragments containing wild-type (black line) or mutated (red) bases are ligated into the sequencing library vector at the SmaI site. The resultant sequencing library is amplified to generate a family for each barcode and then sequenced on an Illumina flow cell. The reads are grouped together into barcode families. True mutations (red) will be observed in all or most (>90%) of the reads in a barcode family, while mutations arising from PCR or sequencing errors (green) will be present in a smaller fraction (<90%) of the family.

CypherSeq (Figure 1) is designed to overcome the three main barriers to rare variant detection: (i) error correction, (ii) read depth and (iii) enrichment. CypherSeq employs double-stranded molecular barcoding to achieve high sensitivity base calling. Additionally, we exploit the circular nature of the plasmid-based sequencing library to enrich for specific targets using rolling circle amplification (RCA) based enrichment (Figure 2) to reduce off-target reads and maximize read depth. CypherSeq's combination of accuracy and enrichment will enable the full potential of personalized, sequencing-based clinical applications to be realized.

The number of reads produced from an NGS instrument is an important factor for rare variant detection. The coverage depth required at a site in order to detect a variant is inversely proportional to its frequency within a sample, requiring ever greater depth to detect rarer variants. For example, detecting a variant in ‘gene X’ present in 1 out of every 10^5 genomes would require at least 10^5× coverage of ‘gene X’. Achieving 10^5 reads is not difficult; however, with conventional approaches the rest of the genome, roughly 3 × 10^9 bp, would also be sequenced at a depth of 10^5, requiring 2.4 × 10^12 (2.4 trillion) 125 bp reads or the equivalent of 1200 HiSeq lanes, which is cost prohibitive. This problem is compounded when combined with error correcting sequencing technologies which, due to the need for redundant barcoded reads, reduce the number of unique reads produced. As there are practical constraints on the read yield available from current sequencing platforms, detection of extremely rare variants cannot be performed quantitatively for each site genome-wide and must be limited to specific genomic targets of interest. In order to ensure adequate read depth, target sequences must be enriched within the heterogeneous input sample to limit off-target sequence reads.
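The arithmetic behind this example can be checked with a few lines of Python. The genome size, depth, read length and lane count are the figures quoted above; the reads-per-lane value is only what those figures imply, not an instrument specification, and the function name is chosen for this sketch.

```python
def reads_required(target_bp, depth, read_length_bp):
    """Number of reads needed to cover target_bp at the given depth."""
    return target_bp * depth // read_length_bp


genome_bp = 3 * 10**9   # approximate human genome size
depth = 10**5           # depth needed for a 1-in-10^5 variant
read_length_bp = 125
lanes = 1200

total_reads = reads_required(genome_bp, depth, read_length_bp)
implied_reads_per_lane = total_reads // lanes

print(f"reads required genome-wide: {total_reads:.1e}")
print(f"implied reads per lane:     {implied_reads_per_lane:.1e}")
```

Restricting the same depth to a small enriched target shrinks the requirement in proportion to the target size, which is the motivation for the enrichment step.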

The CypherSeq methodology incorporates the error-correcting capabilities of double-stranded barcodes into a circular construct that carries all the components required for NGS. The sequencing construct is cloned into a bacterial plasmid, and thus permits the replication and storage of the barcoded CypherSeq vectors in bacteria, whereas its circular nature allows for enrichment and amplification of specific targets via RCA. The CypherSeq workflow is compatible across many NGS platforms including the Illumina, Ion Torrent, Pacific Bio, 454 and SMRT systems, and is also capable of large-scale multiplexing using conventional indexes.

Figure 2. Overview of rolling circle amplification (RCA) enrichment from CypherSeq libraries. A CypherSeq vector library is amplified by extension of biotinylated, target-specific primers using the strand displacement synthesis-proficient polymerase Bst. Two primers, one targeting each of the complementary strands, must be used to achieve double-strand molecular barcoded error correction. Template CypherSeq vectors containing non-target sequences remain unamplified while templates containing the target sequence are amplified via RCA into long single-stranded products containing redundant copies of the target sequence and sequencing cassette. Unlike conventional PCR, each redundant copy of the target sequence is copied directly from the original DNA fragment. Thus, errors occurring in early rounds of amplification are not reproduced in later duplications, preventing exponential amplification of error. The RCA products are purified using magnetic streptavidin-coated beads, subjected to limited PCR with the library preparation primers (Supplementary Table SI4), and sequenced. The error correction methodology is performed identically to samples not subjected to enrichment. Namely, sequencing reads are compiled by barcode and a consensus is made for each barcode family independently. Substitutions occurring in <90% of the reads within a family are rejected as artifacts, while substitutions present in all or nearly all (>90%) of a family are accepted as true mutations.
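The barcode-family filter described in this caption can be sketched in Python. This is an illustrative simplification, not the CypherSeq analysis code: it assumes reads are already aligned and of equal length to the reference, treats exactly 90% as rejected (the caption leaves that boundary unspecified), and omits the pairing of complementary-strand families and any minimum family size. All names are hypothetical.

```python
from collections import Counter, defaultdict


def call_family_mutations(reads, reference, min_fraction=0.9):
    """Return accepted substitutions per barcode family.

    `reads` is an iterable of (barcode, sequence) pairs; every sequence
    is assumed to be aligned to, and the same length as, `reference`.
    """
    families = defaultdict(list)
    for barcode, sequence in reads:
        families[barcode].append(sequence)

    calls = {}
    for barcode, sequences in families.items():
        accepted = []
        for pos, ref_base in enumerate(reference):
            counts = Counter(seq[pos] for seq in sequences)
            base, n = counts.most_common(1)[0]
            # Accept a substitution only if it appears in more than
            # min_fraction of the reads in this family.
            if base != ref_base and n / len(sequences) > min_fraction:
                accepted.append((pos, ref_base, base))
        calls[barcode] = accepted
    return calls


reference = "ACGT"
reads = [("B1", "ATGT")] * 10               # substitution in every read
reads += [("B2", "ACGT")] * 9 + [("B2", "ACGA")]  # error in one read only
calls = call_family_mutations(reads, reference)
print(calls)
```

In this toy input the C-to-T change in family B1 is retained, while the single discordant read in family B2 is discarded as an artifact.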

We demonstrate that CypherSeq corrects errors inherent in NGS outputs, allowing detection of mutations down to a frequency of 2.4 × 10^-7 per base pair. However, the sensitivity of the CypherSeq methodology is likely even greater, as double-stranded barcoding-based error correction can theoretically permit the resolution of mutation frequencies as low as 10^-9 to 10^-10 per nucleotide, depending upon the number of unique reads generated.

Translation of robust rare variant detection methods, such as CypherSeq, to the clinic has the potential to dramatically transform disease diagnostics, monitoring and prognostication. Circulating tumor DNA (ctDNA) and circulating tumor cells (CTCs) are detectable in the blood of most patients with advanced cancer and in a significant percentage of patients in the early stages of cancer. Early cancer diagnosis is currently the most promising approach to reducing mortality, as early detection is associated with more favorable prognosis for nearly all cancer types. Reliable detection of early-stage cancer, by quantifying ctDNA or CTCs marked by cancer-specific mutations, will require highly sensitive and specific rare variant detection assays to enable screening in a vast background of wild-type normal cells. By exploiting CypherSeq's highly sensitive error correction abilities and by targeting the enrichment step to a panel of genes known to be mutated in cancers, we expect CypherSeq will be able to achieve the sensitivity and specificity required for the early detection of disease.


Digital T-Lymphocyte Counting

Figure 1. QuanTILfy technology. Frozen punch biopsies from tumors are subjected to genomic DNA sequence analysis in water-in-oil droplets. The rearranged TCRβ loci are amplified from genomic DNA with PCR probes that contain a 6-carboxyfluorescein (FAM) detection fluorophore. Fluorescence is measured in the droplets and used to determine the number of TCR rearrangements and clonality. High TIL tissue samples contain a high number of amplifications within the 8 TCR rearrangement subgroups (indicated by the colored circles). Shown at the top of the figure are histology sections of ovarian cancer tumors with high and low infiltration of immune cells (hematoxylin and eosin, 40×). TILs are indicated by a white segmented line (left) and arrows (right). V. ALTOUNIAN/SCIENCE TRANSLATIONAL MEDICINE (ILLUSTRATION); J. GUENTHOER/FRED HUTCHINSON CANCER RESEARCH CENTER (HISTOLOGY)

The human cellular adaptive immune system identifies and destroys cells expressing aberrant proteins or protein fragments. The source of the abnormal protein fragments can include intracellular pathogenic infection, genomic mutations, or deregulation of gene expression. Cancerous cells often express such aberrant peptides, prompting a cellular adaptive immune response. These peptides are presented on the surface of cells by human leukocyte antigen molecules for binding by T cell receptors (TCRs) on the surface of T-lymphocytes, the primary mediators of the cellular adaptive immune response.

Tumor-infiltrating lymphocytes (TILs) have been shown to directly attack tumor cells in a variety of types of cancer, and multiple independent studies have demonstrated that the presence of TILs is strongly correlated with increased survival. For both colorectal and ovarian carcinoma patients, the presence or absence of TILs provides a strong prognostic marker for survival independent of current staging methods. However, existing assays and pathology tests to measure TILs are cumbersome, have inherent variability, are mostly restricted to research studies, and thus are not used for clinical decision-making.

As the importance of TILs gains appreciation, particularly given their potential utility for cancer prognostication and their role in immunotherapeutic response, new technologies to quantitatively measure TILs are needed. Fortunately, adaptive immune cells have a molecular signature that can be exploited for direct measurement. T cells have gene rearrangements in their TCR loci. The nucleotide sequences that encode the TCR regions are generated by somatic rearrangement of noncontiguous variable (V), diversity (D), and joining (J) region gene segments for the β chain, and V and J segments for the α chain. The existence of multiple V, D, and J gene segments in germline DNA permits substantial combinatorial diversity in receptor composition, and receptor diversity is further increased by the deletion of nucleotides adjacent to the recombination signal sequences (RSSs) of the V, D, and J segments, and template-independent insertion of nucleotides at the Vβ-Dβ, Dβ-Jβ, and Vα-Jα junctions.

We have developed QuanTILfy to measure the number of T-lymphocytes and assess clonality in a tissue using droplet digital polymerase chain reaction (ddPCR) technology. Massive sample partitioning is a key aspect of the ddPCR technique and a vital component of the QuanTILfy assay (Figure 1). ddPCR surpasses the performance of earlier techniques by introducing a scalable implementation of digital PCR, in which the creation of tens of thousands of droplets generates tens of thousands of data points, bringing the statistical power inherent to digital PCR into practical application.
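Why the number of droplets matters can be shown with a small Python sketch. It uses a simple normal-approximation (Wald) interval on the fraction of positive droplets, transformed to copies per droplet; this is a generic digital PCR illustration under that assumption, not the statistical procedure used in QuanTILfy, and the droplet counts are hypothetical.

```python
import math


def copies_per_droplet_ci(positive, total, z=1.96):
    """Estimate and approximate 95% interval for mean copies per droplet."""
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    p = positive / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    low_p = max(p - half_width, 0.0)
    high_p = min(p + half_width, 1.0 - 1e-12)

    def to_copies(q):
        # Poisson correction: mean copies per droplet from positive fraction.
        return -math.log(1.0 - q)

    return to_copies(p), to_copies(low_p), to_copies(high_p)


# Same positive fraction (10%), very different numbers of droplets.
small = copies_per_droplet_ci(20, 200)
large = copies_per_droplet_ci(2000, 20000)
print("200 droplets:    %.4f (%.4f to %.4f)" % small)
print("20,000 droplets: %.4f (%.4f to %.4f)" % large)
```

The point estimate is identical in both cases, but the interval narrows as the droplet count grows, which is the practical benefit of partitioning a sample into tens of thousands of droplets.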
