BayesSpace – a tool for spatial transcriptomics analysis

October 27, 2020

Edward Zhao

In recent years, there has been rapid development of spatial gene expression technology. The most mature of these technologies are the closely related Spatial Transcriptomics and Visium platforms from 10X Genomics, which profile gene expression while preserving morphological context. Already, these spatial technologies have shown promise to generate novel biological insights in diverse areas such as cancer immunology and prefrontal cortex cytoarchitecture. However, their resolution is limited to on the order of 10 cells per spot (the unit of observation for these data). Meanwhile, the development of analytical tools that make effective use of the available spatial information, and that enhance gene expression maps to higher resolution, has lagged behind.

We developed BayesSpace to address these two challenges in the clustering analysis of spatial expression data: using the spatial information effectively, and enhancing resolution beyond the spot level. BayesSpace models a low-dimensional representation of the gene expression using a multivariate t-distribution and then incorporates spatial context through the Potts model, which encourages neighboring spots to belong to the same cluster. This approach draws from well-established spatial statistics methods for image analysis. Leveraging spatial information in this way substantially improves clustering performance (Figure 1).
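To make the role of the Potts model concrete, here is a minimal sketch of how a spatial prior can shift the cluster assignment of a single spot. It is an illustration only, not the BayesSpace implementation: the function name `potts_conditional`, the parameter name `gamma`, the two-cluster setup, and the use of precomputed log-likelihoods (rather than a multivariate t-distribution) are all assumptions made for the example.

```python
import math


def potts_conditional(log_lik, neighbor_labels, gamma):
    """Probabilities over clusters for one spot (illustrative sketch).

    log_lik: per-cluster log-likelihoods of the spot's expression
    neighbor_labels: current cluster labels of the adjacent spots
    gamma: hypothetical smoothing strength; 0 ignores the neighbors
    """
    n_clusters = len(log_lik)
    # Potts term: reward each cluster by the number of neighbors already in it.
    scores = [
        log_lik[k] + gamma * sum(1 for z in neighbor_labels if z == k)
        for k in range(n_clusters)
    ]
    # Normalize with the log-sum-exp trick for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# A spot whose own expression slightly favors cluster 1 ...
log_lik = [-1.2, -1.0]
# ... but whose four neighbors are mostly assigned to cluster 0.
neighbors = [0, 0, 0, 1]

no_spatial = potts_conditional(log_lik, neighbors, gamma=0.0)
with_spatial = potts_conditional(log_lik, neighbors, gamma=1.0)
print(no_spatial, with_spatial)
```

With `gamma=0.0` the spot follows its own expression and favors cluster 1; with `gamma=1.0` the agreement among its neighbors outweighs the small difference in likelihood and the spot favors cluster 0. This is the sense in which a Potts prior suppresses isolated, noisy assignments.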

 


Figure 1. (Left) Maynard et al. manually annotated the layers of the dorsolateral prefrontal cortex. This represents the ground truth. (Center) Louvain clustering is a commonly used non-spatial clustering method. This fails to identify distinct layers and results in substantial noise. (Right) BayesSpace leverages spatial context to improve identification of the prefrontal cortex layers.

The principle of encouraging neighbors to belong to the same cluster can also be used to enhance the resolution of the data from spots to a “sub-spot” level. Here, each spot is divided into sub-spots, and neighboring sub-spots are encouraged to belong to the same cluster. Sub-spot level expression cannot be directly observed, but it can be estimated via MCMC by repeatedly jittering spot-level expression values. Details are available in the Methods section of our manuscript. In practice, we find that enhanced resolution clustering can identify important biological structures that would otherwise be missed.
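The jittering idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure from the manuscript: it shows only a proposal step (a full MCMC sampler would also accept or reject each proposal based on the model), it assumes that the jitter is centered so the sub-spot values keep averaging to the observed spot value, and the names `jitter_subspots`, `scale`, and the choice of six sub-spots per spot are hypothetical.

```python
import random


def jitter_subspots(subspot_values, scale, rng):
    """Propose jittered sub-spot values that keep the spot-level mean fixed."""
    noise = [rng.gauss(0.0, scale) for _ in subspot_values]
    mean_noise = sum(noise) / len(noise)
    # Centering the noise leaves the average over sub-spots unchanged.
    return [v + e - mean_noise for v, e in zip(subspot_values, noise)]


rng = random.Random(0)
spot_value = 2.5   # observed spot-level value of one expression feature
n_subspots = 6     # assumed number of sub-spots per spot

# Start every sub-spot at the observed spot-level value, then jitter repeatedly.
subspots = [spot_value] * n_subspots
for _ in range(100):
    subspots = jitter_subspots(subspots, scale=0.1, rng=rng)

print(subspots, sum(subspots) / n_subspots)
```

The sub-spot values drift apart from one another while their average stays at the observed spot value, which is what lets the sub-spot estimates vary within a spot without contradicting the measured data.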

 


Figure 2. (Left) Thrane et al. annotated a melanoma histopathology image for major tissue structures. The tumor is outlined in black. (Center) Clustering at the original resolution recovers most of the important structures. However, lymphoid tissue near the tumor cannot be distinguished. (Right) With enhanced resolution, we see regions matching the lymphoid structures near the tumor.

To make BayesSpace easy to include in analysis workflows, we designed it to integrate seamlessly with existing analytical tools by using the SingleCellExperiment data structure. BayesSpace is now available in the development version of Bioconductor.