October 27, 2020
In recent years, spatial gene expression technology has developed rapidly. The most mature of these technologies are the closely related Spatial Transcriptomics and Visium platforms from 10X Genomics, which profile gene expression while preserving morphological context. These spatial technologies have already shown promise for generating novel biological insights in areas as diverse as cancer immunology and prefrontal cortex cytoarchitecture. However, their resolution is limited: each spot (the unit of observation for these data) covers on the order of 10 cells. Meanwhile, the development of analytical tools that make effective use of the available spatial information and enhance gene expression maps to higher resolution has lagged behind.
We developed BayesSpace to address these two primary challenges in the clustering analysis of spatial expression data. BayesSpace models a low-dimensional representation of the gene expression using a multivariate t-distribution and then incorporates spatial context through the Potts model, which encourages neighboring spots to belong to the same cluster. This approach draws on well-established spatial statistics methods for image analysis. Leveraging spatial information effectively yields a substantial improvement in clustering performance.
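To make the role of the Potts model concrete, here is a minimal Python sketch of how a neighbor term can shift the cluster probabilities for a single spot. It is an illustration under stated assumptions, not the BayesSpace implementation: the function name `potts_label_probs`, the smoothing parameter `gamma`, and the placeholder per-cluster log-likelihoods in `log_lik` (which stand in for the multivariate t-distribution term) are all hypothetical.

```python
import math


def potts_label_probs(log_lik, neighbor_labels, gamma):
    """Return normalized cluster probabilities for one spot.

    log_lik: per-cluster log-likelihood of the spot's expression
             (placeholder values; a stand-in for the t-distribution term).
    neighbor_labels: current cluster labels of the neighboring spots.
    gamma: assumed smoothing strength; larger values favor agreeing
           with the neighbors more strongly.
    """
    n_clusters = len(log_lik)

    # Count how many neighbors currently sit in each cluster.
    counts = [0] * n_clusters
    for z in neighbor_labels:
        counts[z] += 1

    # Potts-style score: likelihood plus a bonus per agreeing neighbor.
    scores = [log_lik[k] + gamma * counts[k] for k in range(n_clusters)]

    # Softmax with the maximum subtracted for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# With a flat likelihood, the neighbors alone decide the ranking.
flat = [0.0, 0.0, 0.0]
probs = potts_label_probs(flat, neighbor_labels=[0, 0, 1], gamma=2.0)
no_smoothing = potts_label_probs(flat, neighbor_labels=[0, 0, 1], gamma=0.0)
```

In this sketch, setting `gamma` to zero removes the spatial term entirely, so the probabilities depend only on the expression likelihood. Raising it pulls each spot toward the majority label of its neighbors.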
The principle of encouraging neighbors to belong to the same cluster can also be used to enhance the resolution of the spots to a “sub-spot” level. Here, neighboring sub-spots are encouraged to belong to the same cluster. Sub-spot level expression cannot be directly observed, but it can be estimated via MCMC by repeatedly jittering spot-level expression values. The details are available in the Methods section of our manuscript. In practice, we find that enhanced resolution clustering can identify important biological structures that would otherwise be missed.
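The jittering idea can be sketched as a single proposal step in Python. This is a simplified illustration, not the procedure from the manuscript. It assumes that each spot is split into a fixed number of sub-spots (six here, chosen arbitrarily), that every sub-spot starts at the spot-level value, and that the noise is re-centered so the sub-spot average stays equal to the observed spot value. The function name `jitter_subspots` and the `scale` parameter are hypothetical, and the accept/reject step of a real MCMC sampler is omitted.

```python
import random


def jitter_subspots(subspot_values, scale, rng):
    """Propose new sub-spot values by adding mean-zero noise.

    The noise is re-centered so that the average over sub-spots is
    unchanged, which keeps the proposal consistent with the observed
    spot-level value (an assumption made for this sketch).
    """
    noise = [rng.gauss(0.0, scale) for _ in subspot_values]
    mean_noise = sum(noise) / len(noise)
    return [v + e - mean_noise for v, e in zip(subspot_values, noise)]


rng = random.Random(0)   # fixed seed so the sketch is reproducible
spot_value = 1.5         # one observed spot-level value (illustrative)
subspots = [spot_value] * 6  # start every sub-spot at the spot value

# Repeated jittering; a real sampler would accept or reject each proposal
# based on the likelihood and the spatial prior instead of always accepting.
for _ in range(100):
    subspots = jitter_subspots(subspots, scale=0.1, rng=rng)

subspot_mean = sum(subspots) / len(subspots)
```

After the loop, the sub-spot values differ from one another, but their average still matches the spot-level value up to floating-point error.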