2023
DOI: 10.1101/2023.05.05.539647
Preprint

SpatialData: an open and universal data framework for spatial omics

Abstract: Spatially resolved omics technologies are transforming our understanding of biological tissues. However, handling uni- and multi-modal spatial omics datasets remains a challenge owing to large volumes of data, heterogeneous data types and the lack of unified spatially-aware data structures. Here, we introduce SpatialData, a framework that establishes a unified and extensible multi-platform file-format, lazy representation of larger-than-memory data, transformations, and alignment to common coordinate systems. …

Cited by 14 publications (11 citation statements)
References 37 publications
“…Finally, as spatial omics datasets continue to increase in size, in the future, we anticipate spatial omics datasets may need to be stored in standardized data infrastructure with lazy representation of larger-than-memory data such as the Zarr file format used in the SpatialData Python library (Marconato et al, 2023). SEraster can potentially be integrated with such data infrastructure to enable rasterization of larger-than-memory data in spatially-indexed chunks rather than loading the entire dataset into memory.…”
Section: Discussion (mentioning)
confidence: 99%
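The chunk-wise rasterization idea in the statement above can be illustrated with a minimal sketch. It is not SEraster's or SpatialData's API: the helper names `iter_chunks` and `rasterize_in_chunks` are hypothetical, and the chunks here are simply consecutive slices of a point stream, standing in for the spatially indexed chunks a Zarr-backed store would provide. The point is only that per-pixel counts can be accumulated without holding all points in memory at once.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

Point = Tuple[float, float]


def iter_chunks(points: Iterable[Point], chunk_size: int) -> Iterator[List[Point]]:
    """Yield successive lists of at most `chunk_size` points (hypothetical helper)."""
    chunk: List[Point] = []
    for point in points:
        chunk.append(point)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        # Emit the final, possibly shorter, chunk.
        yield chunk


def rasterize_in_chunks(
    points: Iterable[Point], pixel_size: float, chunk_size: int = 1000
) -> Counter:
    """Count points per square pixel, touching only one chunk at a time."""
    counts: Counter = Counter()
    for chunk in iter_chunks(points, chunk_size):
        for x, y in chunk:
            # Map each coordinate to the index of the pixel that contains it.
            counts[(int(x // pixel_size), int(y // pixel_size))] += 1
    return counts
```

Because `points` can be any iterable, including a generator that reads from disk, only `chunk_size` points plus the running pixel counts need to be resident at any moment.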
“…Similarly, it underlies our seamless integration with co-registration methods such that multiple spatial technologies can be jointly queried or analyzed together. In this manner, our approach provides more flexibility than the recently developed SpatialData [58] package, which enforces a standard data framework. Notably, to accommodate the increasing size of spatial multi-modal datasets, we developed GiottoDB, which provides the groundwork that developers and users can use to represent their data through different backends that can scale according to their needs.…”
Section: Discussion (mentioning)
confidence: 99%
“…To establish versatile tools, a common strategy involves adopting a shared data structure that seamlessly integrates across diverse technologies. SpatialData [29] serves as one such comprehensive framework, including readers tailored for the most widely used spatial-omics technologies. Building upon this, Sopa converts any data into a SpatialData object, on which all of the six following tasks are performed.…”
Section: Technology-invariant Pipeline (mentioning)
confidence: 99%
“…This also facilitates geometry-related operations, such as cell expansion, area/perimeter computations, and cell-cell intersections. Combined with the image lazy loading offered by SpatialData [29] and Xarray [34], we implement a fast channel averaging on cell boundaries by combining geometry operations and image chunk lazy loading (see Figure 2d), i.e., deferring loading until needed for processing. Additionally, using memory-efficient tools like Dask [31], we extend geometric operations of GeoPandas [32] on chunks of transcripts, ensuring parallel processing of as many chunks as possible without exceeding memory limits (see Figure 2e).…”
Section: Memory Efficiency Of Sopa (mentioning)
confidence: 99%
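The deferred loading described in the statement above can be sketched in plain Python. This is an illustration under stated assumptions, not Sopa's implementation: `LazyImage` and `mean_in_box` are hypothetical names, the image has a single channel, and a cell is approximated by an axis-aligned bounding box rather than a polygon. A chunk is read only when a pixel inside it is first requested.

```python
from typing import Callable, Dict, List, Tuple

Chunk = List[List[float]]
ChunkKey = Tuple[int, int]


class LazyImage:
    """Single-channel image split into square chunks that load on first access."""

    def __init__(self, loaders: Dict[ChunkKey, Callable[[], Chunk]], chunk_size: int):
        self.loaders = loaders          # one zero-argument loader per chunk
        self.chunk_size = chunk_size
        self.cache: Dict[ChunkKey, Chunk] = {}

    def pixel(self, row: int, col: int) -> float:
        key = (row // self.chunk_size, col // self.chunk_size)
        if key not in self.cache:
            # Loading is deferred until a pixel in this chunk is needed.
            self.cache[key] = self.loaders[key]()
        return self.cache[key][row % self.chunk_size][col % self.chunk_size]


def mean_in_box(image: LazyImage, row0: int, row1: int, col0: int, col1: int) -> float:
    """Mean intensity over the half-open box [row0, row1) x [col0, col1)."""
    values = [
        image.pixel(r, c) for r in range(row0, row1) for c in range(col0, col1)
    ]
    return sum(values) / len(values)


def constant_chunk(value: float, size: int) -> Callable[[], Chunk]:
    """Build a loader that returns a size x size chunk filled with `value`."""
    return lambda: [[value] * size for _ in range(size)]
```

Averaging over a box that lies inside one chunk triggers a single load, while a box that straddles chunk boundaries loads only the chunks it overlaps. In practice this role is played by chunked array libraries rather than hand-written loaders.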