The rapid pace of innovation in biological imaging and the diversity of its applications have prevented the establishment of a community-agreed standardized data format. We propose that complementing established open formats such as OME-TIFF and HDF5 with a next-generation file format such as Zarr will satisfy the majority of use cases in bioimaging. Critically, a common metadata format used in all these vessels can deliver truly findable, accessible, interoperable and reusable bioimaging data.
Vitessce is an open-source interactive visualization framework for exploration of multi-modal and spatially-resolved single-cell data, with a modular architecture compatible with transcriptomic, proteomic, genome-mapped, and imaging data types. Its modular, coordinated multiple view implementation facilitates a wide range of visualization tasks to support all common single-cell assays. Vitessce is a client-side web application designed to be integrated with computational analysis tools and data resources and does not require specialized server infrastructure. The software is available at http://vitessce.io.
Biological imaging is one of the most innovative fields in the modern biological sciences. New imaging modalities, probes, and analysis tools appear every few months and often prove decisive for enabling new directions in scientific discovery. One feature of this dynamic field is the need to capture new types of data and data structures. While there is a strong drive to make scientific data Findable, Accessible, Interoperable and Reusable (FAIR, 1), the rapid rate of innovation in imaging impedes the unification and adoption of standardized data formats. Despite this, the opportunities for sharing and integrating bioimaging data and, in particular, linking these data to other "omics" datasets have never been greater; therefore, to every extent possible, increasing "FAIRness" of bioimaging data is critical for maximizing scientific value, as well as for promoting openness and integrity. In the absence of a common, FAIR format, two approaches have emerged to provide access to bioimaging data: translation and conversion. On-the-fly translation produces a transient representation of bioimage metadata and binary data but must be repeated on each use. In contrast, conversion produces a permanent copy of the data, ideally in an open format that makes the data more accessible and improves performance and parallelization in reads and writes. Both approaches have been implemented successfully in the bioimaging community, but both have limitations. At cloud scale, those shortcomings limit scientific analysis and the sharing of results. We introduce here next-generation file formats (NGFF) as a solution to these challenges.
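The conversion approach described above can be illustrated with a minimal sketch in plain Python. It is not the Zarr library API and not the OME-NGFF on-disk layout; the function names `convert_to_chunks` and `read_pixel`, the `"row.col"` key scheme, and the toy image are all assumptions made for illustration. The point it shows is that once an image has been converted into independently addressable chunks, a reader needs only the one chunk that contains the region of interest, which is what makes parallel and partial reads cheap.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]


def convert_to_chunks(image: Image, chunk: Tuple[int, int]) -> Dict[str, Image]:
    """One-time conversion: split a 2D image into chunks keyed by 'row.col'.

    The dict stands in for a key-value store (a directory or object store);
    the key scheme here is hypothetical.
    """
    ch, cw = chunk
    store: Dict[str, Image] = {}
    for r0 in range(0, len(image), ch):
        for c0 in range(0, len(image[0]), cw):
            key = f"{r0 // ch}.{c0 // cw}"
            store[key] = [row[c0:c0 + cw] for row in image[r0:r0 + ch]]
    return store


def read_pixel(store: Dict[str, Image], chunk: Tuple[int, int], y: int, x: int) -> int:
    """Read one pixel by fetching only the chunk that contains it."""
    ch, cw = chunk
    block = store[f"{y // ch}.{x // cw}"]
    return block[y % ch][x % cw]


# Toy 4x6 image whose pixel value encodes its own position.
image = [[y * 6 + x for x in range(6)] for y in range(4)]
store = convert_to_chunks(image, (2, 3))
print(sorted(store))
print(read_pixel(store, (2, 3), 3, 4))
```

In this sketch the conversion cost is paid once, whereas on-the-fly translation would repeat the equivalent parsing work on every access.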
A growing community is constructing a next-generation file format (NGFF) for bioimaging to overcome problems of scalability and heterogeneity. Organized by the Open Microscopy Environment (OME), individuals and institutes across diverse modalities facing these problems have designed a format specification process (OME-NGFF) to address these needs. This paper brings together a wide range of those community members to describe the format itself -- OME-Zarr -- along with tools and data resources available today to increase FAIR access and remove barriers in the scientific process. The current momentum offers an opportunity to unify a key component of the bioimaging domain -- the file format that underlies so many personal, institutional, and global data management and analysis tasks.
Recent advances in highly multiplexed imaging have enabled the comprehensive profiling of complex tissues in healthy and diseased states, facilitating the study of fundamental biology and human disease in spatially-resolved contexts at subcellular resolution. However, current computational infrastructure to distribute and visualize these data on the web remains complex to set up and maintain. To address these limitations, we have developed Viv, an open-source image visualization library for high-resolution multiplexed image data that is implemented in JavaScript and builds on modern web technologies. Viv directly renders Bio-Formats-compatible Zarr and OME-TIFF data formats. Three use cases demonstrate the capabilities of our proposed approach: integration into Jupyter Notebooks (https://github.com/hms-dbmi/vizarr), integration into a data portal, and a standalone image viewer (http://avivator.gehlenborglab.org).