In response to the emergence of SARS-CoV-2 variants of concern, the global scientific community, through unprecedented effort, has sequenced and shared over 11 million genomes through GISAID, as of May 2022. This extraordinarily high sampling rate provides a unique opportunity to track the evolution of the virus in near real-time. Here, we present outbreak.info, a platform that currently tracks over 40 million combinations of Pango lineages and individual mutations, across over 7,000 locations, to provide insights for researchers, public health officials and the general public. We describe the interpretable visualizations available in our web application, the pipelines that enable the scalable ingestion of heterogeneous sources of SARS-CoV-2 variant data and the server infrastructure that enables widespread data dissemination via a high-performance API that can be accessed using an R package. We show how outbreak.info can be used for genomic surveillance and as a hypothesis-generation tool to understand the ongoing pandemic at varying geographic and temporal scales.
In December 2019, a series of cases of pneumonia of unknown origin appeared in Wuhan, China, and on 7 January 2020, the virus responsible for the disease was identified as a novel coronavirus, SARS-CoV-2 (ref. 1). The first SARS-CoV-2 genome was made publicly available on 10 January 2020 (refs. 2,3). Since then, the global scientific community, through an unprecedented effort, has sequenced and shared over 11 million genomes through GISAID (https://gisaid.org/), as of May 2022 (ref. 4). To keep track of the evolving genetic diversity of SARS-CoV-2, Rambaut
The emergence of SARS-CoV-2 variants has prompted the need for near real-time genomic surveillance to inform public health interventions. In response to this need, the global scientific community, through unprecedented effort, has sequenced over 7 million genomes as of December 2021. The extraordinarily high sampling rate provides a unique opportunity to track the evolution of the virus in near real-time. Here, we present outbreak.info, a platform that can be used to track over 40 million combinations of PANGO lineages and individual mutations, across over 7,000 locations, to provide insights for researchers, public health officials, and the general public. We describe the data pipelines that enable the scalable ingestion and standardization of heterogeneous data on SARS-CoV-2 variants, the server infrastructure that enables the dissemination of the processed data, and the client-side applications that provide intuitive visualizations of the underlying data.
Point-scanning imaging systems are among the most widely used tools for high-resolution cellular and tissue imaging, benefitting from arbitrarily defined pixel sizes. The resolution, speed, sample preservation, and signal-to-noise ratio (SNR) of point-scanning systems are difficult to optimize simultaneously. We show these limitations can be mitigated via the use of deep learning-based supersampling of undersampled images acquired on a point-scanning system, which we term point-scanning super-resolution (PSSR) imaging. We designed a “crappifier” that computationally degrades high SNR, high pixel resolution ground truth images to simulate low SNR, low-resolution counterparts for training PSSR models that can restore real-world undersampled images. For high spatiotemporal resolution fluorescence timelapse data, we developed a “multi-frame” PSSR approach that utilizes information in adjacent frames to improve model predictions. In conclusion, PSSR facilitates point-scanning image acquisition with otherwise unattainable resolution, speed, and sensitivity. All the training data, models, and code for PSSR are publicly available at 3DEM.org.
Editor’s summary: Point-scanning super-resolution imaging uses deep learning to supersample undersampled images and enable time-lapse imaging of subcellular events. An accompanying “crappifier” rapidly generates quality training data for robust performance.
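The idea of a “crappifier” can be illustrated with a minimal sketch. The abstract does not specify the noise model or the downsampling method, so the additive Gaussian noise, the block-average downsampling, and the names `crappify`, `factor`, and `noise_sigma` below are all assumptions made for illustration, not the published PSSR implementation.

```python
import random


def crappify(image, factor=2, noise_sigma=0.05, seed=0):
    """Degrade a 2D image given as a list of rows of floats in [0, 1].

    Assumed degradation (not taken from the PSSR paper):
      1. add zero-mean Gaussian noise to every pixel and clip to [0, 1];
      2. downsample by averaging non-overlapping factor x factor blocks.
    """
    rng = random.Random(seed)  # seeded so the degradation is reproducible
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by factor")

    # Step 1: lower the signal-to-noise ratio.
    noisy = [
        [min(1.0, max(0.0, px + rng.gauss(0.0, noise_sigma))) for px in row]
        for row in image
    ]

    # Step 2: lower the pixel resolution by block averaging.
    low_res = []
    for r in range(0, height, factor):
        out_row = []
        for c in range(0, width, factor):
            block = [
                noisy[r + dr][c + dc]
                for dr in range(factor)
                for dc in range(factor)
            ]
            out_row.append(sum(block) / len(block))
        low_res.append(out_row)
    return low_res


# A synthetic 8x8 gradient stands in for a high-SNR ground truth image.
ground_truth = [[(r + c) / 14 for c in range(8)] for r in range(8)]
low_quality = crappify(ground_truth, factor=2, noise_sigma=0.05)

# Each (degraded input, ground truth target) pair is one training example.
training_pair = (low_quality, ground_truth)
```

Generating pairs this way means only high-quality acquisitions are needed to build a training set; the low-quality counterpart is simulated rather than imaged separately.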
To combat the ongoing COVID-19 pandemic, scientists have been conducting research at breakneck speed, producing over 52,000 peer-reviewed articles within the first 12 months. In contrast, a little over 1,000 peer-reviewed articles were published within the first 12 months of the SARS-CoV-1 pandemic starting in 2002. In addition to publications, there has also been an upsurge in clinical trials to develop vaccines and treatments, scientific protocols to study SARS-CoV-2, methodology for epidemiological modeling, and datasets spanning molecular studies to social science research. One of the largest challenges has been keeping track of the vast amounts of newly generated disparate data and research that exist in independent repositories. To address this issue, we developed outbreak.info, which provides a standardized, searchable interface of heterogeneous data resources on COVID-19 and SARS-CoV-2. Unifying metadata from 14 data repositories, we have assembled a collection of over 200,000 publications, clinical trials, datasets, protocols, and other resources as of October 2021. We used a rigorous schema to enforce a consistent format across different data sources and resource types, and linked related resources where possible. This enables users to quickly retrieve information across data repositories, regardless of resource type or repository location. Outbreak.info also integrates this research library with spatiotemporal genomics data on SARS-CoV-2 variants and epidemiological data on COVID-19 cases and deaths. The web interface provides interactive visualizations and reports to explore the unified data and generate hypotheses. In addition to providing a web interface, we also publish the data we have assembled and standardized in a high-performance public API and an R package.
Finally, we discuss the challenges inherent in combining metadata from scattered and heterogeneous resources and provide recommendations to streamline this process to aid scientific research.
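As a rough illustration of what enforcing a consistent format across repositories can look like, the sketch below maps two differently shaped records onto one shared record type so that a single search covers both. Every name here is hypothetical: the `Resource` fields, the raw field names, and the `repo_a`/`repo_b` sources are assumptions for illustration and do not reflect the actual outbreak.info schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical shared schema; not the real outbreak.info schema."""
    identifier: str
    resource_type: str
    name: str
    date_published: str
    source: str


def normalize_publication(raw: dict) -> Resource:
    # Assumed shape of a publication record from a hypothetical "repo_a".
    return Resource(
        identifier=f"pub:{raw['id']}",
        resource_type="Publication",
        name=raw["title"].strip(),
        date_published=raw["published"],
        source="repo_a",
    )


def normalize_trial(raw: dict) -> Resource:
    # Assumed shape of a clinical trial record from a hypothetical "repo_b".
    return Resource(
        identifier=f"trial:{raw['trial_id']}",
        resource_type="ClinicalTrial",
        name=raw["official_title"].strip(),
        date_published=raw["first_posted"],
        source="repo_b",
    )


def search(resources, term):
    """Case-insensitive name search across all resource types."""
    term = term.lower()
    return [r for r in resources if term in r.name.lower()]


library = [
    normalize_publication(
        {"id": "001", "title": " Variant surveillance methods ", "published": "2021-03-01"}
    ),
    normalize_trial(
        {"trial_id": "T42", "official_title": "Vaccine trial against a variant", "first_posted": "2021-05-10"}
    ),
]

# One query now spans both repositories, regardless of resource type.
hits = search(library, "variant")
```

Keeping one normalizer per repository confines each source's quirks to a single function, while everything downstream (search, linking, display) only ever sees the shared record type.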