ObjectiveThe gold standard for diagnosing sleep disorders is polysomnography, which generates extensive data about biophysical changes occurring during sleep. We developed the National Sleep Research Resource (NSRR), a comprehensive system for sharing sleep data. The NSRR embodies elements of a data commons aimed at accelerating research to address critical questions about the impact of sleep disorders on important health outcomes.ApproachWe used a metadata-guided approach, with a set of common sleep-specific terms enforcing uniform semantic interpretation of data elements across three main components: (1) annotated datasets; (2) user interfaces for accessing data; and (3) computational tools for the analysis of polysomnography recordings. We incorporated the process for managing dataset-specific data use agreements, evidence of Institutional Review Board review, and the corresponding access control in the NSRR web portal. The metadata-guided approach facilitates structural and semantic interoperability, ultimately leading to enhanced data reusability and scientific rigor.ResultsThe authors curated and deposited retrospective data from 10 large, NIH-funded sleep cohort studies, including several from the Trans-Omics for Precision Medicine (TOPMed) program, into the NSRR. The NSRR currently contains data on 26 808 subjects and 31 166 signal files in European Data Format. Launched in April 2014, over 3000 registered users have downloaded over 130 terabytes of data.ConclusionsThe NSRR offers a use case and an example for creating a full-fledged data commons. It provides a single point of access to analysis-ready physiological signals from polysomnography obtained from multiple sources, and a wide variety of clinical data to facilitate sleep research.
Professional sleep societies have identified a need for strategic research in multiple areas that may benefit from access to and aggregation of large, multidimensional datasets. Technological advances provide opportunities to extract and analyze physiological signals and other biomedical information from datasets of unprecedented size, heterogeneity, and complexity. The National Institutes of Health has implemented a Big Data to Knowledge (BD2K) initiative that aims to develop and disseminate state of the art big data access tools and analytical methods. The National Sleep Research Resource (NSRR) is a new National Heart, Lung, and Blood Institute resource designed to provide big data resources to the sleep research community. The NSRR is a web-based data portal that aggregates, harmonizes, and organizes sleep and clinical data from thousands of individuals studied as part of cohort studies or clinical trials and provides the user a suite of tools to facilitate data exploration and data visualization. Each deidentified study record minimally includes the summary results of an overnight sleep study; annotation files with scored events; the raw physiological signals from the sleep record; and available clinical and physiological data. NSRR is designed to be interoperable with other public data resources such as the Biologic Specimen and Data Repository Information Coordinating Center Demographics (BioLINCC) data and analyzed with methods provided by the Research Resource for Complex Physiological Signals (PhysioNet). This article reviews the key objectives, challenges and operational solutions to addressing big data opportunities for sleep research in the context of the national sleep research agenda. It provides information to facilitate further interactions of the user community with NSRR, a community resource.
BackgroundThe National Sleep Research Resource (NSRR) is a large-scale, openly shared, data repository of de-identified, highly curated clinical sleep data from multiple NIH-funded epidemiological studies. Although many data repositories allow users to browse their content, few support fine-grained, cross-cohort query and exploration at study-subject level. We introduce a cross-cohort query and exploration system, called X-search, to enable researchers to query patient cohort counts across a growing number of completed, NIH-funded studies in NSRR and explore the feasibility or likelihood of reusing the data for research studies.MethodsX-search has been designed as a general framework with two loosely-coupled components: semantically annotated data repository and cross-cohort exploration engine. The semantically annotated data repository is comprised of a canonical data dictionary, data sources with a data dictionary, and mappings between each individual data dictionary and the canonical data dictionary. The cross-cohort exploration engine consists of five modules: query builder, graphical exploration, case-control exploration, query translation, and query execution. The canonical data dictionary serves as the unified metadata to drive the visual exploration interfaces and facilitate query translation through the mappings.ResultsX-search is publicly available at https://www.x-search.net/with nine NSRR datasets consisting of over 26,000 unique subjects. The canonical data dictionary contains over 900 common data elements across the datasets. X-search has received over 1800 cross-cohort queries by users from 16 countries.ConclusionsX-search provides a powerful cross-cohort exploration interface for querying and exploring heterogeneous datasets in the NSRR data repository, so as to enable researchers to evaluate the feasibility of potential research studies and generate potential hypotheses using the NSRR data.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.