To seek new signatures of illness in heart rate and oxygen saturation vital signs from Neonatal Intensive Care Unit (NICU) patients, we implemented highly comparative time-series analysis to discover features of all-cause mortality in the next 7 days. We collected 0.5 Hz heart rate and oxygen saturation vital signs of infants in the University of Virginia NICU from 2009 to 2019. We applied 4998 algorithmic operations from 11 mathematical families to random daily 10 min segments from 5957 NICU infants, 205 of whom died. We clustered the results and selected a representative from each, and examined multivariable logistic regression models. 3555 operations were usable; 20 cluster medoids held more than 81% of the information, and a multivariable model had AUC 0.83. New algorithms outperformed others: moving threshold, successive increases, surprise, and random walk. We computed provenance of the computations and constructed a software library with links to the data. We conclude that highly comparative time-series analysis revealed new vital sign measures to identify NICU patients at the highest risk of death in the next week.
Results of computational analyses require transparent disclosure of their supporting resources, while the analyses themselves often can be very large scale and involve multiple processing steps separated in time. Evidence for the correctness of any analysis includes accessible data and software, together with the runtime environment and personnel involved. Evidence graphs, a derivation of argumentation frameworks adapted to biological science, can provide this disclosure as machine-readable metadata resolvable from persistent identifiers for computationally generated graphs, images, or tables, which can be archived and cited in a publication including a persistent ID. We have built a cloud-based computational research commons for predictive analytics on biomedical time-series datasets, with hundreds of algorithms and thousands of computations, using a reusable computational framework we call FAIRSCAPE. FAIRSCAPE computes a complete chain of evidence on every result, including software, computations, and datasets. An ontology for Evidence Graphs, EVI (https://w3id.org/EVI), supports inferential reasoning over the evidence. FAIRSCAPE can run nested or disjoint workflows and preserves the provenance graph across them. It can run Apache Spark jobs, scripts, workflows, or user-supplied containers. All objects are assigned persistent IDs, including software. All results are annotated with FAIR metadata using the evidence graph model for access, validation, reproducibility, and re-use of archived data and software. FAIRSCAPE is a reusable computational framework, enabling simplified access to modern scalable cloud-based components. It fully implements the FAIR data principles and extends them to provide FAIR Evidence, including provenance of datasets, software and computations, as metadata for all computed results.
Objective: Signatures of illness in vital signs of Neonatal Intensive Care Unit (NICU) patients can inform on future adverse events and outcomes. We implemented highly comparative time-series analysis to discover features and predictive analytics tools for all-cause mortality in the next 7 days, using the ubiquitous HR and SpO2 vital sign data from bedside monitors. Design: We populated a Time Series Commons with the complete HR and SpO2 data from all infants in the University of Virginia NICU from 2009 to 2019. We calculated the results of applying over 80 members of 11 mathematical families to random ten-minute segments of 0.5 Hz data each day for each infant, with varying parameter sets, resulting in 4998 algorithmic operations on each infant. We used an unsupervised mutual information-based method to cluster the results, and we selected a single representative operation from each cluster. We used our FAIRSCAPE framework to compute a detailed provenance of all computations, and we constructed a complete software library with links to the analyzed data for reproducibility and reuse. We made multivariable logistic regression models using the lasso to assay the usefulness of the algorithms. Setting: Neonatal ICU. Patients: 5957 NICU infants, of whom 205 died. Measurements and main results: 3555 algorithmic operations returned usable results. Twenty representative operations, one selected from each of 20 unsupervised clusters, held more than 81% of the information predicting death. A multivariable model had an AUC of 0.83 for predicting death in the next 7 days. In addition, four algorithms outperformed others: moving threshold, successive increases, surprise, and a random walk model. Conclusions: Highly comparative time-series analysis revealed new vital sign metrics to identify NICU patients at the highest risk of death in the next week.
This approach can facilitate the discovery of signatures of impending, potentially actionable, clinical decompensation in monitored patients.
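The step of choosing one representative operation per cluster can be illustrated with a minimal sketch. It assumes a pairwise distance matrix between operations has already been computed (for instance, one derived from mutual information, as the abstract describes) and that cluster labels are given. The function name `cluster_medoids` and the toy numbers are hypothetical and do not come from the authors' software library.

```python
from typing import Dict, List, Sequence


def cluster_medoids(
    distance: Sequence[Sequence[float]], labels: Sequence[int]
) -> Dict[int, int]:
    """Return {cluster label: index of that cluster's medoid}.

    The medoid is the cluster member with the smallest total distance
    to the members of its own cluster.
    """
    members: Dict[int, List[int]] = {}
    for idx, label in enumerate(labels):
        members.setdefault(label, []).append(idx)

    medoids: Dict[int, int] = {}
    for label, idxs in members.items():
        medoids[label] = min(
            idxs, key=lambda i: sum(distance[i][j] for j in idxs)
        )
    return medoids


# Toy, made-up distances between six hypothetical operations (symmetric).
distance = [
    [0.0, 0.1, 0.2, 0.9, 0.8, 0.9],
    [0.1, 0.0, 0.1, 0.9, 0.9, 0.8],
    [0.2, 0.1, 0.0, 0.8, 0.9, 0.9],
    [0.9, 0.9, 0.8, 0.0, 0.3, 0.1],
    [0.8, 0.9, 0.9, 0.3, 0.0, 0.2],
    [0.9, 0.8, 0.9, 0.1, 0.2, 0.0],
]
labels = [0, 0, 0, 1, 1, 1]  # assumed output of a prior clustering step

medoids = cluster_medoids(distance, labels)
print(medoids)
```

In this toy example, operation 1 represents cluster 0 and operation 5 represents cluster 1; only those representatives would then be carried forward into a regression model.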
Results of computational analyses require transparent disclosure of their supporting resources, while the analyses themselves often can be very large scale and involve multiple processing steps separated in time. Evidence for the correctness of any analysis should include not only a textual description, but also a formal record of the computations which produced the result, including accessible data and software with runtime parameters, environment, and personnel involved. This article describes FAIRSCAPE, a reusable computational framework, enabling simplified access to modern scalable cloud-based components. FAIRSCAPE fully implements the FAIR data principles and extends them to provide fully FAIR Evidence, including machine-interpretable provenance of datasets, software and computations, as metadata for all computed results. The FAIRSCAPE microservices framework creates a complete Evidence Graph for every computational result, including persistent identifiers with metadata, resolvable to the software, computations, and datasets used in the computation; and stores a URI to the root of the graph in the result’s metadata. An ontology for Evidence Graphs, EVI (https://w3id.org/EVI), supports inferential reasoning over the evidence. FAIRSCAPE can run nested or disjoint workflows and preserves provenance across them. It can run Apache Spark jobs, scripts, workflows, or user-supplied containers. All objects are assigned persistent IDs, including software. All results are annotated with FAIR metadata using the evidence graph model for access, validation, reproducibility, and re-use of archived data and software.
Introduction: Transparency of computation is a requirement for assessing the validity of computed results and research claims based upon them; and it is essential for access to, assessment of, and reuse of computational components. These components may be subject to methodological or other challenges over time. While reference to archived software and/or data is increasingly common in publications, a single machine-interpretable, integrative representation of how results were derived, that supports defeasible reasoning, has been absent. Methods: We developed the Evidence Graph Ontology, EVI, in OWL 2, with a set of inference rules, to provide deep representations of supporting and challenging evidence for computations, services, software, data, and results, across arbitrarily deep networks of computations, in connected or fully distinct processes. EVI integrates FAIR practices on data and software with important concepts from provenance models and argumentation theory. It extends PROV for additional expressiveness, with support for defeasible reasoning. EVI treats any computational result or component of evidence as a defeasible assertion, supported by a DAG of the computations, software, data, and agents that produced it. Results: We have successfully deployed EVI for very-large-scale predictive analytics on clinical time-series data. Every result may reference its own evidence graph as metadata, which can be extended when subsequent computations are executed. Discussion: Evidence graphs support transparency and defeasible reasoning on results. They are first-class computational objects, and reference the datasets and software from which they are derived. They support fully transparent computation, with challenge and support propagation. The EVI approach may be extended to include instruments, animal models, and critical experimental reagents.
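The idea of a result supported by a DAG of computations, software, and data, with challenges propagating to everything that depends on a challenged component, can be sketched in a few lines. This is an illustrative toy model only: the class `Node`, the field `supported_by`, the two helper functions, and all identifiers are hypothetical and are not EVI terms or FAIRSCAPE APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Node:
    node_id: str
    kind: str  # e.g. "Dataset", "Software", "Computation"
    supported_by: List[str] = field(default_factory=list)


def evidence_closure(graph: Dict[str, Node], root: str) -> Set[str]:
    """Collect root plus every node reachable through supported_by edges."""
    seen: Set[str] = set()
    stack = [root]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph[current].supported_by)
    return seen


def affected_by_challenge(graph: Dict[str, Node], challenged: str) -> Set[str]:
    """Nodes whose supporting evidence includes the challenged node."""
    return {
        node_id
        for node_id in graph
        if node_id != challenged
        and challenged in evidence_closure(graph, node_id)
    }


# Hypothetical two-step pipeline: feature extraction, then model fitting.
nodes = [
    Node("raw-vitals", "Dataset"),
    Node("feature-software", "Software"),
    Node("feature-computation", "Computation",
         ["raw-vitals", "feature-software"]),
    Node("features", "Dataset", ["feature-computation"]),
    Node("model-software", "Software"),
    Node("model-computation", "Computation", ["features", "model-software"]),
    Node("auc-result", "Dataset", ["model-computation"]),
]
graph = {n.node_id: n for n in nodes}

full_evidence = evidence_closure(graph, "auc-result")
affected = affected_by_challenge(graph, "feature-software")
print(sorted(affected))
```

Here a challenge to the hypothetical feature-extraction software reaches the final result through two computations, while the raw data and the modeling software remain unaffected.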