March through May 2020, a model of novel coronavirus (COVID-19) disease progression and treatment was constructed for the open-source Synthea patient simulation. The model was constructed using three peer-reviewed publications published in the early stages of the global pandemic, when less was known, along with emerging resources, data, publications, and clinical knowledge. The simulation outputs synthetic Electronic Health Records (EHR), including the daily consumption of Personal Protective Equipment (PPE) and other medical devices and supplies. For this simulation, we generated 124,150 synthetic patients, with 88,166 infections and 18,177 hospitalized patients. Patient symptoms, disease severity, and morbidity outcomes were calibrated using clinical data from the peer-reviewed publications. 4.1% of all simulated infected patients died and 20.6% were hospitalized. At peak observation, 548 dialysis machines and 209 mechanical ventilators were needed. This simulation and the resulting data have been used for the development of algorithms and prototypes designed to address the current or future pandemics, and the model can continue to be refined to incorporate emerging COVID-19 knowledge, variations in patterns of care, and improvement in clinical outcomes. The resulting model, data, and analysis are available as open-source code on GitHub and an open-access data set is available for download.
The “Coherent Data Set” is a novel synthetic data set that leverages structured data from Synthea™ to create a longitudinal, “coherent” patient-level electronic health record (EHR). Comprised of synthetic patients, the Coherent Data Set is publicly available, reproducible using Synthea™, and free of the privacy risks that arise from using real patient data. The Coherent Data Set provides complex and representative health records that can be leveraged by health IT professionals without the risks associated with de-identified patient data. It includes familial genomes that were created through a simulation of the genetic reproduction process; magnetic resonance imaging (MRI) DICOM files created with a voxel-based computational model; clinical notes in the style of traditional subjective, objective, assessment, and plan notes; and physiological data that leverage existing System Biology Markup Language (SBML) models to capture non-linear changes in patient health metrics. HL7 Fast Healthcare Interoperability Resources (FHIR®) links the data together. The models can generate clinically logical health data, but ensuring clinical validity remains a challenge without comparable data to substantiate results. We believe this data set is the first of its kind and a novel contribution to practical health interoperability efforts.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.