Objective: This study aimed to develop a novel, regulatory-compliant approach for openly exposing integrated clinical and environmental exposures data: the Integrated Clinical and Environmental Exposures Service (ICEES). Materials and Methods: The driving clinical use case for research and development of ICEES was asthma, which is a common disease influenced by hundreds of genes and a plethora of environmental exposures, including exposures to airborne pollutants. We developed a pipeline for integrating clinical data on patients with asthma-like conditions with data on environmental exposures derived from multiple public data sources. The data were integrated at the patient and visit level and used to create de-identified, binned, “integrated feature tables,” which were then placed behind an OpenAPI. Results: Our preliminary evaluation results demonstrate a relationship between exposure to high levels of particulate matter ≤2.5 µm in diameter (PM2.5) and the frequency of emergency department or inpatient visits for respiratory issues. For example, 16.73% of patients with average daily exposure to PM2.5 >9.62 µg/m³ experienced 2 or more emergency department or inpatient visits for respiratory issues in year 2010 compared with 7.93% of patients with lower exposures (n = 23 093). Discussion: The results validated our overall approach for openly exposing and sharing integrated clinical and environmental exposures data. We plan to iteratively refine and expand ICEES by including additional years of data, feature variables, and disease cohorts. Conclusions: We believe that ICEES will serve as a regulatory-compliant model and approach for promoting open access to and sharing of integrated clinical and environmental exposures data.
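The binning and comparison described in the abstract above can be sketched as follows. This is a minimal illustration, not the ICEES implementation: the `PatientYear` record, the bin labels, and the toy rows are all assumptions made for the example, and only the 9.62 µg/m³ cut point comes from the abstract.

```python
from dataclasses import dataclass

# Cut point quoted in the abstract (µg/m³); everything else here is hypothetical.
PM25_CUT = 9.62


@dataclass(frozen=True)
class PatientYear:
    """One de-identified patient-year row (hypothetical layout)."""
    avg_daily_pm25: float
    respiratory_visits: int  # emergency department or inpatient visits


def bin_row(row):
    """Replace raw values with coarse bins, as a binned feature table would."""
    exposure_bin = "high" if row.avg_daily_pm25 > PM25_CUT else "low"
    visits_bin = "2+" if row.respiratory_visits >= 2 else "0-1"
    return exposure_bin, visits_bin


def share_with_repeat_visits(rows):
    """Percentage of patients per exposure bin with 2 or more visits."""
    totals = {}
    repeats = {}
    for row in rows:
        exposure_bin, visits_bin = bin_row(row)
        totals[exposure_bin] = totals.get(exposure_bin, 0) + 1
        if visits_bin == "2+":
            repeats[exposure_bin] = repeats.get(exposure_bin, 0) + 1
    return {b: 100.0 * repeats.get(b, 0) / n for b, n in totals.items()}


# Toy rows, invented purely to exercise the functions.
toy_rows = [
    PatientYear(12.1, 3),
    PatientYear(10.4, 0),
    PatientYear(9.62, 2),  # exactly at the cut point, so it falls in "low"
    PatientYear(7.8, 0),
    PatientYear(6.5, 1),
    PatientYear(11.0, 2),
]
shares = share_with_repeat_visits(toy_rows)
print(shares)
```

Working on bins rather than raw values is what allows a table of this kind to be exposed openly; the percentages reported in the abstract are the same kind of per-bin share, computed on the real cohort.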
Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph-based data models elucidate the interconnectedness among core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these “knowledge graphs” (KGs) has remained difficult. Data set heterogeneity and complexity; the proliferation of ad hoc data formats; poor compliance with guidelines on findability, accessibility, interoperability, and reusability; and, in particular, the lack of a universally accepted, open-access model for standardization across biomedical KGs have left the task of reconciling data sources to downstream consumers. Biolink Model is an open-source data model that can be used to formalize the relationships between data structures in translational science. It incorporates object-oriented classification and graph-oriented features. The core of the model is a set of hierarchical, interconnected classes (or categories) and relationships between them (or predicates) representing biomedical entities such as gene, disease, chemical, anatomic structure, and phenotype. The model provides class and edge attributes and associations that guide how entities should relate to one another. Here, we highlight the need for a standardized data model for KGs, describe Biolink Model, and compare it with other models. We demonstrate the utility of Biolink Model in various initiatives, including the Biomedical Data Translator Consortium and the Monarch Initiative, and show how it has supported easier integration and interoperability of biomedical KGs, bringing together knowledge from multiple sources and helping to realize the goals of translational science.
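To make the idea of hierarchical categories and predicates concrete, here is a small sketch of a category-aware query over a toy graph. It is an assumption-laden illustration: the `PARENT` table is a simplified, hand-written hierarchy rather than a copy of any Biolink Model release, the `EX:` identifiers are placeholders, and the helper functions are hypothetical and not part of any Biolink tooling.

```python
# Simplified, hand-written category hierarchy (child -> parent).
# Category names follow Biolink naming, but this table is illustrative only.
PARENT = {
    "biolink:Gene": "biolink:BiologicalEntity",
    "biolink:Disease": "biolink:DiseaseOrPhenotypicFeature",
    "biolink:PhenotypicFeature": "biolink:DiseaseOrPhenotypicFeature",
    "biolink:DiseaseOrPhenotypicFeature": "biolink:BiologicalEntity",
    "biolink:BiologicalEntity": "biolink:NamedThing",
}


def ancestors(category):
    """Return the category followed by all of its ancestors, nearest first."""
    chain = [category]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def is_a(category, ancestor):
    """True if `category` equals `ancestor` or descends from it."""
    return ancestor in ancestors(category)


# Toy graph: node identifier -> category, plus subject/predicate/object edges.
nodes = {
    "EX:gene1": "biolink:Gene",
    "EX:disease1": "biolink:Disease",
    "EX:phenotype1": "biolink:PhenotypicFeature",
}
edges = [
    {"subject": "EX:gene1",
     "predicate": "biolink:gene_associated_with_condition",
     "object": "EX:disease1"},
    {"subject": "EX:disease1",
     "predicate": "biolink:has_phenotype",
     "object": "EX:phenotype1"},
]


def edges_to(category):
    """Edges whose object node falls under `category` in the hierarchy."""
    return [e for e in edges if is_a(nodes[e["object"]], category)]


hits = edges_to("biolink:DiseaseOrPhenotypicFeature")
print([e["object"] for e in hits])
```

Because both sources of edges share one category hierarchy, a single query against a parent category retrieves disease and phenotype edges alike, which is the kind of cross-source integration the abstract attributes to a shared model.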
Background: Informatics tools to support the integration and subsequent interrogation of spatiotemporal data such as clinical data and environmental exposures data are lacking. Such tools are needed to support research in environmental health and any biomedical field that is challenged by the need for integrated spatiotemporal data to examine individual-level determinants of health and disease. Results: We have developed an open-source software application, FHIR PIT (Health Level 7 Fast Healthcare Interoperability Resources Patient data Integration Tool), to enable studies on the impact of individual-level environmental exposures on health and disease. FHIR PIT was motivated by the need to integrate patient data derived from our institution's clinical warehouse with a variety of public data sources on environmental exposures and then openly expose the data via ICEES (Integrated Clinical and Environmental Exposures Service). FHIR PIT consists of transformation steps or building blocks that can be chained together to form a transformation and integration workflow. Several transformation steps are generic and thus can be reused. As such, new types of data can be incorporated into the modular FHIR PIT pipeline by simply reusing generic steps or adding new ones. We validated FHIR PIT in the context of a driving use case designed to investigate the impact of airborne pollutant exposures on asthma. Specifically, we replicated published findings demonstrating racial disparities in the impact of airborne pollutants on asthma exacerbations. Conclusions: While FHIR PIT was developed to support our driving use case on asthma, the software can be used to integrate any type and number of spatiotemporal data sources at a level of granularity that enables individual-level study. We expect FHIR PIT to facilitate research in environmental health and numerous other biomedical disciplines.
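The notion of reusable transformation steps chained into a workflow can be sketched in a few lines. This does not reproduce FHIR PIT's actual API or data model: `chain`, `join_exposure`, `drop_fields`, the record fields, and the sample values are all hypothetical names and data chosen for illustration.

```python
from functools import reduce


def chain(*steps):
    """Compose steps left to right into a single workflow function."""
    def run(records):
        return reduce(lambda acc, step: step(acc), steps, records)
    return run


def join_exposure(exposure_by_key):
    """Generic step: attach an exposure value looked up by (grid_cell, date)."""
    def step(records):
        return [
            {**r, "pm25": exposure_by_key.get((r["grid_cell"], r["date"]))}
            for r in records
        ]
    return step


def drop_fields(*fields):
    """Generic step: remove the named fields from every record."""
    def step(records):
        return [{k: v for k, v in r.items() if k not in fields} for r in records]
    return step


# Invented sample data: visit records and a spatiotemporal exposure lookup.
visits = [
    {"patient": "p1", "grid_cell": "A", "date": "2010-07-01"},
    {"patient": "p2", "grid_cell": "B", "date": "2010-07-01"},
]
exposure = {("A", "2010-07-01"): 11.3, ("B", "2010-07-01"): 8.4}

workflow = chain(join_exposure(exposure), drop_fields("grid_cell", "date"))
result = workflow(visits)
print(result)
```

Each step takes and returns a list of records, so incorporating a new data source amounts to adding another call to `join_exposure` with a different lookup table, or writing one new step, without touching the rest of the chain.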