Journal Pre-proof J o u r n a l P r e -p r o o f 2 GIS-based Spatial Modeling of COVID-19 Incidence Rate in the Continental United States AbstractDuring the first 90 days of the COVID-19 outbreak in the United States, over 675,000 confirmed cases of the disease have been announced, posing unprecedented socioeconomic burden to the country. Due to inadequate research on geographic modeling of COVID-19, we investigated county-level variations of disease incidence across the continental United States. We compiled a geodatabase of 35 environmental, socioeconomic, topographic, and demographic variables that could explain the spatial variability of disease incidence. Further, we employed spatial lag and spatial error models to investigate spatial dependence and geographically weighted regression (GWR) and multiscale GWR (MGWR) models to locally examine spatial non-stationarity. The results suggested that even though incorporating spatial autocorrelation could significantly improve the performance of the global ordinary least square model; these models still represent a significantly poor performance compared to the local models. Moreover, MGWR could explain the highest variations (adj. R 2 : 68.1%) with the lowest AICc compared to the others. Mapping the effects of significant explanatory variables (i.e., income inequality, median household income, the proportion of black females, and the proportion of nurse practitioners) on spatial variability of COVID-19 incidence rates using MGWR could provide useful insights to policymakers for targeted interventions.
Prediction of the COVID-19 incidence rate is a matter of global importance, particularly in the United States. As of 4 June 2020, more than 1.8 million confirmed cases and over 108 thousand deaths have been reported in this country. Few studies have examined nationwide modeling of COVID-19 incidence in the United States particularly using machine-learning algorithms. Thus, we collected and prepared a database of 57 candidate explanatory variables to examine the performance of multilayer perceptron (MLP) neural network in predicting the cumulative COVID-19 incidence rates across the continental United States. Our results indicated that a single-hidden-layer MLP could explain almost 65% of the correlation with ground truth for the holdout samples. Sensitivity analysis conducted on this model showed that the age-adjusted mortality rates of ischemic heart disease, pancreatic cancer, and leukemia, together with two socioeconomic and environmental factors (median household income and total precipitation), are among the most substantial factors for predicting COVID-19 incidence rates. Moreover, results of the logistic regression model indicated that these variables could explain the presence/absence of the hotspots of disease incidence that were identified by Getis-Ord Gi* (p < 0.05) in a geographic information system environment. The findings may provide useful insights for public health decision makers regarding the influence of potential risk factors associated with the COVID-19 incidence at the county level.
Highlights Lower respiratory infections (LRI) are the cause of a significant number of hospitalizations in the US. No previous nationwide study examined geographic variations of LRI mortality rates and their association with underlying factors. There was a shift in the location of LRI hotspots from west coast to southeast over time. Decision tree classifiers could predict LRI mortality hotspots with high accuracies. Higher spring temperature and increased precipitation during winter were among the most substantial predictors of presence or absence of LRI hotspots.
Measurements of human interaction through proxies such as social connectedness or movement patterns have proved useful for predictive modeling of COVID-19, which is a challenging task, especially at high spatial resolutions. In this study, we develop a Spatiotemporal autoregressive model to predict county-level new cases of COVID-19 in the coterminous US using spatiotemporal lags of infection rates, human interactions, human mobility, and socioeconomic composition of counties as predictive features. We capture human interactions through 1) Facebook- and 2) cell phone-derived measures of connectivity and human mobility, and use them in two separate models for predicting county-level new cases of COVID-19. We evaluate the model on 14 forecast dates between 2020/10/25 and 2021/01/24 over one- to four-week prediction horizons. Comparing our predictions with a Baseline model developed by the COVID-19 Forecast Hub indicates an average 6.46% improvement in prediction Mean Absolute Errors (MAE) over the two-week prediction horizon up to 20.22% improvement in the four-week prediction horizon, pointing to the strong predictive power of our model in the longer prediction horizons.
With COVID-19 affecting every country globally and changing everyday life, the ability to forecast the spread of the disease is more important than any previous epidemic. The conventional methods of disease-spread modeling, compartmental models, are based on the assumption of spatiotemporal homogeneity of the spread of the virus, which may cause forecasting to underperform, especially at high spatial resolutions. In this paper, we approach the forecasting task with an alternative technique—spatiotemporal machine learning. We present COVID-LSTM, a data-driven model based on a long short-term memory deep learning architecture for forecasting COVID-19 incidence at the county level in the USA. We use the weekly number of new positive cases as temporal input, and hand-engineered spatial features from Facebook movement and connectedness datasets to capture the spread of the disease in time and space. COVID-LSTM outperforms the COVID-19 Forecast Hub’s Ensemble model (COVIDhub-ensemble) on our 17-week evaluation period, making it the first model to be more accurate than the COVIDhub-ensemble over one or more forecast periods. Over the 4-week forecast horizon, our model is on average 50 cases per county more accurate than the COVIDhub-ensemble. We highlight that the underutilization of data-driven forecasting of disease spread prior to COVID-19 is likely due to the lack of sufficient data available for previous diseases, in addition to the recent advances in machine learning methods for spatiotemporal forecasting. We discuss the impediments to the wider uptake of data-driven forecasting, and whether it is likely that more deep learning-based models will be used in the future.
Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident cases, incident hospitalizations, incident deaths, and cumulative deaths due to COVID-19 at county, state, and national, levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages.
No abstract
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.