Background: Infectious diarrhea can lead to a considerable global disease burden. Thus, the accurate prediction of an infectious diarrhea epidemic is crucial for public health authorities. This study aimed to develop an optimal random forest (RF) model that incorporates meteorological factors to predict the incidence of infectious diarrhea in Jiangsu Province, China. Methods: An RF model was developed and compared with classical autoregressive integrated moving average (ARIMA)/X models. Morbidity and meteorological data from 2012 to 2016 were used to construct the models, and the data from 2017 were used for testing. Results: The RF model used atmospheric pressure, precipitation, relative humidity, and their lagged terms, as well as 1-4 week lagged morbidity and a time variable, as predictors. Meanwhile, a univariate model, ARIMA(1,0,1)(1,0,0)52 (AIC = −575.92, BIC = −558.14), and a multivariable model, ARIMAX(1,0,1)(1,0,0)52 with 0-1 week lagged precipitation (AIC = −578.58, BIC = −578.13), were developed as benchmarks. The RF model outperformed the ARIMA/X models, with a mean absolute percentage error (MAPE) of approximately 20%. The performance of the ARIMAX model was comparable to that of the ARIMA model, with a MAPE of approximately 30%. Conclusions: The RF model fitted the dynamic nature of an infectious diarrhea epidemic well and delivered an ideal prediction accuracy. It combined the synchronous and lagged effects of meteorological factors and also integrated the autocorrelation and seasonality of the morbidity. The RF model can be used to predict the epidemic level and has a high potential for practical implementation.
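The predictor set described in this abstract (1-4 week lagged morbidity, synchronous and lagged meteorological terms, and a time variable) can be illustrated with a minimal sketch. This is not the authors' code: the function names `build_lagged_rows` and `chronological_split`, the default lags, and the toy series are assumptions made purely for illustration, and the resulting rows could be passed to any regression learner.

```python
from typing import Dict, List, Sequence, Tuple


def build_lagged_rows(
    morbidity: Sequence[float],
    weather: Dict[str, Sequence[float]],
    morbidity_lags: Sequence[int] = (1, 2, 3, 4),  # assumed: 1-4 week lags
    weather_lags: Sequence[int] = (0, 1),          # assumed: synchronous + 1-week lag
) -> Tuple[List[List[float]], List[float], List[str]]:
    """Build (X, y, feature_names) for a one-step-ahead regression."""
    max_lag = max(list(morbidity_lags) + list(weather_lags))
    names = [f"morbidity_lag{k}" for k in morbidity_lags]
    for var in weather:
        names += [f"{var}_lag{k}" for k in weather_lags]
    names.append("week_index")

    X: List[List[float]] = []
    y: List[float] = []
    # Start at max_lag so that every lagged value exists.
    for t in range(max_lag, len(morbidity)):
        row = [float(morbidity[t - k]) for k in morbidity_lags]
        for series in weather.values():
            row += [float(series[t - k]) for k in weather_lags]
        row.append(float(t))  # simple time variable
        X.append(row)
        y.append(float(morbidity[t]))
    return X, y, names


def chronological_split(X, y, n_test: int):
    """Hold out the last n_test rows, mirroring a train-on-past, test-on-latest design."""
    if not 0 < n_test < len(X):
        raise ValueError("n_test must be between 1 and len(X) - 1")
    return X[:-n_test], y[:-n_test], X[-n_test:], y[-n_test:]


# Toy weekly series (hypothetical values, not data from the study).
toy_morbidity = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_weather = {"precipitation": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]}
X, y, names = build_lagged_rows(toy_morbidity, toy_weather)
X_train, y_train, X_test, y_test = chronological_split(X, y, n_test=1)
```

Splitting by time rather than at random keeps future observations out of the training set, which matches the 2012-2016 versus 2017 design described above.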
Background: Hand, foot and mouth disease (HFMD) is a rising public health problem and has attracted considerable attention worldwide. The purpose of this study was to develop an optimal model with meteorological factors to predict HFMD epidemics. Methods: Two types of methods, back propagation (BP) neural networks and autoregressive integrated moving average (ARIMA) models, were employed to develop forecasting models, based on the monthly HFMD incidence and meteorological factors during 2009–2016 in Jiangsu Province, China. Root mean square error (RMSE) and mean absolute percentage error (MAPE) were employed to select models and evaluate their performance. Results: Four models were constructed. The multivariate BP model was constructed using the HFMD incidence lagged from 1 to 4 months, mean temperature, rainfall, and their first-order lagged terms as inputs. The other BP model was fitted using only the lagged HFMD incidence as inputs. The univariate ARIMA model was specified as ARIMA(1,0,1)(1,1,0)12 (AIC = 1132.12, BIC = 1440.43). The multivariate ARIMAX model, with first-order lagged temperature as an external predictor, was fitted based on this ARIMA model (AIC = 1132.37, BIC = 1142.76). The multivariate BP model performed best in both the model fitting stage and the prospective forecasting stage, with a MAPE of no more than 20%. The performance of the multivariate ARIMAX model was similar to that of the univariate ARIMA model. Both performed much worse than the two BP models, with a high MAPE close to 40%. Conclusion: The multivariate BP model effectively integrated the autocorrelation of the HFMD incidence series. It also combined the climatic variables and their hysteresis effects. The introduction of the climate terms significantly improved the prediction accuracy of the BP model. This model could be an ideal method to predict the epidemic level of HFMD, which is of great importance for public health authorities.
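The two error measures named in this abstract, RMSE and MAPE, have standard definitions that can be sketched in a few lines. The function names and the toy numbers below are assumptions for illustration only; the abstract does not state how zero observations were handled, so this sketch simply rejects them for MAPE.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error: sqrt of the mean squared difference."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # Assumption: MAPE is undefined when an observed value is zero.
        raise ValueError("MAPE is undefined for zero observations")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical observed and forecast incidences (not data from the study).
observed = [100.0, 200.0]
forecast = [110.0, 180.0]
example_rmse = rmse(observed, forecast)
example_mape = mape(observed, forecast)  # each forecast is off by 10% of the observation
```

RMSE is expressed in the units of the series and penalizes large misses, while MAPE is scale-free, which is why a statement such as "a MAPE of no more than 20%" can be compared across diseases with different incidence levels.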
We described the epidemiological characteristics of infectious diarrhoea in Jiangsu Province, China. Generalized additive models were employed to evaluate the age-specific effects of etiological and meteorological factors on prevalence. A long-term increasing prevalence with strong seasonality was observed. In those aged 0–5 years, disease risk increased rapidly with the positive rate of viruses (rotavirus, norovirus, sapovirus, astrovirus) in the 20–50% range. In those aged >20 years, disease risk increased with the positive rate of adenovirus and bacteria (Vibrio parahaemolyticus, Salmonella, Escherichia coli, Campylobacter jejuni) until reaching 5%, and thereafter remained stable. The mean temperature, relative humidity, temperature range, and rainfall were all related to two-month lagged morbidity in the group aged 0–5 years. Disease risk increased with relative humidity between 67% and 78%. Synchronous climate affected the incidence in those aged >20 years. Mean temperature and rainfall showed U-shaped associations with disease risk (with thresholds of 15 °C and 100 mm per month, respectively). Meanwhile, disease risk increased gradually with sunshine duration over 150 hours per month. However, no associations were found in the group aged 6–19 years. In brief, etiological and meteorological factors had age-specific effects on the prevalence of infectious diarrhoea in Jiangsu. Surveillance efforts are needed to prevent its spread.
Influenza activity is subject to environmental factors. Accurate forecasting of influenza epidemics would permit timely and effective implementation of public health interventions, but it remains challenging. In this study, we aimed to develop random forest (RF) regression models including meteorological factors to predict seasonal influenza activity in Jiangsu Province, China. The coefficient of determination (R²) and mean absolute percentage error (MAPE) were employed to evaluate the models' performance. Three RF models with optimum parameters were constructed to predict influenza-like illness (ILI) activity and the influenza A and B (Flu-A and Flu-B) positive rates in Jiangsu. The models for Flu-B and ILI presented excellent performance, with MAPEs <10%. The predicted values of the Flu-A model also matched the real trend very well, although its MAPE reached 19.49% in the test set. The lagged dependent variables were vital predictors in each model. Seasonality was more pronounced in the models for ILI and Flu-A. The modification effects of the meteorological factors and their lagged terms on the prediction accuracy differed across the three models, while temperature always played an important role. Notably, atmospheric pressure made a major contribution to ILI and Flu-B forecasting. In brief, RF models performed well in influenza activity prediction. The impacts of meteorological factors on the predictive models for influenza activity are type-specific.
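The coefficient of determination used alongside MAPE in this abstract is commonly computed as one minus the ratio of residual to total sum of squares. The sketch below uses that common definition; whether the authors used exactly this form is an assumption, and the function name and toy values are hypothetical.

```python
from typing import Sequence


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """R^2 = 1 - SS_res / SS_tot (common definition; assumed here)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the observations are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical positive rates (not data from the study).
observed_rates = [1.0, 2.0, 3.0]
perfect_fit = r_squared(observed_rates, [1.0, 2.0, 3.0])
mean_only_fit = r_squared(observed_rates, [2.0, 2.0, 2.0])  # always predicts the mean
```

Under this definition a model that always predicts the mean of the observations scores 0, so R² shows how much of the variation a model tracks, while MAPE reports the typical relative size of the errors.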