2020
DOI: 10.1136/bmjopen-2020-039676

Comparison of ARIMA model and XGBoost model for prediction of human brucellosis in mainland China: a time-series study

Abstract: Objectives: Human brucellosis is a public health problem endangering health and property in China. Predicting the trend and the seasonality of human brucellosis is of great significance for its prevention. In this study, a comparison between the autoregressive integrated moving average (ARIMA) model and the eXtreme Gradient Boosting (XGBoost) model was conducted to determine which was more suitable for predicting the occurrence of brucellosis in mainland China. Design: Time-series study. Setting: Mainland China. Method…

Cited by 56 publications (49 citation statements)
References 43 publications
“…It is good at dealing with nonlinear data but has poor interpretability. From studies in other fields, the XGBoost model performed well in predicting nonlinear time series [28–31]. By integrating multiple CART models, XGBoost model can achieve a better generalizability than a single model, which means that the XGBoost has a larger postpruning penalty than a GBDT model and makes the learned model less prone to overfitting.…”
Section: Discussion (mentioning)
confidence: 99%
“…As our time-series data showed stationary characteristics without seasonality, which means that the mean, variance, and covariance of data were invariant to time, in the augmented Dickey-Fuller test (p < 0.001) [66], we applied the non-seasonal ARIMA (1, 0, 1) model using the following parameters, with the lowest Akaike Information Criterion value (388.7): (1) 1 of autoregression (p) from the autocorrelation function of residuals, (2) 0 of degree of differencing (integrated, d), and (3) 1 of size of the moving average window (q) from the partial autocorrelation function of residuals (Supplementary Figure S3) [59,67,68]. As the observed PCP-confirmed cases did not exist for several months, and average numbers of observed PCP-confirmed cases per month in each year were very small in the pre- and post-COVID-19 periods, we did not perform the time-series analysis for HSCT recipients, chronic lung disease, and HIV-1-infected individuals in the ARIMA and BSTS model.…”
Section: Clinical Information Of Total PCP-confirmed Inpatients (mentioning)
confidence: 99%
“…In this paper, we used two machine learning methods, Random Forest and XGBoost, to perform our analysis and prediction on HFMD incidence. There have been a large number of studies focusing on machine learning methods to analyze different infectious diseases, and to perform prediction about the incidence of diseases, such as dengue [30, 33–35], polio [36], human brucellosis [37], malaria [38], and COVID-19 [39, 40]. It is also applied for the prediction of HFMD from meteorological factors in a single province in China [41].…”
Section: Discussion (mentioning)
confidence: 99%
“…It is also applied for the prediction of HFMD from meteorological factors in a single province in China [41]. Several studies using machine learning methods compared the performance of different prediction methods [30, 34, 37]. A study on human brucellosis in mainland China stated that XGBoost model is more suitable for prediction cases of human brucellosis in mainland China than ARIMA model [37].…”
Section: Discussion (mentioning)
confidence: 99%