We propose a novel data-driven machine learning method for influenza forecasting based on multi-stage forecasting with long short-term memory (LSTM) networks. The novel aspects of the method are 1) the use of an LSTM to capture the temporal dynamics of seasonal flu and 2) a technique to capture the influence of external variables, including geographical proximity and climatic variables such as humidity, temperature, precipitation, and sun exposure. The proposed model is compared against two state-of-the-art techniques on two publicly available datasets. Our proposed method performs better than the existing well-known influenza forecasting methods. The results offer a promising direction both for data-driven forecasting methods and for capturing the influence of spatio-temporal and environmental factors to improve influenza forecasting.

INDEX TERMS Influenza forecasting, LSTM, recurrent neural networks, spatio-temporal data, time series forecasting.
Abstract: We provide data-driven machine learning methods capable of making real-time influenza forecasts that integrate the impacts of climatic factors and geographical proximity to achieve better forecasting performance. The key contributions of our approach are the application of deep learning methods and the incorporation of environmental and spatio-temporal factors to improve the performance of influenza forecasting models. We evaluate the method on influenza-like illness (ILI) counts and climatic data, both of which are publicly available data sets. Our proposed method outperforms existing known influenza forecasting methods in terms of mean absolute percentage error (MAPE) and root mean square error (RMSE). The key advantages of the proposed data-driven methods are as follows: (1) the deep learning model effectively captures the temporal dynamics of flu spread in different geographical regions; (2) the extensions to the deep learning model capture the influence of external variables, including geographical proximity and climatic variables such as humidity, temperature, precipitation, and sun exposure, in future stages; (3) the model consistently performs well at both the city scale and the regional scale on the Google Flu Trends (GFT) and Centers for Disease Control and Prevention (CDC) flu counts. The results offer a promising direction both for data-driven forecasting methods and for capturing the influence of spatio-temporal and environmental factors in influenza forecasting.
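The two error measures named above can be illustrated with a minimal sketch. It assumes the standard textbook definitions of MAPE (reported here as a percentage) and RMSE; the abstract does not state the exact formulas used in the evaluation, so the definitions, the function names `mape` and `rmse`, and the sample ILI counts below are illustrative assumptions rather than the authors' code or data.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (standard definition, assumed)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # MAPE is undefined when an observed value is zero.
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def rmse(actual, predicted):
    """Root mean square error (standard definition, assumed)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Made-up weekly ILI counts, for illustration only (not from the paper).
observed = [100, 200, 400]
forecast = [110, 180, 400]

mape_value = mape(observed, forecast)
rmse_value = rmse(observed, forecast)
print(f"MAPE = {mape_value:.2f}%, RMSE = {rmse_value:.2f}")
```

MAPE is scale-free, which makes it convenient for comparing city-scale and regional-scale counts, while RMSE stays in the units of the counts and penalizes large misses more heavily.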