Time series data sets often have missing or corrupted entries, which need to be handled in subsequent data analysis. For example, in the context of space physics, calibration issues, satellite telemetry issues, and unexpected events can make parts of a time series unusable. This causes problems for understanding the dynamics of the heliosphere and space weather environment. Various approaches exist to tackle this problem, including mean/median imputation, linear interpolation, and autoregressive modeling. Here, we study the utility of artificial neural networks (ANNs) to predict statistics of sparse time series. Our focus is not on time series prediction but on gleaning the best possible information about the statistical behavior of the system. As an example application, we focus on the structure functions of turbulent time series measured in the solar wind. Using a data set with artificial gaps, a neural network is trained to predict second-order structure functions and then tested on an unseen data set to quantify its performance. A small feedforward ANN, with only 20 hidden neurons, can predict the large-scale fluctuation amplitudes better than mean imputation or linear interpolation when the percentage of missing data is high. Although the ANN performs worse than the other methods at capturing both the shape and the fluctuation amplitude together, its performance is better in a statistical sense for large fractions of missing data. Caveats regarding the utility of ANNs, the optimization procedure, and potential future improvements are discussed.
• Small artificial neural networks (ANNs) are good at predicting large-scale values of structure functions.
• An ANN with only 20 hidden neurons statistically outperforms simple imputation techniques for large fractions of missing data.
• More work is needed to improve the ANN's performance in predicting both large- and small-scale values.
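For orientation, the two baseline gap-filling methods named in the abstract can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes the common definition of the second-order structure function, S2(lag) = <(x[t+lag] - x[t])^2>, uses a synthetic random walk as a stand-in for solar wind data, and removes points uniformly at random, whereas real gaps are typically contiguous. All function names are hypothetical.

```python
import numpy as np

def structure_function_2(x, lags):
    """Second-order structure function for integer lags >= 1.

    Pairs in which either sample is NaN are ignored.
    """
    x = np.asarray(x, dtype=float)
    return np.array([np.nanmean((x[lag:] - x[:-lag]) ** 2) for lag in lags])

def mean_impute(x):
    """Replace NaN entries with the mean of the observed samples."""
    x = np.asarray(x, dtype=float).copy()
    x[np.isnan(x)] = np.nanmean(x)
    return x

def linear_interpolate(x):
    """Fill NaN entries by linear interpolation between observed samples.

    Gaps at either end are filled with the nearest observed value.
    """
    x = np.asarray(x, dtype=float).copy()
    t = np.arange(x.size)
    good = ~np.isnan(x)
    x[~good] = np.interp(t[~good], t[good], x[good])
    return x

# Synthetic stand-in signal (assumption): a random walk.
rng = np.random.default_rng(0)
signal = np.cumsum(rng.standard_normal(5000))

# Artificial gaps (assumption): drop roughly half the points at random.
gapped = signal.copy()
gapped[rng.random(signal.size) < 0.5] = np.nan

lags = np.array([1, 10, 100])
s2_true = structure_function_2(signal, lags)
s2_mean = structure_function_2(mean_impute(gapped), lags)
s2_lint = structure_function_2(linear_interpolate(gapped), lags)
```

Comparing `s2_mean` and `s2_lint` against `s2_true` at each lag gives a rough picture of how each imputation method distorts the structure function as the fraction of missing data grows.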