Predicting groundwater availability is important to water sustainability and drought mitigation. Machine-learning tools have the potential to improve groundwater prediction, thus enabling resource planners to: (1) anticipate water quality in unsampled areas or depth zones; (2) design targeted monitoring programs; (3) inform groundwater protection strategies; and (4) evaluate the sustainability of groundwater sources of drinking water. This paper proposes a machine-learning approach to groundwater prediction with the following characteristics: (i) the use of a regression-based approach to predict full groundwater images based on sequences of monthly groundwater maps; (ii) strategic automatic feature selection (both local and global features) using extreme gradient boosting; and (iii) the use of multiple machine-learning techniques (extreme gradient boosting, multivariate linear regression, random forests, multilayer perceptron and support vector regression). Of these techniques, support vector regression consistently performed best in terms of minimizing root mean square error and mean absolute error. Furthermore, including a global feature obtained from a Gaussian Mixture Model produced models with lower error than the best models obtainable with local geographical features alone.
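The abstract above ranks techniques by root mean square error (RMSE) and mean absolute error (MAE). As a minimal sketch of how such a comparison could be computed, assuming each model's predictions are available as a flat list of values aligned with the observations, the following uses only the Python standard library. The function names `rmse`, `mae` and `rank_models` are hypothetical and do not come from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rank_models(y_true, predictions):
    """Return (name, rmse, mae) tuples sorted by ascending RMSE.

    `predictions` maps a model name (hypothetical labels chosen by the
    caller) to that model's predicted values for the same targets.
    """
    scored = [
        (name, rmse(y_true, y_pred), mae(y_true, y_pred))
        for name, y_pred in predictions.items()
    ]
    return sorted(scored, key=lambda row: row[1])
```

Calling `rank_models(observed, {"svr": svr_pred, "rf": rf_pred})` would list the lower-RMSE model first. The two metrics can disagree on the ranking, since RMSE penalizes large individual errors more heavily than MAE does.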
Machine learning (ML) has been utilized to predict climatic parameters, and many successes have been reported in the literature. In this paper, we scrutinize the effectiveness of five widely used ML algorithms in the monthly prediction of seasonal climatic parameters using monthly image data. Specifically, we quantify the predictive performance of these algorithms applied to five climatic parameters using various combinations of features. We compare the predictive accuracy of the resulting trained ML models to that of basic statistical estimators that are computed directly from the training data. Our results show that ML never significantly outperforms the statistical baseline, and underperforms for most feature sets. Unlike previous similar studies, we provide error bars for the relative performance of different predictors based on jackknife estimates applied to differences in predictive error magnitudes. We also show that the practice of shuffling data sequences, which was employed in some previous studies, leads to data leakage, resulting in over-estimated performance. Ultimately, the paper demonstrates the importance of using well-grounded statistical techniques when producing and analyzing the results of ML predictive models.
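The jackknife error bars mentioned above can be illustrated with a small sketch. It is not the paper's procedure, which the abstract does not specify. It assumes a delete-one jackknife applied to the mean paired difference of absolute errors between two predictors, and it treats the test points as independent. The function name `jackknife_mean_diff` is hypothetical.

```python
import math


def jackknife_mean_diff(errors_a, errors_b):
    """Delete-one jackknife for the mean difference in error magnitude.

    errors_a, errors_b: signed prediction errors of two predictors on the
    same test points (paired). Returns (estimate, standard_error), where
    estimate = mean(|e_a| - |e_b|); a negative value favors predictor A.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("error sequences must be paired")
    n = len(errors_a)
    if n < 2:
        raise ValueError("need at least two test points")

    # Per-point difference in absolute error.
    diffs = [abs(a) - abs(b) for a, b in zip(errors_a, errors_b)]
    total = sum(diffs)

    # Leave-one-out estimates of the mean difference.
    loo = [(total - d) / (n - 1) for d in diffs]
    loo_mean = sum(loo) / n

    # Standard jackknife variance: (n - 1) / n * sum of squared deviations.
    variance = (n - 1) / n * sum((x - loo_mean) ** 2 for x in loo)
    return total / n, math.sqrt(variance)
```

If the interval `estimate ± 2 * standard_error` contains zero, the two predictors are not clearly distinguishable on that test set. For sequential data, the independence assumption is itself questionable, so a blocked variant that deletes contiguous runs of months may be more appropriate.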
This paper provides a review of past approaches to the use of deep-learning frameworks for the analysis of discrete, irregular-patterned, complex sequential datasets. A typical example of such a dataset is financial data where specific events trigger sudden irregular changes in the sequence of the data. Traditional deep-learning methods perform poorly or even fail when trying to analyse these datasets. The results of a systematic literature review reveal the dominance of frameworks based on recurrent neural networks. The performance of deep-learning frameworks was found to be evaluated mainly using the mean absolute error and root mean square error metrics. The underlying challenges identified are: lack of performance robustness, non-transparency of the methodology, and internal and external architectural design and configuration issues. These challenges provide an opportunity to improve the framework for complex irregular-patterned sequential datasets.