Abstract. A comprehensive data-driven modeling experiment is presented in a two-part paper. In this first part, an extensive data-driven modeling experiment is proposed. The most important concerns regarding the way data-driven modeling (DDM) techniques and data were handled, compared, and evaluated, and the basis on which findings and conclusions were drawn, are discussed. A concise review of key articles that presented comparisons among various DDM techniques is presented. Six DDM techniques, namely neural networks, genetic programming, evolutionary polynomial regression, support vector machines, M5 model trees, and K-nearest neighbors, are proposed and explained. Multiple linear regression and naïve models are also suggested as baselines for comparison with the various techniques. Five datasets from Canada and Europe representing evapotranspiration, upper and lower layer soil moisture content, and the rainfall-runoff process are described and proposed, in the second paper, for the modeling experiment. Twelve different realizations (groups) from each dataset are created by a procedure involving random sampling. Each group contains three subsets: training, cross-validation, and testing. Each modeling technique is proposed to be applied to each of the 12 groups of each dataset. This way, both prediction accuracy and uncertainty of the modeling techniques can be evaluated. The description of the datasets, the implementation of the modeling techniques, results and analysis, and the findings of the modeling experiment are deferred to the second part of this paper.
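The grouping procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `make_realizations`, the 50/25/25 split fractions, and the fixed seed are all assumptions, since the abstract states only that 12 groups with training, cross-validation, and testing subsets are created by random sampling.

```python
import random


def make_realizations(n_samples, n_groups=12, fractions=(0.5, 0.25, 0.25), seed=0):
    """Return n_groups random partitions of the indices 0..n_samples-1.

    Each partition holds three disjoint index lists: training,
    cross_validation, and testing. The split fractions are hypothetical;
    the abstract does not state them.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    rng = random.Random(seed)  # seeded so the groups are reproducible
    n_train = int(fractions[0] * n_samples)
    n_cv = int(fractions[1] * n_samples)
    groups = []
    for _ in range(n_groups):
        # Sampling every index once is a draw without replacement,
        # so the three subsets of a group never overlap.
        order = rng.sample(range(n_samples), n_samples)
        groups.append({
            "training": order[:n_train],
            "cross_validation": order[n_train:n_train + n_cv],
            "testing": order[n_train + n_cv:],
        })
    return groups
```

Fitting one model per group and comparing the testing scores across the 12 groups would then give both an average accuracy and a spread, which is one way to read the abstract's reference to evaluating prediction accuracy and uncertainty.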
In hydrological sciences there is an increasing tendency to explore and improve artificial neural network (ANN) and other data-driven forecasting models. Attempts to improve such models relate, to a large extent, to the recognized problems of their physical interpretation. The present paper deals with the problem of incorporating hydrological knowledge into the modelling process through the use of a modular architecture that takes into account the existence of various flow regimes. Three different partitioning schemes were employed: automatic classification based on clustering, temporal segmentation of the hydrograph based on an adapted baseflow separation technique, and an optimized baseflow separation filter. Three different model performance measures were analysed. Three case studies were considered. The modular models incorporating hydrological knowledge were shown to be more accurate than the traditional ANN-based models.
Abstract. In this second part of the two-part paper, the data-driven modeling (DDM) experiment, presented and explained in the first part, is implemented. Inputs for the five case studies (half-hourly actual evapotranspiration, daily peat soil moisture, daily till soil moisture, and two daily rainfall-runoff datasets) are identified, either based on previous studies or using the mutual information content. Twelve groups (realizations) were generated from each dataset by randomly sampling without replacement from the original dataset. Neural networks (ANNs), genetic programming (GP), evolutionary polynomial regression (EPR), support vector machines (SVM), M5 model trees (M5), K-nearest neighbors (K-nn), and multiple linear regression (MLR) techniques are implemented and applied to each of the 12 realizations of each case study. The predictive accuracy and uncertainties of the various techniques are assessed using multiple average overall error measures, scatter plots, frequency distribution of model residuals, and the deterioration rate of prediction performance during the testing phase. The Gamma test is used as a guide to assist in selecting the appropriate modeling technique. Unlike in the two nonlinear soil moisture case studies, the results of the experiment conducted in this research study show that ANNs were a sub-optimal choice for the actual evapotranspiration and the two rainfall-runoff case studies. GP is the most successful technique due to its ability to adapt the model complexity to the modeled data. EPR performance could be close to GP with datasets that are more linear than nonlinear. SVM is sensitive to the kernel choice, and its performance can improve if the kernel is appropriately selected. M5 performs very well with linear and semi-linear data, which cover a wide range of hydrological situations. In highly nonlinear case studies, ANNs, K-nn, and GP could be more successful than other modeling techniques. K-nn is also successful in linear situations, and it should not be ignored as a potential modeling technique for hydrological applications.
Correspondence to: A. Elshorbagy (amin.elshorbagy@usask.ca)
Abstract. The recent concerns for world-wide extreme events related to climate change have motivated the development of large scale models that simulate the global water cycle. In this context, analysis of hydrological extremes is important and requires the adaptation of identification methods used for river basin models. This paper presents two methodologies that extend the tools to analyze spatio-temporal drought development and characteristics using large scale gridded time series of hydrometeorological data. The methodologies are classified as non-contiguous and contiguous drought area analyses (i.e. NCDA and CDA). The NCDA presents time series of percentages of areas in drought at the global scale and for pre-defined regions of known hydroclimatology. The CDA is introduced as a complementary method that generates information on the spatial coherence of drought events at the global scale. Spatial drought events are found through CDA by clustering patterns (contiguous areas). In this study the global hydrological model WaterGAP was used to illustrate the methodology development. Global gridded time series of subsurface runoff (resolution 0.5°) simulated with the WaterGAP model from land points were used. The NCDA and CDA were developed to identify drought events in runoff. The percentages of area in drought calculated with both methods show complementary information on the spatial and temporal events for the last decades of the 20th century. The NCDA provides relevant information on the average number of droughts, duration and severity (deficit volume) for predefined regions (globe, 2 selected hydroclimatic regions). Additionally, the CDA provides information on the number
Characterizing the response of a catchment to rainfall, in terms of the production of runoff vs the interception, transpiration and evaporation of water, is the first important step in understanding water resource availability in a catchment. This is particularly important in small semi-arid catchments, where a few intense rainfall events may generate much of the season's runoff. The ephemeral Zhulube catchment (30 km²) in the northern Limpopo basin was instrumented and modelled in order to elucidate the dominant hydrological processes. Discharge events were disconnected, with short recession curves, probably caused by the shallow soils in the Tshazi sub-catchment, which dry out rapidly, and the presence of a dambo in the Gobalidanke sub-catchment. Two different flow event types were observed, with the larger floods showing longer recessions being associated with higher (antecedent) precipitation. The differences could be related to: (a) intensity of rainfall, or (b) different soil conditions. Interception is an important process in the water balance of the catchment, accounting for an estimated 32% of rainfall in the 2007/08 season, but as much as 56% in the drier 2006/07 season. An extended version of the HBV model was developed (designated HBVx), introducing an interception storage and with all routines run in semi-distributed mode. After extensive manual calibration, the HBVx simulation satisfactorily showed the disconnected nature of the flows. The generally low Nash-Sutcliffe coefficients can be explained by the model failing to simulate the two different observed flow types differently. The importance of incorporating interception into rainfall-runoff is demonstrated by the substantial improvement in objective function values obtained. This exceeds the gains made by changing from lumped to semi-distributed mode, supported by 1 000 000 Monte Carlo simulations. There was also an important improvement in the daily volume error.
The best simulation, supported by field observations in the Gobalidanke sub-catchment, suggested that discharge was driven mainly by flow from saturation overland flow. Hortonian overland flow, as interpreted from field observations in the Tshazi sub-catchment, was not simulated so well. A limitation of the model is its inability to address temporal variability in soil characteristics and more complex runoff generation processes. The model suggests episodic groundwater recharge with annual recharge of 100 mm year⁻¹, which is similar to that reported by other studies in Zimbabwe.
Key words: HBV; interception; Limpopo basin; semi-arid hydrology
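For readers unfamiliar with the Nash-Sutcliffe coefficient mentioned in this abstract, the sketch below computes it in its standard form: one minus the ratio of the squared simulation error to the variance of the observations about their mean. The function name `nash_sutcliffe` and the plain-list inputs are illustrative assumptions; the abstract does not say how the authors computed their objective functions.

```python
def nash_sutcliffe(observed, simulated):
    """Standard Nash-Sutcliffe efficiency for paired discharge series.

    A value of 1 means a perfect fit; 0 means the simulation is no
    better than predicting the mean of the observations.
    """
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Sum of squared differences between observation and simulation
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their mean
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed series has zero variance")
    return 1.0 - residual / variance
```

Because the squared errors are dominated by the largest flows, a model that handles two flow event types identically can score poorly even when the timing of events is reproduced, which is consistent with the explanation the abstract gives for its low coefficients.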