2020
DOI: 10.1155/2020/8844367

Training and Testing Data Division Influence on Hybrid Machine Learning Model Process: Application of River Flow Forecasting

Abstract: The hydrological process has a dynamic nature characterised by randomness and complex phenomena. The application of machine learning (ML) models in forecasting river flow has grown rapidly, owing to their capacity to simulate the complex phenomena associated with hydrological and environmental processes. Four different ML models were developed for forecasting river flow in a semiarid region of Iraq. The influence of data division on the ML modelling process was investigated. Three data d…

Cited by 31 publications (16 citation statements)
References 75 publications
“…Although success has been attained in the monthly evaporation using the GBM model during the training phase, it is essential to evaluate the proposed model with the testing dataset. As is well known, the training results may provide a misleading assessment because the model is trained using known inputs and their corresponding targets [65]. Besides, the testing phase is crucial in assessing the quality of predictive models, since it shows how well they generalize and avoid overfitting [66].…”
Section: Results (mentioning)
confidence: 99%
“…So, in this study, the starting point for selecting the decomposition level was taken from L, but since many seasonal characteristics may be embedded in hydrological signals, resolution levels 2-8 (L ± x) for the daily modeling and 2-5 (L ± x) for the monthly modeling were examined via the proposed WANN and WES models. These levels correspond, respectively, to the 2^2-day mode, 2^3-day mode (nearly weekly), 2^4-day mode (nearly semimonthly), 2^5-day mode (nearly monthly), 2^6-day mode, 2^7-day mode (nearly semiyearly), and 2^8-day mode (nearly yearly) on the daily scale, and to the 2^2-month, 2^3-month, 2^4-month, and 2^5-month modes on the monthly scale. Besides, the Daubechies 4 wavelet (db4), which has been frequently assessed in hydrological modeling, was considered as the mother wavelet in this study.…”
Section: Results (mentioning)
confidence: 99%
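
The link between a decomposition level and its time scale in the statement above is plain dyadic arithmetic: level j corresponds to a 2^j-step mode. A minimal Python sketch of that mapping follows. The function name `dyadic_scales` is hypothetical, and the sketch only tabulates the scales; it does not perform the wavelet transform or reproduce the WANN or WES models.

```python
def dyadic_scales(min_level, max_level):
    """Map each decomposition level j to its dyadic time scale 2**j.

    Hypothetical helper for illustration only: it tabulates scales and
    does not decompose any signal.
    """
    return {level: 2 ** level for level in range(min_level, max_level + 1)}


# Levels 2-8 for daily data and 2-5 for monthly data, as quoted above.
daily_scales = dyadic_scales(2, 8)
monthly_scales = dyadic_scales(2, 5)

for level, scale in daily_scales.items():
    print(f"level {level}: 2^{level} = {scale}-day mode")
```

Under these assumptions, level 3 maps to an 8-day mode (the "nearly weekly" mode in the statement) and level 5 to a 32-day mode (the "nearly monthly" mode).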
“…Forecasting streamflow has been investigated by several researchers [1][2][3][4][5] as it is a fundamental subject in hydrological modeling. As a result, many researchers are developing new models to improve streamflow modeling.…”
Section: Introduction (mentioning)
confidence: 99%
“…The models will be trained on the training set, and the fitted models will be used to estimate the predicted values in the test set, which provides an evaluation of the models. The splitting rate of the data set is selected with respect to the characteristics of the studied subjects (Tao et al 2020, Nguyen et al 2021) and the sample size (Tai et al 2019). In this study, considering that the lumber price does not fluctuate abnormally until the second half of 2020 and that there are thousands of samples, the splitting rate of the data set is set to 95 percent.…”
Section: Sample Splitting (mentioning)
confidence: 99%
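
The splitting step described in the statement above can be sketched in Python as follows. This is a hedged illustration, not the cited authors' procedure: it assumes the 95 percent refers to the training share and that the split is chronological (earliest observations for training, latest for testing), neither of which the quoted text confirms. The function name `chronological_split` and the placeholder data are hypothetical.

```python
def chronological_split(series, train_fraction=0.95):
    """Split an ordered series into leading training and trailing test parts.

    Assumption: train_fraction is the share of samples used for training.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = round(len(series) * train_fraction)  # index of the first test sample
    return series[:cut], series[cut:]


# Placeholder data standing in for a time-ordered record of 1000 samples.
observations = list(range(1000))
train, test = chronological_split(observations, 0.95)
print(len(train), len(test))
```

Keeping the test block at the end of the record avoids evaluating the fitted model on samples that precede its training data, which matters for forecasting tasks such as river flow or price series.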