“…The best model is often selected by minimizing an error function, such as mean square error (MSE), mean absolute deviation (MAD), or a cost function [51], or by relying on expert knowledge [52]. However, these measures do not perform equally, since each can favor or penalize certain characteristics of the data, and expert knowledge is not always easy to acquire. Approaches based on machine learning [53,54] and meta-learning [55][56][57][58][59] have therefore been reported in the literature. They offer the advantage of automating model selection through the parallel evaluation of multiple network architectures, but they are limited to certain architectures and are complex to implement. Related studies include Qi and Zhang [43], who investigate the well-known criteria AIC [60], BIC [61], root mean square error (RMSE), mean absolute percentage error (MAPE), and directional accuracy (DA).…”
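
As an illustration of how the criteria named above differ, the following minimal Python sketch computes them for a single set of forecasts. It is not taken from the cited works. The function name `selection_criteria` and its arguments are hypothetical. AIC and BIC are written in their Gaussian least-squares form, and DA is taken as the fraction of steps in which the forecast change has the same sign as the actual change; both are assumptions, and other definitions exist in the literature.

```python
import math


def selection_criteria(actual, predicted, n_params):
    """Compute common model-selection criteria for one candidate model.

    actual, predicted: equal-length sequences of floats (at least 2 values).
    n_params: number of free parameters of the model (e.g. network weights).
    Assumes no actual value is zero (needed for MAPE) and that the fit is
    not perfect (MSE > 0, needed for the logarithm in AIC and BIC).
    """
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mad = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    # Directional accuracy (assumed definition): share of steps where the
    # forecast moves in the same direction as the series, both measured
    # from the previous actual value.
    hits = sum(
        1
        for t in range(1, n)
        if (actual[t] - actual[t - 1]) * (predicted[t] - actual[t - 1]) > 0
    )
    da = hits / (n - 1)

    # Gaussian least-squares forms (assumed): the penalty grows with the
    # number of parameters, more strongly for BIC once n exceeds about 7.
    aic = n * math.log(mse) + 2 * n_params
    bic = n * math.log(mse) + n_params * math.log(n)

    return {
        "MSE": mse, "RMSE": rmse, "MAD": mad, "MAPE": mape,
        "DA": da, "AIC": aic, "BIC": bic,
    }


if __name__ == "__main__":
    # Toy data, for illustration only.
    observed = [10.0, 12.0, 11.0, 13.0]
    forecast = [11.0, 11.0, 12.0, 12.0]
    for name, value in selection_criteria(observed, forecast, 2).items():
        print(f"{name}: {value:.4f}")
```

Because MSE, RMSE, MAD, and MAPE ignore model size while AIC and BIC penalize it, and DA rewards only the sign of the change, two candidate models can be ranked differently depending on the criterion chosen, which is the difficulty the passage describes.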