“…Methods vary across these studies and include grid search [47,48], Bayesian methods [49,50,51] (one of which fails to report the specific approach [49]), trial and error [24,52], and unstated approaches, which likely indicate trial and error [53,54]. In six of these studies, only partial results are reported in relation to HPs [24,48,50,51,53,54]. For example, in an otherwise excellent paper [50], the authors present only the optimal values of the structural parameters tested and completely fail to report the effects of optimizing learning rate and learning rate decay.…”