“…For example: the weight initialization, batch size, number of epochs, learning rate, activation function, optimizer, loss function, network topology, etc. The optimizer selection study of [27] for brain tumor segmentation in magnetic resonance images (MRI) suggests that the choice of optimizer can be critical to the proposed approach. The authors of [27] listed 10 different state-of-the-art optimizers for CNNs, including: adaptive gradient (Adagrad), adaptive delta (AdaDelta), stochastic gradient descent (SGD), adaptive moment estimation (Adam), cyclic learning rate (CLR), the infinity-norm variant of Adam (Adamax), root mean square propagation (RMSProp), Nesterov-accelerated adaptive moment estimation (Nadam), and Nesterov accelerated gradient (NAG).…”
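To make the differences between these optimizers concrete, the following is a minimal pure-Python sketch of four of the listed update rules (SGD, NAG, RMSProp, and Adam) minimizing a toy one-dimensional quadratic. The objective, learning rates, and step counts are illustrative choices for this example, not values taken from [27].

```python
import math

def grad(w):
    # Gradient of the toy objective f(w) = (w - 3)^2, minimized at w = 3.
    return 2.0 * (w - 3.0)

def sgd(w, steps=200, lr=0.1):
    # Plain stochastic gradient descent (here full-batch on the toy objective).
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def nag(w, steps=200, lr=0.1, mu=0.9):
    # Nesterov accelerated gradient: gradient evaluated at a look-ahead point.
    v = 0.0
    for _ in range(steps):
        v = mu * v - lr * grad(w + mu * v)
        w += v
    return w

def rmsprop(w, steps=500, lr=0.01, rho=0.9, eps=1e-8):
    # RMSProp: scale the step by a moving average of squared gradients.
    s = 0.0
    for _ in range(steps):
        g = grad(w)
        s = rho * s + (1 - rho) * g * g
        w -= lr * g / (math.sqrt(s) + eps)
    return w

def adam(w, steps=500, lr=0.02, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: bias-corrected first and second moment estimates of the gradient.
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

for name, opt in [("SGD", sgd), ("NAG", nag), ("RMSProp", rmsprop), ("Adam", adam)]:
    print(f"{name}: w ~ {opt(0.0):.4f}")
```

All four runs converge to the neighborhood of the optimum w = 3; the adaptive methods (RMSProp, Adam) take near-constant-magnitude steps regardless of gradient scale, which is the property that motivates comparing them against SGD-style methods in studies such as [27].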