“…Examples of training instability include a cost or reward function that prematurely becomes constant, or a cost function that suddenly begins to increase and keeps accelerating. Several remedies have been proposed: batch normalization [16], [63], [92], [123], [124], [128], [138], [192]; initializing the model parameters before training with Xavier initialization [126]–[128], [192]; modifying existing model architectures, for example with ResNet layers [92], [123] or the LeakyReLU activation function [48], [53], [77], [138]; and averaging parameter updates over multiple training cycles with momentum [120], [125], [181], [193].…”
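The stabilization techniques named above can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not taken from any of the cited works: the function names (`xavier_init`, `leaky_relu`, `batch_norm`, `momentum_step`) and all hyperparameter values are assumptions chosen for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Xavier (Glorot) initialization: weight variance is scaled by fan-in and
# fan-out so activation magnitudes stay roughly constant across layers.
def xavier_init(fan_in, fan_out):
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# LeakyReLU: a small negative slope avoids the "dead" units that a plain
# ReLU can produce, one reported source of training instability.
def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

# Batch normalization (training-mode sketch, no learned scale/shift):
# normalize each feature over the batch to zero mean and unit variance.
def batch_norm(x, eps=1e-5):
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

# SGD with momentum: the velocity term averages parameter updates over
# successive steps, damping oscillations in the loss.
def momentum_step(w, grad, velocity, lr=0.1, beta=0.9):
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity

# One forward pass through a single Xavier-initialized layer,
# followed by LeakyReLU and batch normalization.
W = xavier_init(64, 32)
h = batch_norm(leaky_relu(rng.normal(size=(128, 64)) @ W))
```

After `batch_norm`, each column of `h` has (numerically) zero mean and unit variance regardless of how the pre-activations were distributed, which is the normalization effect the cited works exploit.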