“…The algorithm produces robust models while keeping the time spent on robust training almost equal to that of non-robust training. Instead of perturbing the dataset, we apply the Snapshot Ensemble [Huang et al., 2017a, Smith, 2017, Loshchilov and Hutter, 2016] along the training process to store multiple historical weights, in the manner of a Bayesian Neural Network [Blundell et al., 2015, Kingma et al., 2015, Ru et al., 2019, Zhang et al., 2021], so as to defend against adversarial attacks such as FGSM or PGD. The proposed method produces results as fast as non-robust networks, with only a few seconds' difference when trained on the CIFAR-10 [Krizhevsky et al., 2009] and MNIST [LeCun, 1998] datasets.…”
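
The idea of storing historical weights along a single training run can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a cyclic cosine-annealed learning rate (in the spirit of the cited Snapshot Ensemble and warm-restart work), a toy logistic model trained with plain SGD, and simple averaging of snapshot predictions. All names (`cosine_cyclic_lr`, `train_with_snapshots`, `ensemble_predict`) and the toy data are hypothetical, and neither the adversarial attacks nor the Bayesian treatment from the excerpt are modelled here.

```python
import math
import random


def cosine_cyclic_lr(step, steps_per_cycle, lr_max):
    """Cosine-annealed learning rate that restarts at lr_max every cycle."""
    t = step % steps_per_cycle
    return 0.5 * lr_max * (1.0 + math.cos(math.pi * t / steps_per_cycle))


def predict(weights, x):
    """Logistic model sigmoid(w . x + b); the bias is the last weight."""
    z = sum(w * xi for w, xi in zip(weights, x)) + weights[-1]
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_with_snapshots(data, n_cycles=3, steps_per_cycle=200,
                         lr_max=0.5, seed=0):
    """Run one SGD pass and keep a copy of the weights after each cycle."""
    rng = random.Random(seed)
    dim = len(data[0][0])
    weights = [0.0] * (dim + 1)
    snapshots = []
    for step in range(n_cycles * steps_per_cycle):
        lr = cosine_cyclic_lr(step, steps_per_cycle, lr_max)
        x, y = rng.choice(data)
        err = predict(weights, x) - y  # log-loss gradient w.r.t. the logit
        for i in range(dim):
            weights[i] -= lr * err * x[i]
        weights[-1] -= lr * err
        if (step + 1) % steps_per_cycle == 0:
            # End of a cycle: the learning rate is near zero, so store
            # the current weights as one ensemble member.
            snapshots.append(list(weights))
    return snapshots


def ensemble_predict(snapshots, x):
    """Average the predictions of all stored snapshots."""
    return sum(predict(w, x) for w in snapshots) / len(snapshots)


# Hypothetical toy data: label 1 roughly when x0 + x1 > 1.
toy_data = [
    ((0.0, 0.0), 0), ((0.2, 0.3), 0), ((0.1, 0.4), 0),
    ((1.0, 1.0), 1), ((0.9, 0.8), 1), ((0.7, 1.0), 1),
]

snapshots = train_with_snapshots(toy_data)
p_high = ensemble_predict(snapshots, (1.0, 1.0))
p_low = ensemble_predict(snapshots, (0.0, 0.0))
```

Because the snapshots are collected during a single training run, the extra cost over ordinary training is only the weight copies and the additional forward passes at prediction time, which is consistent with the excerpt's claim of near-identical training time.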