“…We explored various hyperparameters of the autoencoder, RNVP, BCLASS, WRN, and RNVP+OE models. Most hyperparameters were set through a preliminary search or were adopted from previous work [1], [3], [5], [6]: the autoencoder's bottleneck size, learning rate, maximum number of epochs, input size, and architecture details; the RNVP's input size, learning rate, coupling-layer size and count, and input masking; and the WRN's architecture and hyperparameters. For all experiments except AE and WRN+OE, we trained the model for 500 epochs and then selected the best-performing model on the validation set.…”
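The selection scheme described above (train for a fixed epoch budget, then keep the checkpoint that scored best on the validation set) can be sketched as follows. This is a minimal, framework-agnostic illustration; the helper names `train_one_epoch` and `validate` are assumptions for the sake of the example, not functions from the paper:

```python
from typing import Any, Callable, Tuple

def select_best_model(
    train_one_epoch: Callable[[int], Any],  # hypothetical: returns model state after an epoch
    validate: Callable[[Any], float],       # hypothetical: returns a validation score (higher is better)
    max_epochs: int = 500,                  # the paper's fixed epoch budget
) -> Tuple[Any, float]:
    """Train for max_epochs and return the state with the best validation score."""
    best_state, best_score = None, float("-inf")
    for epoch in range(max_epochs):
        state = train_one_epoch(epoch)
        score = validate(state)
        if score > best_score:
            best_state, best_score = state, score
    return best_state, best_score
```

With a metric where lower is better (e.g. validation loss), one would negate the score or flip the comparison accordingly.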