Abstract: This paper aims to theoretically analyze the complexity of feature transformations encoded in DNNs with ReLU layers. We propose information-theoretic metrics to measure three types of transformation complexity. We further discover and prove a strong correlation between the complexity and the disentanglement of transformations. Based on the proposed metrics, we analyze two typical phenomena in how the transformation complexity changes during training, and explore the ceiling of a…
“…To this end, we learned four types of ReLU networks on the MNIST dataset via adversarial training. We followed settings in [19] to construct five MLPs, five CNNs, three MLPs with skip connections, and three CNNs with skip connections, respectively. Experimental results show that the average κ over all sixteen networks was 0.097, which verified the correctness of Theorem 3.…”
Section: Explaining the Difficulty of Adversarial Training
Citation type: mentioning (confidence: 99%)
“…To this end, we learned four types of ReLU networks, including MLPs, CNNs, MLPs with skip connections (namely ResMLP), and CNNs with skip connections (namely ResCNN), on the MNIST dataset [11] via adversarial training. Here, we followed settings in [19] to construct five different MLPs, which consisted of 1, 2, 3, 4, 5 fully-connected (FC) layers, respectively. Each FC layer contained 200 neurons.…”
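For concreteness, the five MLPs described in the quoted setup can be built as below. This is a minimal PyTorch sketch, not the authors' code: the depths (1–5 FC layers) and the width (200 neurons) come from the quote, while the flattened 28×28 MNIST input, the 10-class output head, and the assumption that the counted FC layers are hidden layers followed by a separate output layer are illustrative guesses.

```python
import torch.nn as nn

def make_mlp(num_fc_layers: int, width: int = 200,
             in_dim: int = 28 * 28, num_classes: int = 10) -> nn.Sequential:
    """Build a ReLU MLP with `num_fc_layers` hidden FC layers of `width`
    neurons each, matching the quoted setup (1-5 layers, 200 neurons)."""
    layers = [nn.Flatten()]  # MNIST images flattened to 784-dim vectors
    dim = in_dim
    for _ in range(num_fc_layers):
        layers += [nn.Linear(dim, width), nn.ReLU()]
        dim = width
    # Assumption: a separate linear output layer maps to the 10 MNIST classes.
    layers.append(nn.Linear(dim, num_classes))
    return nn.Sequential(*layers)

# The five MLPs used in the quoted experiment differ only in depth.
mlps = [make_mlp(depth) for depth in range(1, 6)]
```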
This paper mathematically derives an analytic solution for the adversarial perturbation on a ReLU network, and theoretically explains the difficulty of adversarial training. Specifically, we formulate the dynamics of the adversarial perturbation generated by a multi-step attack, which shows that the perturbation tends to strengthen eigenvectors corresponding to a few top-ranked eigenvalues of the Hessian matrix of the loss w.r.t. the input. We also prove that adversarial training exponentially strengthens the influence of unconfident input samples with large gradient norms. Furthermore, we find that adversarial training strengthens the influence of the Hessian matrix of the loss w.r.t. network parameters, which makes the training more likely to oscillate along the directions of a few samples and increases the difficulty of adversarial training. Crucially, our proofs provide a unified explanation for previous findings in understanding adversarial training [13,
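The multi-step attack whose dynamics the abstract refers to is, in spirit, an iterative gradient ascent on the loss within a norm ball. Below is a hedged PGD-style sketch in PyTorch, not the paper's exact formulation; the L_inf projection and all hyperparameters (eps, step_size, num_steps) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def multi_step_attack(model, x, y, eps=0.3, step_size=0.01, num_steps=40):
    """Generic multi-step (PGD-style) L_inf attack on a classifier.
    Illustrates the kind of iterative perturbation whose dynamics the
    paper analyzes; hyperparameters are not taken from the paper."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(num_steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        # Gradient ascent on the loss: repeated steps tend to amplify the
        # components of delta along top eigenvectors of the input Hessian.
        delta = (delta + step_size * grad.sign()).clamp(-eps, eps).detach()
        delta.requires_grad_(True)
    # Clamp to [0, 1] so perturbed MNIST pixels stay in the valid range.
    return (x + delta).clamp(0, 1).detach()
```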