Abstract: This paper aims to theoretically analyze the complexity of feature transformations encoded in DNNs with ReLU layers. We propose information-theoretic metrics to measure three types of transformation complexity. We further discover and prove a strong correlation between the complexity and the disentanglement of transformations. Based on the proposed metrics, we analyze two typical phenomena in how the transformation complexity changes during training, and explore the ceiling of a…
“…To this end, we learned four types of ReLU networks on the MNIST dataset via adversarial training. We followed settings in [19] to construct five MLPs, five CNNs, three MLPs with skip connections, and three CNNs with skip connections, respectively. Experimental results show that the average κ over all sixteen networks was 0.097, which verified the correctness of Theorem 3.…”
Section: Explaining the Difficulty of Adversarial Training
Citation type: mentioning (confidence: 99%)
“…To this end, we learned four types of ReLU networks, including MLPs, CNNs, MLPs with skip connections (namely ResMLP), and CNNs with skip connections (namely ResCNN), on the MNIST dataset [11] via adversarial training. Here, we followed settings in [19] to construct five different MLPs, which consisted of 1, 2, 3, 4, 5 fully-connected (FC) layers, respectively. Each FC layer contained 200 neurons.…”
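For concreteness, the five MLPs described in the quoted setup can be built as below. This is a minimal PyTorch sketch, not the authors' code: the depths (1–5 FC layers) and the width (200 neurons) come from the quote, while the flattened 28×28 MNIST input, the 10-class output head, and the assumption that the counted FC layers are hidden layers followed by a separate output layer are illustrative guesses.

```python
import torch.nn as nn

def make_mlp(num_fc_layers: int, width: int = 200,
             in_dim: int = 28 * 28, num_classes: int = 10) -> nn.Sequential:
    """Build a ReLU MLP with `num_fc_layers` hidden FC layers of `width`
    neurons each, matching the quoted setup (1-5 layers, 200 neurons)."""
    layers = [nn.Flatten()]  # MNIST images flattened to 784-dim vectors
    dim = in_dim
    for _ in range(num_fc_layers):
        layers += [nn.Linear(dim, width), nn.ReLU()]
        dim = width
    # Assumption: a separate linear output layer maps to the 10 MNIST classes.
    layers.append(nn.Linear(dim, num_classes))
    return nn.Sequential(*layers)

# The five MLPs used in the quoted experiment differ only in depth.
mlps = [make_mlp(depth) for depth in range(1, 6)]
```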
This paper mathematically derives an analytic solution for the adversarial perturbation on a ReLU network, and theoretically explains the difficulty of adversarial training. Specifically, we formulate the dynamics of the adversarial perturbation generated by a multi-step attack, which shows that the perturbation tends to strengthen eigenvectors corresponding to a few top-ranked eigenvalues of the Hessian matrix of the loss w.r.t. the input. We also prove that adversarial training exponentially strengthens the influence of unconfident input samples with large gradient norms. Furthermore, we find that adversarial training strengthens the influence of the Hessian matrix of the loss w.r.t. network parameters, which makes the training more likely to oscillate along the directions of a few samples and increases the difficulty of adversarial training. Crucially, our proofs provide a unified explanation for previous findings in understanding adversarial training [13,
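The multi-step attack whose dynamics the abstract refers to is, in spirit, an iterative gradient ascent on the loss within a norm ball. Below is a hedged PGD-style sketch in PyTorch, not the paper's exact formulation; the L_inf projection and all hyperparameters (eps, step_size, num_steps) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def multi_step_attack(model, x, y, eps=0.3, step_size=0.01, num_steps=40):
    """Generic multi-step (PGD-style) L_inf attack on a classifier.
    Illustrates the kind of iterative perturbation whose dynamics the
    paper analyzes; hyperparameters are not taken from the paper."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(num_steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        # Gradient ascent on the loss: repeated steps tend to amplify the
        # components of delta along top eigenvectors of the input Hessian.
        delta = (delta + step_size * grad.sign()).clamp(-eps, eps).detach()
        delta.requires_grad_(True)
    # Clamp to [0, 1] so perturbed MNIST pixels stay in the valid range.
    return (x + delta).clamp(0, 1).detach()
```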