2019
DOI: 10.48550/arxiv.1903.01997
Preprint

Implicit Regularization in Over-parameterized Neural Networks

Abstract: Over-parameterized neural networks generalize well in practice without any explicit regularization. Although this has not yet been proven, empirical evidence suggests that implicit regularization plays a crucial role in deep learning and prevents the network from overfitting. In this work, we introduce the gradient gap deviation and the gradient deflection as statistical measures corresponding to the network curvature and the Hessian matrix to analyze variations of network derivatives with respect to input param…
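The abstract is truncated here, and the paper's exact definitions of gradient gap deviation and gradient deflection are not reproduced above. As a loose, non-authoritative sketch of the general idea of tracking how a network's derivative with respect to its input varies, the NumPy snippet below probes the input gradient of a small ReLU network along a straight path between two inputs; the function names and the specific deviation formula are assumptions for illustration, not the paper's definitions.

```python
import numpy as np

# Loose sketch only: probe how the input gradient of a tiny ReLU network
# changes along a straight path between two inputs. Large deviations from
# the endpoint-average prediction indicate high curvature of the learned
# function. This is NOT the paper's definition of gradient gap deviation.

rng = np.random.default_rng(0)
d, h = 10, 64                                  # input dim, hidden width
W1 = rng.normal(size=(h, d)) / np.sqrt(d)
b1 = np.zeros(h)
w2 = rng.normal(size=h) / np.sqrt(h)

def input_grad(x):
    # df/dx for a one-hidden-layer ReLU network f(x) = w2 . relu(W1 x + b1)
    active = (W1 @ x + b1 > 0).astype(float)
    return (w2 * active) @ W1

x0, x1 = rng.normal(size=d), rng.normal(size=d)
ts = np.linspace(0.0, 1.0, 21)
grads = np.array([input_grad((1 - t) * x0 + t * x1) for t in ts])

# If f were linear along the path, every gradient would equal the average of
# the endpoint gradients; measure the mean deviation from that prediction.
endpoint_avg = 0.5 * (grads[0] + grads[-1])
gap_deviation = np.linalg.norm(grads - endpoint_avg, axis=1).mean()
print("mean gradient deviation along the path:", gap_deviation)
```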

Cited by 6 publications (7 citation statements)
References: 40 publications
“…More recently, often motivated by neural networks, there has been work on implicit regularization that typically considers SGD-based optimization algorithms. See, e.g., theoretical results on simplified models (Neyshabur et al., 2014; Neyshabur, 2017; Soudry et al., 2018; Gunasekar et al., 2017; Arora et al., 2019; Kubo et al., 2019) as well as extensive empirical and phenomenological results on state-of-the-art neural network models (Martin and Mahoney, 2018, 2019). The implicit regularization we observe is different in that it is not caused by an inexact approximation algorithm (such as SGD) but rather by the selection of one out of many exact solutions (e.g., the minimum norm solution).…”
Section: Related Work (mentioning)
confidence: 99%
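As a concrete, hedged illustration of the "selection of one out of many exact solutions" mechanism mentioned in the statement above (a minimal NumPy sketch, not code from either paper): for an under-determined least-squares problem, plain gradient descent started from zero converges to the minimum-norm interpolating solution even though no explicit regularizer is present.

```python
import numpy as np

# Minimal sketch: gradient descent from zero on an under-determined
# least-squares problem picks the minimum-norm interpolant (the
# pseudoinverse solution) out of the infinitely many exact solutions.

rng = np.random.default_rng(1)
n, d = 20, 100                               # fewer equations than unknowns
A, y = rng.normal(size=(n, d)), rng.normal(size=n)

w = np.zeros(d)
lr = 1.0 / np.linalg.norm(A, 2) ** 2         # step size at 1 / L
for _ in range(20000):
    w -= lr * A.T @ (A @ w - y)              # gradient of 0.5 * ||A w - y||^2

w_min_norm = np.linalg.pinv(A) @ y
print("training residual:", np.linalg.norm(A @ w - y))
print("distance to min-norm solution:", np.linalg.norm(w - w_min_norm))
```

The updates stay in the row space of A, which is why the iterates can only converge to the minimum-norm interpolant; this is the kind of solution-selection effect the quoted passage contrasts with SGD-induced regularization.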
“…Remark 4.2. The concept of (implicit) regularization has been adopted in many recent studies on nonconvex optimization, including training neural networks (Allen-Zhu et al., 2018; Kubo et al., 2019), phase retrieval (Chen and Candes, 2015; Ma et al., 2017), matrix completion (Chen and Wainwright, 2015; Zheng and Lafferty, 2016), and blind deconvolution (Li et al., 2019), referring to any scheme that biases the search direction of gradient-based algorithms. Implicit regularization has been advocated as an important feature of (stochastic) gradient descent methods for solving these problems; as the name suggests, it means that algorithms without explicit regularization may behave as if they were regularized.…”
Section: Implicit Regularization (mentioning)
confidence: 99%
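To make the last sentence of the quoted remark concrete (a hedged NumPy sketch, not taken from any of the cited papers): on linearly separable data, gradient descent on the unregularized logistic loss lets the weight norm grow without bound, yet the direction of the weight vector stabilizes, so the unregularized algorithm behaves as if a margin-maximizing bias were built in (the setting analyzed by Soudry et al., 2018).

```python
import numpy as np

# Sketch: unregularized logistic regression on separable data. ||w|| keeps
# growing, but w / ||w|| converges, i.e. the algorithm behaves as if it
# were regularized toward a particular (large-margin) direction.

rng = np.random.default_rng(2)
n, d = 200, 2
X_pos = rng.normal(size=(n, d)) + 2.0        # positive cloud
X_neg = rng.normal(size=(n, d)) - 2.0        # negative cloud
X = np.vstack([X_pos, X_neg])
y = np.concatenate([np.ones(n), -np.ones(n)])

w = np.zeros(d)
lr = 0.1
for step in range(1, 50001):
    margins = y * (X @ w)
    # gradient of the mean logistic loss  mean(log(1 + exp(-margins)))
    grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y)
    w -= lr * grad
    if step in (100, 1000, 10000, 50000):
        print(f"step {step:>6}: ||w|| = {np.linalg.norm(w):7.3f}, "
              f"direction = {w / np.linalg.norm(w)}")
```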
“…These techniques improve generalization by preventing fine-tuning in a pre-determined DNN architecture [7]. Regularization techniques have been studied from different perspectives [10][11][12]. Accordingly, regularization techniques have two main effects: explicit and implicit.…”
Section: Related Work on Regularization (mentioning)
confidence: 99%
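As a small, hedged illustration of the explicit/implicit distinction drawn above (a sketch of a standard least-squares correspondence, not code from the cited works): an explicit regularizer adds a penalty term to the objective, while an implicit effect comes from the training procedure itself, for example stopping gradient descent early, which on least squares yields shrinkage roughly comparable to ridge regression with lambda ≈ 1/(lr·k).

```python
import numpy as np

# Explicit vs. implicit regularization on a least-squares toy problem.
# Explicit: ridge penalty added to the loss. Implicit: early-stopped
# gradient descent with no penalty. The heuristic lambda ~ 1/(lr*k) makes
# the two produce roughly comparable shrinkage (only approximately).

rng = np.random.default_rng(3)
n, d = 80, 40
A, y = rng.normal(size=(n, d)), rng.normal(size=n)

lam = 1.0
w_ridge = np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ y)   # explicit

lr = 1.0 / np.linalg.norm(A, 2) ** 2
k = int(1.0 / (lr * lam))                    # heuristic early-stopping time
w_gd = np.zeros(d)
for _ in range(k):
    w_gd -= lr * A.T @ (A @ w_gd - y)        # implicit: no penalty, stop early

print("||w_ridge||:", np.linalg.norm(w_ridge))
print("||w_gd||   :", np.linalg.norm(w_gd), f"(after {k} steps)")
print("relative difference:", np.linalg.norm(w_gd - w_ridge) / np.linalg.norm(w_ridge))
```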