2021
DOI: 10.1016/j.neunet.2021.02.012

Fast convergence rates of deep neural networks for classification

Cited by 42 publications (55 citation statements)
References 19 publications
“…When q = 0 (high noise), the convergence rate with respect to the sample size is n^{−α/(0.75d+2α)}, which is exactly the same as in the Modified Logistic or exponential examples; when q = +∞ (no noise), the rate will be significantly improved to n^{−1/2}. A similar result can be found in Theorem 3.3 of Kim et al. (2021).…”
Section: Example 4: SVM (supporting)
confidence: 83%
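A quick arithmetic check of why the q = +∞ rate is indeed an improvement (this remark is not part of the quoted excerpt): for any input dimension d > 0 and smoothness α > 0,

    α / (0.75d + 2α) < α / (2α) = 1/2,

so the noiseless exponent n^{−1/2} strictly dominates the high-noise exponent n^{−α/(0.75d+2α)}, and the gap widens as d grows.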
“…When the inputs are assumed to be uniformly distributed on the surface of a sphere, Kalai et al. (2008) derived non-asymptotic bounds for efficient binary prediction with half-spaces by minimizing the misclassification error rate directly. Kim et al. (2021) studied the excess risk of the empirical risk minimizer for classification under the hinge loss (SVM) using deep neural networks. They aimed to establish the convergence rate under the Tsybakov noise condition (Mammen and Tsybakov, 1999; Tsybakov, 2004) in three different cases: smooth decision boundary, smooth conditional class probability η, and margin conditions.…”
Section: Error Bounds in Regression and Classification (mentioning)
confidence: 99%
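For reference (not part of the quoted excerpt), the Tsybakov noise condition mentioned above is commonly stated as follows: there exist constants C > 0, t₀ > 0 and a noise exponent q ≥ 0 such that

    P( |η(X) − 1/2| ≤ t ) ≤ C t^q   for all 0 < t ≤ t₀,

where η(x) = P(Y = 1 | X = x) is the conditional class probability. Larger q means less probability mass near the decision boundary {η = 1/2}, which is why the rates above improve as q grows; q = 0 imposes essentially no restriction, while q = +∞ corresponds to the noiseless case in which η is bounded away from 1/2.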
“…Schmidt-Hieber (2020) shows minimax-optimality of s-sparse neural networks for regression over Hölder classes, where at most s = O(n log n) network weights are nonzero and n is the number of training samples. Kim et al. (2021) extends the results of Schmidt-Hieber (2020) to the classification setting, remarking that effective optimization under a sparsity constraint is lacking. Kohler & Langer (2020) and Langer (2021) proved minimax-optimality without the sparsity assumption, albeit in an underparametrized setting.…”
Section: Related Work (supporting)
confidence: 57%
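To make the sparsity constraint discussed above concrete, here is a minimal, hypothetical sketch (an illustration only, not the estimators analyzed in the cited papers) of what enforcing "at most s nonzero weights" looks like for a small PyTorch network, via a hard magnitude projection onto the s-sparse parameter set:

    import torch
    import torch.nn as nn

    def project_to_s_sparse(parameters, s):
        # Hard projection onto the s-sparse set: keep the s largest-magnitude
        # parameters of the whole network and zero out the rest
        # (exact ties may leave slightly more than s nonzero entries).
        params = list(parameters)
        flat = torch.cat([p.detach().abs().flatten() for p in params])
        if s >= flat.numel():
            return  # already feasible, nothing to zero out
        threshold = torch.topk(flat, s).values.min()
        with torch.no_grad():
            for p in params:
                p.mul_((p.abs() >= threshold).to(p.dtype))

    # Example: a small ReLU network projected to roughly 100 nonzero parameters.
    net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))
    project_to_s_sparse(net.parameters(), s=100)
    nonzero = sum((p != 0).sum().item() for p in net.parameters())
    print("nonzero parameters after projection:", nonzero)

The hard part, which the next excerpt alludes to, is choosing which s entries to keep; the greedy magnitude rule above is only a heuristic for that combinatorial choice.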
“…Our result is the first adaptive optimal one for fully connected (i.e., non-sparse) neural network models. Many of the related studies showing minimax optimality of neural network estimators assumed sparsity of the neural networks (in both adaptive and nonadaptive fashions) to control the estimation variance while maintaining high approximation ability, for example, Schmidt-Hieber [2020], Suzuki [2019], Imaizumi and Fukumizu [2020], Ohn and Kim [2020] and Kim et al. [2021]. However, finding an optimal sparse neural network is computationally expensive, since we need to explore all possible zero-one configurations of the network parameters.…”
Section: Oracle Contraction Rate for Regression Function Estimation (mentioning)
confidence: 99%
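As a rough illustration of the combinatorial cost mentioned above (a back-of-the-envelope calculation added here, not taken from the quoted paper): a network with p parameters admits C(p, s) ≥ (p/s)^s zero-one sparsity patterns with exactly s nonzero weights, and already for p = 10^6 and s = 10^3 this exceeds 10^{3000}, so exhaustive search over configurations is infeasible and sparsity has to be induced indirectly, e.g., by penalization or pruning.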