2020
DOI: 10.1214/19-aos1875

Nonparametric regression using deep neural networks with ReLU activation function


Cited by 334 publications (424 citation statements)
References 34 publications
“…This is a possible first step in theoretically understanding why deep learning is so successful empirically. Our work differs substantially from Schmidt‐Hieber (2019). First, our goal is not to demonstrate adaptation, and we do not study this property of deep nets, but focus on the common nonparametric case.…”
Section: Introduction (contrasting)
confidence: 70%
“…An important exception is the recent work of Schmidt‐Hieber (2019), who showed that a particular deep ReLU network with uniformly bounded weights attains the optimal rate in expected risk for squared loss. Further, Schmidt‐Hieber (2019) formally shows that deep neural networks can strictly improve on classical methods: if the unknown target function is itself a composition of simpler functions, then the composition‐based deep net estimator is provably superior to estimators that do not use compositions. This is a possible first step in theoretically understanding why deep learning is so successful empirically.…”
Section: Introduction (mentioning)
confidence: 99%
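The excerpt above concerns squared-loss regression with deep ReLU networks when the target is a composition of simpler functions. As a purely illustrative sketch (not the estimator analyzed by Schmidt-Hieber, and with all function names and hyperparameters chosen only for this example), the following snippet fits a one-hidden-layer ReLU network to a compositional target f(x) = g(h(x)) by gradient descent on the squared loss.

import numpy as np

rng = np.random.default_rng(0)

# Compositional target: f(x1, x2) = g(h(x1, x2)) built from simple inner/outer parts.
def f_true(X):
    h = np.sin(np.pi * X[:, 0]) * X[:, 1]   # inner function h
    return np.sqrt(np.abs(h))               # outer function g

n, d, width = 500, 2, 32
X = rng.uniform(0.0, 1.0, size=(n, d))
y = f_true(X) + 0.1 * rng.standard_normal(n)

# One-hidden-layer ReLU network: yhat = ReLU(X W1 + b1) W2 + b2.
W1 = 0.5 * rng.standard_normal((d, width))
b1 = np.zeros(width)
W2 = 0.5 * rng.standard_normal(width)
b2 = 0.0

lr = 0.05
for step in range(2000):
    Z = X @ W1 + b1              # pre-activations
    A = np.maximum(Z, 0.0)       # ReLU activation
    yhat = A @ W2 + b2
    gy = (yhat - y) / n          # gradient of 0.5 * mean squared error w.r.t. yhat
    gW2 = A.T @ gy
    gb2 = gy.sum()
    gA = np.outer(gy, W2)
    gZ = gA * (Z > 0.0)          # ReLU derivative is 1 on positive pre-activations
    gW1 = X.T @ gZ
    gb1 = gZ.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

print("final mean squared error:", float(np.mean((yhat - y) ** 2)))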
“…where fc_{t+l} is the output of the feature data after the fully connected operation in the l-th branch, and σ stands for the standard ReLU function [25]. Then, a softmax layer is used to normalize the output value fc_{t+l}, whose formula is given as,…”
Section: Users Trajectory Prediction (mentioning)
confidence: 99%
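This excerpt describes passing the output of a fully connected branch through a ReLU and then normalizing it with a softmax layer. A minimal sketch of that step, with illustrative variable names that are not taken from the cited paper:

import numpy as np

def relu(x):
    # standard ReLU: elementwise max(0, x)
    return np.maximum(x, 0.0)

def softmax(x):
    # subtract the max before exponentiating for numerical stability
    e = np.exp(x - np.max(x))
    return e / e.sum()

fc_out = np.array([1.2, -0.7, 0.3, 2.1])   # hypothetical fully connected branch output
activated = relu(fc_out)                    # sigma(fc_out) in the excerpt's notation
probs = softmax(activated)                  # normalized so the entries sum to 1
print(probs, probs.sum())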
“…The output layer has one node that is assigned the Vth in this work. The activation function used in the algorithm is the Rectified Linear Unit (ReLU), and is defined by the following equation [23,24,25]:…”
Section: Introduction (mentioning)
confidence: 99%
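The equation itself is cut off in this excerpt; the Rectified Linear Unit it refers to is conventionally defined as

ReLU(x) = max(0, x),

i.e. the identity for nonnegative inputs and zero for negative inputs.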