2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA)
DOI: 10.1109/dsaa.2019.00017

Residual Networks Behave Like Boosting Algorithms

Abstract: We show that Residual Networks (ResNet) are equivalent to boosting feature representations, without any modification to the underlying ResNet training algorithm. A regret bound based on Online Gradient Boosting theory is proved, suggesting that ResNet can achieve Online Gradient Boosting regret bounds through architectural changes alone: adding a shrinkage parameter to the identity skip-connections and using residual modules with max-norm bounds. Through this relation between ResNet and …
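A minimal sketch (assumed PyTorch; not the authors' code) of the architectural change the abstract describes: a residual block whose identity skip-connection carries an explicit shrinkage parameter, with the residual module kept under a max-norm bound via weight clipping. The class name, the clipping scheme, and the default values of gamma and max_norm are illustrative assumptions.

```python
# Illustrative sketch only (assumed PyTorch, not the paper's code): a residual
# block with a shrinkage parameter "gamma" on the identity skip-connection and
# a max-norm bound on the residual module, approximated here by weight clipping.
import torch
import torch.nn as nn


class ShrinkageResidualBlock(nn.Module):
    def __init__(self, dim: int, gamma: float = 0.9, max_norm: float = 1.0):
        super().__init__()
        self.gamma = gamma        # shrinkage on the identity skip-connection
        self.max_norm = max_norm  # bound on the residual module's weight norms
        self.residual = nn.Sequential(
            nn.Linear(dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def clip_weights(self) -> None:
        # One way to realize "residual modules with max-norm bounds":
        # rescale any weight matrix whose norm exceeds the bound.
        with torch.no_grad():
            for p in self.residual.parameters():
                if p.dim() > 1 and p.norm() > self.max_norm:
                    p.mul_(self.max_norm / p.norm())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Boosting view: each block adds a weak-learner correction to the
        # shrunken identity path.
        return self.gamma * x + self.residual(x)
```

Stacking L such blocks telescopes into gamma^L times the input plus a weighted sum of the residual modules' outputs, which is the ensemble-of-weak-learners reading of ResNet that the paper formalizes.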

Cited by 7 publications (5 citation statements); references 14 publications.
“…Literature shows that the boosting concept is the backbone behind well-known architectures like Deep Residual Networks (He et al., 2016; Siu, 2019) and AdaNet (Cortes et al., 2017). The theoretical background for the success of Deep Residual Networks (DeepResNet) (He et al., 2016) was explained in the context of boosting theory (Huang et al., 2018).…”
Section: Boosting (mentioning)
confidence: 99%
“…As shown in Figure 4, the residual connection is a simple shortcut connection structure that connects the inputs and outputs of the GRU layer. Existing studies show that residual networks behave like boosting algorithms, where the main idea is to combine different weak learners into a stronger learner. 30 We argue that the residual connection in GRRLN would help to boost the statement feature extraction ability of the basic GRU networks.…”
Section: Methodology Overview (mentioning)
confidence: 87%
“…Existing studies show that residual networks behave like boosting algorithms, where the main idea is to combine different weak learners into a stronger learner. 30 We argue that the residual connection in GRRLN would help to boost the statement feature extraction ability of the basic GRU networks. Intuitively, the residual connection will retain most features in the data, thus forcing the GRU model to focus more on the different features of the source code.…”
Section: GRU Layer With Residual Connection (mentioning)
confidence: 96%
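A minimal sketch (assumed PyTorch; not the GRRLN authors' implementation) of the shortcut structure these excerpts describe: the GRU layer's input is added back to its output, so the recurrent layer only has to learn a residual correction over the statement features. The class name and the matching input/hidden sizes are assumptions made for illustration.

```python
# Illustrative sketch (assumed PyTorch), not the GRRLN code: a GRU layer with a
# residual (shortcut) connection from its input to its output.
import torch
import torch.nn as nn


class ResidualGRU(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # Input and hidden sizes match so the shortcut can be a plain addition.
        self.gru = nn.GRU(input_size=dim, hidden_size=dim, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.gru(x)  # (batch, seq_len, dim)
        return x + out        # shortcut retains input features; GRU learns the residual
```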
“…To make BIER more robust, Hierarchical Boosted deep metric learning [69] … Literature shows that the boosting concept is the backbone behind well-known architectures like Deep Residual Networks [13, 72] and AdaNet [73]. The theoretical background for the success of Deep Residual Networks (DeepResNet) [13] was explained in the context of boosting theory [74].…”
Section: Boosting (mentioning)
confidence: 99%