2022
DOI: 10.48550/arxiv.2211.08583
Preprint
Empirical Study on Optimizer Selection for Out-of-Distribution Generalization

Abstract: Modern deep learning systems are fragile and do not generalize well under distribution shifts. While much promising work has been accomplished to address these concerns, a systematic study of the role of optimizers and their out-of-distribution generalization performance has not been undertaken. In this study, we examine the performance of popular first-order optimizers for different classes of distributional shift under empirical risk minimization and invariant risk minimization. We address the problem setting…
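To make the comparison described in the abstract concrete, here is a minimal, hypothetical PyTorch sketch (not the paper's code): the synthetic environments, model, learning rate, and `penalty_weight` are illustrative assumptions, and the invariance term follows the common IRMv1 gradient-penalty formulation rather than any setup confirmed by the paper.

```python
# Sketch: compare first-order optimizers under ERM and an IRMv1-style penalty
# on toy environments whose spurious correlation flips at test time.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_env(n, flip_prob, seed):
    # Toy binary task: a noisy "core" feature plus a "spurious" feature whose
    # agreement with the label differs across environments (flip_prob varies).
    g = torch.Generator().manual_seed(seed)
    y = torch.randint(0, 2, (n, 1), generator=g).float()
    core = y + 0.5 * torch.randn(n, 1, generator=g)
    flip = (torch.rand(n, 1, generator=g) < flip_prob).float()
    spurious = torch.abs(y - flip)  # equals y except where flipped
    return torch.cat([core, spurious], dim=1), y

def irm_penalty(logits, y):
    # IRMv1 penalty: squared gradient of the per-environment risk with respect
    # to a dummy classifier scale fixed at 1.0 (Arjovsky et al., 2019).
    scale = torch.tensor(1.0, requires_grad=True)
    loss = F.binary_cross_entropy_with_logits(logits * scale, y)
    grad, = torch.autograd.grad(loss, [scale], create_graph=True)
    return grad.pow(2)

def train(opt_name, use_irm, envs, steps=500, penalty_weight=10.0):
    model = nn.Linear(2, 1)
    opt_cls = {"SGD": torch.optim.SGD, "Adam": torch.optim.Adam,
               "RMSprop": torch.optim.RMSprop}[opt_name]
    opt = opt_cls(model.parameters(), lr=1e-2)
    for _ in range(steps):
        losses, penalties = [], []
        for x, y in envs:
            logits = model(x)
            losses.append(F.binary_cross_entropy_with_logits(logits, y))
            if use_irm:
                penalties.append(irm_penalty(logits, y))
        loss = torch.stack(losses).mean()
        if use_irm:
            loss = loss + penalty_weight * torch.stack(penalties).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

train_envs = [make_env(1000, 0.1, seed=0), make_env(1000, 0.2, seed=1)]
test_env = make_env(1000, 0.9, seed=2)  # spurious correlation reversed at test time
for name in ["SGD", "Adam", "RMSprop"]:
    for use_irm in (False, True):
        model = train(name, use_irm, train_envs)
        with torch.no_grad():
            x, y = test_env
            acc = ((model(x) > 0) == y.bool()).float().mean().item()
        print(f"{name:8s} {'IRM' if use_irm else 'ERM'}: OOD accuracy {acc:.3f}")
```

Swapping only the optimizer while holding the objective and data fixed is the kind of controlled comparison the abstract describes; on this toy task the absolute numbers are meaningless, and only the relative ERM versus IRM behaviour under each optimizer is of interest.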

Cited by 1 publication (1 citation statement)
References 12 publications (20 reference statements)
“…Interestingly, the analysis also proves that the choice of optimization algorithm did not significantly impact the model's performance. Both Adam and RMSprop optimizers were used in training the architectures, but neither seemed to contribute significantly to the observed variations in accuracy or loss [52]. This finding implies that other factors, such as the architecture itself and the choice of learning rate, played a more substantial role in determining the model's performance.…”
Section: Discussion
Citation type: mentioning
Confidence: 98%