2020
DOI: 10.48550/arxiv.2006.05624
Preprint

Adjoined Networks: A Training Paradigm with Applications to Network Compression

Cited by 1 publication (1 citation statement)
References 17 publications
“…It would take them more than 3000 GPU days to achieve state-of-the-art performance on the ImageNet dataset. Most recent studies [2,29,31,53,55] encode architectures as a weight-sharing super-net and optimize the weights using gradient descent. Architectures found by NAS exhibit two significant advantages.…”
Section: Introduction
Confidence: 99%