2020
DOI: 10.48550/arxiv.2003.13630
Preprint

TResNet: High Performance GPU-Dedicated Architecture

Abstract: Many deep learning models, developed in recent years, reach higher ImageNet accuracy than ResNet50, with fewer or comparable FLOPs count. While FLOPs are often seen as a proxy for network efficiency, when measuring actual GPU training and inference throughput, vanilla ResNet50 is usually significantly faster than its recent competitors, offering a better throughput-accuracy trade-off. In this work, we introduce a series of architecture modifications that aim to boost neural networks' accuracy, while retaining the…
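The abstract's central claim is that measured GPU throughput, not FLOPs, is the right efficiency metric. Below is a minimal, hedged sketch of such a throughput measurement in PyTorch, assuming the timm model zoo's names for ResNet50 and TResNet-L ("resnet50", "tresnet_l"); the batch size, resolution, fp16 setting, and iteration counts are illustrative assumptions, not the paper's exact protocol.

```python
# Hedged sketch: measure GPU inference throughput (images/sec) for two
# timm models. Assumes a CUDA device and the timm package are available.
import time

import torch
import timm


def measure_throughput(model_name: str, batch_size: int = 64,
                       resolution: int = 224, iters: int = 50) -> float:
    """Return images/second for half-precision inference on one GPU."""
    model = timm.create_model(model_name, pretrained=False).cuda().half().eval()
    x = torch.randn(batch_size, 3, resolution, resolution,
                    device="cuda", dtype=torch.half)
    with torch.no_grad():
        for _ in range(10):            # warm-up: trigger kernel compilation/caching
            model(x)
        torch.cuda.synchronize()       # make sure warm-up work has finished
        start = time.time()
        for _ in range(iters):
            model(x)
        torch.cuda.synchronize()       # wait for all timed kernels to complete
    return batch_size * iters / (time.time() - start)


for name in ["resnet50", "tresnet_l"]:
    print(f"{name}: {measure_throughput(name):.0f} img/s")
```

The synchronize calls matter: CUDA kernels launch asynchronously, so timing without torch.cuda.synchronize() would measure launch overhead rather than actual throughput.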

Cited by 19 publications (29 citation statements)
References 21 publications
“…We would like to highlight a few interesting observations:
(Radosavovic et al, 2020) 81.7% 39M 8B 21 -
RegNetY-16GF (Radosavovic et al, 2020) 82.9% 84M 16B 32 -
ResNeSt-101 (Zhang et al, 2020) 83.0% 48M 13B 31 -
ResNeSt-200 (Zhang et al, 2020) 83.9% 70M 36B 76 -
ResNeSt-269 (Zhang et al, 2020) 84.5% 111M 78B 160 -
TResNet-L (Ridnik et al, 2020) 83.8% 56M - 45 -
TResNet-XL (Ridnik et al, 2020) 84.3% 78M - 66 -
EfficientNet-X (Li et al, 2021) 84.7% 73M 91B - -
NFNet-F0 (Brock et al, 2021) 83.6% 72M 12B 30 8.9
NFNet-F1 (Brock et al, 2021) 84.7% 133M 36B 70 20
NFNet-F2 (Brock et al, 2021) 85.1% 194M 63B 124 36
NFNet-F3 (Brock et al, 2021) 85.7% 255M 115B 203 65
NFNet-F4 (Brock et al, 2021) 85.9% 316M 215B 309 126
ResNet-RS 84.4% 192M 128B - 61
LambdaResNet-420-hybrid 84.9% 125M - - 67
BotNet-T7-hybrid (Srinivas et al, 2021) 84.7% 75M 46B - 95
BiT-M-R152x2 (21k) (Kolesnikov et al, 2020) 85.2% 236M 135B 500 -…”
Section: ImageNet21k
confidence: 97%
“…More recent works aim to improve training or inference speed instead of parameter efficiency. For example, RegNet (Radosavovic et al, 2020), ResNeSt (Zhang et al, 2020), TResNet (Ridnik et al, 2020), and EfficientNet-X (Li et al, 2021) focus on GPU and/or TPU inference speed; Lambda Networks, NFNets (Brock et al, 2021), BoTNets (Srinivas et al, 2021), and ResNet-RS focus on TPU training speed. However, their training speed often comes with the cost of more parameters.…”
Section: Related Work
confidence: 99%
“…ResNet [13] is one of the most popular image classification architectures. It was a noteworthy improvement at the time it was introduced and continues to serve as the referent architecture for some analysis [8,55,56], or as a baseline in papers introducing new architectures [32,35,51,57].…”
Section: Related Work
confidence: 99%
“…MS COCO is an 80-class dataset where each image may have several labels because it contains several objects. Following the development of [2], we use TResNet as the base model [51], and threshold the vector of softmax probabilities so that the FDR is controlled at a user-specified level α. To set the threshold, we choose λ as in Algorithm 1, using 4,000 calibration points, and then we evaluate the FDR on an additional test set of 1,000 points.…”
Section: An Alternative Approach: Uniform Concentration
confidence: 99%
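For readers unfamiliar with this calibration procedure, the following is a minimal NumPy sketch of the thresholding step described in the quote: scan candidate thresholds λ and keep the smallest one whose empirical FDR on the calibration set is at most α. This is a simplification of the cited Algorithm 1, which controls the FDR via an upper concentration bound rather than the plain empirical estimate; the array names and the grid are hypothetical.

```python
# Simplified sketch of FDR-controlled multi-label thresholding.
# probs:  (n, K) per-class scores in [0, 1] from the classifier
# labels: (n, K) binary ground-truth multi-label matrix
import numpy as np


def empirical_fdr(probs: np.ndarray, labels: np.ndarray, lam: float) -> float:
    """Mean false-discovery proportion of the sets {k : probs[i, k] >= lam}."""
    predicted = probs >= lam                                  # predicted label sets
    discoveries = predicted.sum(axis=1)                       # set sizes |S_i|
    false_disc = (predicted & (labels == 0)).sum(axis=1)      # wrong labels in S_i
    # By convention, the false-discovery proportion is 0 for an empty set.
    fdp = np.where(discoveries > 0,
                   false_disc / np.maximum(discoveries, 1), 0.0)
    return float(fdp.mean())


def calibrate_lambda(probs, labels, alpha=0.1,
                     grid=np.linspace(0.0, 1.0, 1001)):
    """Smallest grid threshold with empirical FDR <= alpha.

    Relies on the FDR being (roughly) non-increasing in lambda: raising the
    threshold shrinks the predicted sets and makes them more precise.
    """
    for lam in grid:
        if empirical_fdr(probs, labels, lam) <= alpha:
            return lam
    return 1.0  # fall back to predicting empty sets

# Usage with the splits mentioned in the quote (4,000 calibration, 1,000 test):
# lam = calibrate_lambda(cal_probs, cal_labels, alpha=0.1)
# test_fdr = empirical_fdr(test_probs, test_labels, lam)
```

Choosing the smallest admissible λ maximizes the number of discoveries (predicted labels) subject to the FDR constraint, which is why the scan runs upward from zero.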