2019
DOI: 10.1016/j.patcog.2019.01.006

Wider or Deeper: Revisiting the ResNet Model for Visual Recognition

Abstract: The trend towards increasingly deep neural networks has been driven by a general observation that increasing depth increases the performance of a network. Recently, however, evidence has been amassing that simply increasing depth may not be the best way to increase performance, particularly given other limitations. Investigations into deep residual networks have also suggested that they may not in fact be operating as a single deep network, but rather as an ensemble of many relatively shallow networks. We exam…

citations
Cited by 1,106 publications
(553 citation statements)
references
References 43 publications
“…In this experiment, following previous works [31,36,34] without COCO pre-training, we train our model on SBD [10] and then fine-tune it on official trainval set. We use the same training protocol as described in the main paper.…”
Section: Pascal VOC Without COCO Pre-training (mentioning)
confidence: 99%
“…Method, mIOU (%): DPN [20], 74.1; Piecewise [17], 75.3; ResNet-38 [31], 82.5; PSPNet [36], 82.6; DFN [32], 82.7; EncNet [34], 82.9; Our proposed (Xception-65), 85.3. Table 7: State-of-the-art methods on PASCAL VOC test set without COCO pre-training.…”
Section: Pascal VOC Without COCO Pre-training (mentioning)
confidence: 99%
“…Gensim package (Version 3.4) [40] was used in training the Word2Vec model. [41] & Inception [42]. The tested CNN architectures use 1D convolution instead of 2D convolution.…”
Section: E Word2Vec Model Training (mentioning)
confidence: 99%
“…To illustrate this dilemma, Fig. 1 gives the accuracy (mIoU) and inference speed (frames per second (fps)) obtained by several state-of-the-art methods, including FCN-8s [9], CRF-RNN [17], DeepLab [10], DeepLabv2 [12], DeepLabv3+ [13], ResNet-38 [18], PSPNet [11], DUC [19], RefineNet [20], LRR [21], DPN [22], FRRN [23], TwoColumn [24], SegNet [25], SQNet [26], ENet [27], ERFNet [28], ICNet [29], SwiftNetRN [30], LEDNet [31], BiSeNet1 [32], BiSeNet2 [32], DFANet [33] and our proposed method, on the Cityscapes test dataset. Clearly, how to achieve a good tradeoff between accuracy and speed is still a challenging problem.…”
Section: Introduction (mentioning)
confidence: 99%