2019
DOI: 10.1007/s11263-019-01176-2
Which and How Many Regions to Gaze: Focus Discriminative Regions for Fine-Grained Visual Categorization

Cited by 72 publications (33 citation statements)
References 56 publications
“…The object/part regions in an image are determined using existing object/part detectors, while the regions are described by deep learned features. Typical methods apply Part R-CNN [8], HSnet Search [9], NTS-Net [10], Spatial Relation [11], DCL [12] and FDR [13]. Although these methods effectively improve the accuracy, the detection of regions is generally computationally expensive.…”
Section: A. Regional Feature-based Methods
confidence: 99%
“…Fine-grained visual classification essentially focuses on representing visual differences between subcategories [48], [49]. The vast majority of researchers follow either a localization-classification manner or an end-to-end encoding fashion.…”
Section: Related Work, A. Fine-Grained Visual Classification
confidence: 99%
“…CUB-200-2011 [8] is the most widely used dataset for fine-grained image classification [10,11], comprising 11,788 images of 200 subcategories belonging to the same basic-level coarse-grained category of "Bird". It is divided as follows: the training set contains 5,994 images and the testing set contains 5,794 images.…”
Section: Collection
confidence: 99%
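The CUB-200-2011 split quoted above can be sanity-checked with simple arithmetic. A minimal sketch, using only the counts reported in the excerpt (the per-class average is derived here for illustration, not stated in the source):

```python
# Sanity check of the CUB-200-2011 split reported in the excerpt.
# All counts are taken from the quoted text, not read from the dataset itself.
n_classes = 200
n_train, n_test = 5994, 5794
n_total = n_train + n_test

# The two splits must account for every image in the dataset.
assert n_total == 11788

# Roughly 30 training images per subcategory, which is why CUB-200-2011
# is commonly treated as a low-data, fine-grained benchmark.
train_per_class = n_train / n_classes
print(n_total, round(train_per_class, 2))
```

The dataset itself ships a `train_test_split.txt` file mapping image IDs to the train/test flag; reading it would reproduce exactly these counts.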