2014 IEEE Conference on Computer Vision and Pattern Recognition
DOI: 10.1109/cvpr.2014.254

Detect What You Can: Detecting and Representing Objects Using Holistic Models and Body Parts

Abstract: Detecting objects becomes difficult when we need to deal with large shape deformation, occlusion and low resolution. We propose a novel approach to i) handle large deformations and partial occlusions in animals (as examples of highly deformable objects), ii) describe them in terms of body parts, and iii) detect them when their body parts are hard to detect (e.g., animals depicted at low resolution). We represent the holistic object and body parts separately and use a fully connected model to arrange templates …
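
The abstract describes a fully connected model over a holistic-object template and body-part templates. As a rough illustration only, not the paper's actual implementation, the sketch below shows the general shape of a fully connected part score: unary template responses summed over all parts, minus a deformation cost for every pair of parts. The names appearance, pairwise, and locations are hypothetical placeholders.

    import itertools

    def score_configuration(appearance, pairwise, locations):
        # appearance[i](loc): template match score for part i at candidate
        # location loc (part 0 could be the holistic-object template).
        # pairwise[(i, j)](li, lj): deformation cost between parts i and j.
        # locations: one candidate location per part.
        n = len(locations)
        unary = sum(appearance[i](locations[i]) for i in range(n))
        # Fully connected: a spring term for *every* pair of parts, unlike
        # tree-structured models that only link parent and child.
        deform = sum(pairwise[(i, j)](locations[i], locations[j])
                     for i, j in itertools.combinations(range(n), 2))
        return unary - deform

Detection would then maximize this score over candidate part locations; the paper's actual templates, features, and inference procedure are described in the full text.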

Cited by 471 publications (623 citation statements)
References 24 publications

“…However, like the traditional 2D image processing methods, DeepLab does not incorporate consistency between slices as domain knowledge to enhance result performance. Additionally, benchmarks [25,26] have a lower resolution compared with microscopic images: the highest resolution of M. Everingham et al [25] was 500 × 486, and that of X. Chen et al [26] was 2048 × 1024. In contrast, the resolution of Figure 1b is 3200 × 3200, which is too large to train a network with limited GPU memory resources.…”
Section: Introduction (mentioning)
confidence: 99%
“…L.C. Chen et al [24] down-sampled the images by a factor of two because the objects in X. Chen et al [26] were too large. Unfortunately, there are many finely detailed dendrite structures in Figure 1b to which the down-sample process cannot be applied.…”
Section: Introduction (mentioning)
confidence: 99%
“…The key advantages of exploiting semantic part representations are that parts have lower intra-class variability than whole objects, they deal better with pose variation, and their configuration provides useful information about the aspect of the object. The most notable examples of works on semantic part models are fine-grained recognition (Lin et al. 2015; Zhang et al. 2014; Parkhi et al. 2012), object class detection (Chen et al. 2014), articulated pose estimation (Sun and Savarese 2011; Ukita 2012), and attribute prediction (Zeiler and Fergus 2013; Vedaldi et al. 2014; Gkioxari et al. 2015).…”
Section: Introduction (mentioning)
confidence: 99%
“…Here we go a step further and perform two quantitative evaluations that examine the different stimuli of the CNN filters and try to associate them with semantic parts. First, we take advantage of the available ground-truth part location annotations in the PASCAL-Part dataset (Chen et al. 2014) to count how many of the annotated semantic parts emerge in a CNN. Second, we use human judgements to determine what fraction of all filters systematically fire on any semantic part (including parts that might not be annotated in PASCAL-Part).…”
Section: Introduction (mentioning)
confidence: 99%
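
The last excerpt describes counting which annotated semantic parts a CNN's filters systematically fire on. The following is a minimal sketch of that kind of filter–part association, under stated assumptions: it takes hypothetical binarized per-filter activation maps already upsampled to image size, boolean ground-truth part masks (e.g., rasterized PASCAL-Part annotations), and an assumed IoU threshold; the cited work's actual protocol may differ.

    import numpy as np

    def count_emerged_parts(filter_maps, part_masks, iou_thresh=0.2):
        # filter_maps: (F, H, W) boolean top-activation maps, one per filter
        # (hypothetical preprocessing: threshold + upsample to image size).
        # part_masks: dict part_name -> (H, W) boolean ground-truth mask.
        # A part "emerges" if some filter overlaps it above the IoU threshold.
        emerged = set()
        for name, mask in part_masks.items():
            for fmap in filter_maps:
                inter = np.logical_and(fmap, mask).sum()
                union = np.logical_or(fmap, mask).sum()
                if union > 0 and inter / union >= iou_thresh:
                    emerged.add(name)
                    break
        return emerged

The returned set can be compared against the full list of annotated parts to report how many emerge; the IoU threshold of 0.2 is an assumption for illustration, not a value taken from the cited paper.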