2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017
DOI: 10.1109/cvpr.2017.631

WILDCAT: Weakly Supervised Learning of Deep ConvNets for Image Classification, Pointwise Localization and Segmentation

Abstract: This paper introduces WILDCAT, a deep learning method which jointly aims at aligning image regions for gaining spatial invariance and learning strongly localized features. Our model is trained using only global image labels and is devoted to three main visual recognition tasks: image classification, weakly supervised pointwise object localization and semantic segmentation. WILDCAT extends state-of-the-art Convolutional Neural Networks at three major levels: the use of Fully Convolutional Networks for maintaini…

Cited by 304 publications (287 citation statements)
References 57 publications
“…where M̂_ck is the instance-wise CAM of class c and instance k. Each instance-wise CAM is refined individually by propagating its attention scores to relevant areas. Specifically, the propagation is done by random walk, whose transition probability matrix T = S⁻¹ A^∘β (Eq. 12) is derived from the semantic affinity matrix A = [a_ij] ∈ R^{wh×wh}, where A^∘β is A to the Hadamard power of β and S is a diagonal matrix for row-normalization of A^∘β. Also, β > 1 is a hyper-parameter for smoothing out affinity values in A.…”
Section: Synthesizing Instance Segmentation Labels
Confidence: 99%
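The random-walk refinement quoted above can be sketched in a few lines of NumPy. This is a minimal illustration, not the cited paper's implementation: the function name `refine_cam` and the toy iteration count are hypothetical, and the transition matrix is assumed to be T = S⁻¹ A^∘β as the snippet's definitions imply.

```python
import numpy as np

def refine_cam(cam, affinity, beta=8, iters=4):
    """Refine a flattened CAM by random-walk propagation (illustrative sketch).

    cam:      (w*h,) attention scores for one class/instance.
    affinity: (w*h, w*h) non-negative semantic affinity matrix A.
    beta:     Hadamard power (> 1) applied element-wise to A.
    iters:    number of random-walk steps.
    """
    A_beta = affinity ** beta                       # A^{∘β}, Hadamard power
    row_sums = A_beta.sum(axis=1, keepdims=True)    # diagonal of S
    T = A_beta / np.maximum(row_sums, 1e-12)        # T = S^{-1} A^{∘β}, row-stochastic
    refined = cam.copy()
    for _ in range(iters):                          # propagate attention scores
        refined = T.T @ refined
    return refined
```

Because T is row-stochastic, each step redistributes the CAM's total mass toward semantically affine locations rather than creating or destroying score.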
“…An aggregation function f_agg(A^c) : R^{h×w} → [0, 1] is designed to map the score map for each class c into a prediction ŷ^c_loc. The design of f_agg(A^c) has been extensively studied [4]. Global Average Pooling (GAP) would dilute the prediction, as most of the spatial locations in A^c correspond to background and provide little training signal.…”
Section: Training
Confidence: 99%
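The dilution concern in this snippet is easy to demonstrate. Below is a hedged NumPy sketch contrasting GAP with a top-k aggregation (the function names and the sigmoid mapping to [0, 1] are illustrative assumptions, not the cited paper's exact design):

```python
import numpy as np

def aggregate_gap(sm):
    """Global Average Pooling over an (h, w) score map, squashed to [0, 1]."""
    return 1.0 / (1.0 + np.exp(-sm.mean()))

def aggregate_topk(sm, k=4):
    """Average of the k highest activations: far less diluted by background."""
    top = np.sort(sm.ravel())[-k:]
    return 1.0 / (1.0 + np.exp(-top.mean()))
```

On a map that is mostly background with a small high-scoring object, GAP averages the object signal away while top-k pooling preserves it, which is why max- or top-k-style aggregation is a common choice in weakly supervised localization.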
“…This study (Kolesnikov & Lampert, 2016) proposes a new composite loss function to train FCNs directly from image-level labels. Another study (Durand, Mordan, Thome, & Cord, 2017) proposes a two-step approach: first, train a CNN classification model on image-level labels to learn good representations; then, use the learned feature maps to obtain the segmentation result.…”
Section: Models for Weakly Labeled Data
Confidence: 99%
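The second step of the two-step approach described above can be sketched as a class-activation-map-style readout, where the classifier's final linear weights are projected onto its feature maps to get per-class score maps. This is a generic CAM sketch under assumed shapes, not the specific WILDCAT pooling scheme; the function name `class_activation_map` is hypothetical.

```python
import numpy as np

def class_activation_map(features, class_weights):
    """Turn a trained classifier's feature maps into per-class score maps.

    features:      (c, h, w) feature maps from a CNN trained on image labels.
    class_weights: (num_classes, c) weights of the final linear classifier.
    Returns (num_classes, h, w) score maps; an argmax over classes gives a
    rough segmentation mask.
    """
    c, h, w = features.shape
    maps = class_weights @ features.reshape(c, h * w)  # project weights onto features
    return maps.reshape(-1, h, w)
```

Thresholding or argmax-ing these maps yields the coarse masks that such two-step pipelines then refine into segmentation labels.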