Weakly Supervised Localization Using Deep Feature Maps

Bency, Archith J.; Kwon, Heesung; Lee, Hyungtae; Karthikeyan, S.; Manjunath, B. S.

doi:10.1007/978-3-319-46448-0_43

Cited by 63 publications

(60 citation statements)

References 45 publications

“…In spite of its simple and multipurpose architecture, our model outperforms by a large margin the complex cascaded architecture of ProNet [58]. It also outperforms the recent weakly supervised model [5] by 3.2 pt (resp. 4.2 pt) on VOC 2012 (resp.…”

Section: Weakly Supervised Pointwise Localization (mentioning)

confidence: 79%

“…Concerning the WSL localization task, [5] uses label co-occurrence information and a coarse-to-fine strategy based on deep feature maps to predict object locations. ProNet [58] uses a cascade of two networks: the first generates bounding boxes and the second classifies them.…”

Section: Related Work (mentioning)

confidence: 99%

“…For weakly supervised pointwise object detection, we extract the region (i.e. neuron in the feature map) with maximum score for each class and use it for point-wise localization, as it is done in [44,5]. For weakly supervised semantic segmentation we compute the final segmentation mask either by taking the class with maximum score at each spatial position independently or by applying a CRF for spatial prediction as is common practice [8,48].…”

Section: Wildcat Applications (mentioning)

confidence: 99%
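
The max-score rule described in the statement above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the WILDCAT implementation: the function names `pointwise_localization` and `argmax_segmentation`, the `(C, H, W)` layout of the class score maps, and the `stride` parameter that maps feature-map cells back to image pixels are all hypothetical. The CRF variant mentioned in the statement is not covered.

```python
import numpy as np


def pointwise_localization(score_maps, stride=1):
    """Pick, for each class, the feature-map cell with the maximum score.

    score_maps: array of shape (C, H, W) holding per-class score maps
                (hypothetical layout, assumed for this sketch).
    stride:     assumed downsampling factor between image and feature map.
    Returns (points, scores): points has shape (C, 2) with (row, col) in
    image coordinates, scores has shape (C,).
    """
    num_classes, height, width = score_maps.shape
    flat = score_maps.reshape(num_classes, -1)
    flat_idx = flat.argmax(axis=1)
    rows, cols = np.unravel_index(flat_idx, (height, width))
    # Scale each cell index to the centre of the image patch it covers.
    points = np.stack([rows, cols], axis=1) * stride + stride // 2
    scores = flat.max(axis=1)
    return points, scores


def argmax_segmentation(score_maps):
    """Label every spatial position with its highest-scoring class."""
    return score_maps.argmax(axis=0)
```

With a score map of shape `(2, 3, 4)` and `stride=16`, a peak for class 0 at cell `(1, 2)` would be reported at image point `(24, 40)`.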

WILDCAT: Weakly Supervised Learning of Deep ConvNets for Image Classification, Pointwise Localization and Segmentation

Durand, Mordan, Thome, et al. 2017

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

This paper introduces WILDCAT, a deep learning method which jointly aims at aligning image regions for gaining spatial invariance and learning strongly localized features. Our model is trained using only global image labels and is devoted to three main visual recognition tasks: image classification, weakly supervised pointwise object localization and semantic segmentation. WILDCAT extends state-of-the-art Convolutional Neural Networks at three major levels: the use of Fully Convolutional Networks for maintaining spatial resolution, the explicit design in the network of local features related to different class modalities, and a new way to pool these features to provide a global image prediction required for weakly supervised training. Extensive experiments show that our model significantly outperforms the state-of-the-art methods.

“…2. The first module estimates image difficulty automatically via a backbone network [18] trained with only image-level labels. The second module progressively adds samples to network training in an ascending order based on image difficulty.…”

Section: Methods (mentioning)

confidence: 99%

“…Once we obtain the image difficulty, the remaining task is to mine object instances from the images. A natural way is to directly choose the top scored region as the target object, which is used for localization evaluation in [18]. However, since the whole network is trained with classification loss, which makes high scored regions tend to focus on object parts rather than the whole objects.…”

Section: Estimating Image Difficulty (mentioning)

confidence: 99%

Zigzag Learning for Weakly Supervised Object Detection

Zhang, Feng, Xiong 2018

2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition

This paper addresses weakly supervised object detection with only image-level supervision at training stage. Previous approaches train detection models with entire images all at once, making the models prone to being trapped in sub-optimums due to the introduced false positive examples. Unlike them, we propose a zigzag learning strategy to simultaneously discover reliable object instances and prevent the model from overfitting initial seeds. Towards this goal, we first develop a criterion named mean Energy Accumulation Scores (mEAS) to automatically measure and rank localization difficulty of an image containing the target object, and accordingly learn the detector progressively by feeding examples with increasing difficulty. In this way, the model can be well prepared by training on easy examples for learning from more difficult ones and thus gain a stronger detection ability more efficiently. Furthermore, we introduce a novel masking regularization strategy over the high level convolutional feature maps to avoid overfitting initial samples. These two modules formulate a zigzag learning process, where progressive learning endeavors to discover reliable object instances, and masking regularization increases the difficulty of finding object instances properly. We achieve 47.6% mAP on PASCAL VOC 2007, surpassing the state-of-the-arts by a large margin.
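
The idea of feeding examples in order of increasing difficulty can be sketched as a simple curriculum schedule. This is only an illustration under stated assumptions: the function name `progressive_batches`, the `num_stages` parameter, and the convention that a larger `difficulty` value means a harder image are hypothetical, and the difficulty values are taken as given rather than computed with the paper's mEAS criterion, which the abstract does not define.

```python
def progressive_batches(difficulty, num_stages):
    """Yield cumulative lists of image indices, easiest images first.

    difficulty: one score per image; larger means harder (assumed convention).
    num_stages: number of curriculum stages (hypothetical parameter).
    """
    order = sorted(range(len(difficulty)), key=lambda i: difficulty[i])
    total = len(order)
    for stage in range(1, num_stages + 1):
        # Ceiling division so the final stage always covers every image.
        cutoff = (total * stage + num_stages - 1) // num_stages
        yield order[:cutoff]
```

Each yielded list would be the training set for one stage, so the detector sees the easy images first and the full set last.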

InfoMask: Masked Variational Latent Representation to Localize Chest Disease

Taghanaki, Havaei, Berthier, et al. 2019

Lecture Notes in Computer Science

The scarcity of richly annotated medical images is limiting supervised deep learning based solutions to medical image analysis tasks, such as localizing discriminatory radiomic disease signatures. Therefore, it is desirable to leverage unsupervised and weakly supervised models. Most recent weakly supervised localization methods apply attention maps or region proposals in a multiple instance learning formulation. While attention maps can be noisy, leading to erroneously highlighted regions, it is not simple to decide on an optimal window/bag size for multiple instance learning approaches. In this paper, we propose a learned spatial masking mechanism to filter out irrelevant background signals from attention maps. The proposed method minimizes mutual information between a masked variational representation and the input while maximizing the information between the masked representation and class labels. This results in more accurate localization of discriminatory regions. We tested the proposed model on the ChestX-ray8 dataset to localize pneumonia from chest X-ray images without using any pixel-level or bounding-box annotations.
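
The trade-off described in this abstract has the general shape of an information-bottleneck objective, which can be written as below. The notation is an assumption made for illustration: $X$ is the input image, $Y$ the class label, $Z_m$ the masked variational representation, $\theta$ the model parameters, and $\beta > 0$ a hypothetical weight balancing the two terms. The exact loss used in the InfoMask paper may differ.

```latex
\max_{\theta} \; I(Z_m; Y) \;-\; \beta \, I(Z_m; X)
```

Maximizing the first term keeps the information needed to predict the label, while the penalty on the second term discourages the representation from retaining input detail that is irrelevant to the label, such as background signal.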
