2021
DOI: 10.48550/arxiv.2112.00580
Preprint

Background Activation Suppression for Weakly Supervised Object Localization

Abstract: Weakly supervised object localization (WSOL) aims to localize the object region using only image-level labels as supervision. Recently, a new paradigm has emerged that generates a foreground prediction map (FPM) to achieve the localization task. Existing FPM-based methods use cross-entropy (CE) to evaluate the foreground prediction map and to guide the learning of the generator. We argue for using activation value to achieve more efficient learning. It is based on the experimental observation that, for a trained netw…

Citations: cited by 3 publications (7 citation statements)
References: 29 publications
“…At first, our method may not achieve satisfactory results in complicated scenes containing many objects. To improve our method, we could adopt a locate-then-segment framework to locate objects [57] then generate the mask. Secondly, Our approach aims at detecting all possible objects in the image and cannot detect the one that best fits the intention.…”
Section: Conclusion and Discussion (mentioning)
confidence: 99%
“…Metrics. Following [29,19,43], for localization, we utilize GT-known localization accuracy (GT-known Loc), Top-1/Top5 localization accuracy (Top-1/Top-5 Loc), and maximal box accuracy (MaxBoxAccV2) [5] as evaluation metrics. GT-known Loc is correct indicating that the intersection over union (IoU) of the predicted bounding box and the ground-truth bounding box is 50% or more.…”
Section: Methods (mentioning)
confidence: 99%
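The GT-known Loc criterion quoted above can be illustrated with a minimal sketch. It assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max) and one predicted box per image; the function names `iou` and `gt_known_loc` are hypothetical and do not come from the cited papers.

```python
from typing import Sequence, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def gt_known_loc(preds: Sequence[Box], gts: Sequence[Box],
                 threshold: float = 0.5) -> float:
    """Fraction of images whose predicted box reaches the IoU threshold.

    The 0.5 default follows the "50% or more" wording of the statement.
    """
    if not preds:
        return 0.0
    hits = sum(iou(p, g) >= threshold for p, g in zip(preds, gts))
    return hits / len(preds)
```

Top-1 and Top-5 localization accuracy would additionally require the predicted class to be correct, which this sketch does not model.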
“…In the training phase, the input images are resized to 256×256 and then randomly cropped to 224×224. In the inference phase, following [29,9,36], we adopt ten crop augmentations to obtain classification results and replace random crop with center crop for localization.…”
Section: Methods (mentioning)
confidence: 99%
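The crop geometry in the statement above can be sketched without any image library. The helper names are hypothetical, and the ten-crop layout (four corners plus center, each with and without a horizontal flip) is the common convention, assumed here because the statement does not define it.

```python
import random
from typing import List, Optional, Tuple

# Assumed box layout: (left, top, right, bottom) inside the resized image.
Box = Tuple[int, int, int, int]


def random_crop_box(size: int = 256, crop: int = 224,
                    rng: Optional[random.Random] = None) -> Box:
    """Training: pick a crop x crop window uniformly inside a size x size image."""
    rng = rng or random.Random()
    left = rng.randint(0, size - crop)  # randint is inclusive on both ends
    top = rng.randint(0, size - crop)
    return (left, top, left + crop, top + crop)


def center_crop_box(size: int = 256, crop: int = 224) -> Box:
    """Inference (localization): the centered crop x crop window."""
    off = (size - crop) // 2
    return (off, off, off + crop, off + crop)


def ten_crop_boxes(size: int = 256, crop: int = 224) -> List[Tuple[Box, bool]]:
    """Inference (classification): five windows, each unflipped and flipped.

    Each entry is (box, flipped); flipped marks a horizontal mirror.
    """
    m = size - crop
    five = [
        (0, 0, crop, crop),        # top-left
        (m, 0, size, crop),        # top-right
        (0, m, crop, size),        # bottom-left
        (m, m, size, size),        # bottom-right
        center_crop_box(size, crop),
    ]
    return [(b, False) for b in five] + [(b, True) for b in five]
```

With the sizes from the statement, the center crop leaves a 16-pixel margin on every side of the 256×256 image.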
“…Weakly Supervised Localization (WSOL) Class Activation Map (CAM) explainability methods have been offered in recent years for solving WSOL tasks [87,88,56,43]. Most of these algorithms train a classifier to distinguish between sub-categories of the main object (Birds, Cars, Dog etc), employing a localization loss term for the explainability map [76,78,48,28,52].…”
Section: Related Work (mentioning)
confidence: 99%