2019 IEEE/CVF International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv.2019.00691

Frame-to-Frame Aggregation of Active Regions in Web Videos for Weakly Supervised Semantic Segmentation

Abstract: When a deep neural network is trained on data with only image-level labeling, the regions activated in each image tend to identify only a small region of the target object. We propose a method of using videos automatically harvested from the web to identify a larger region of the target object by using temporal information, which is not present in the static image. The temporal variations in a video allow different regions of the target object to be activated. We obtain an activated region in each frame of a v…
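The abstract describes computing an activated region in every frame of a web video and combining those regions over time so that parts of the object activated at different moments all contribute. The snippet below is a minimal sketch of that general idea only; the ResNet-50 classifier, the element-wise-maximum aggregation rule, and the helper names are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch: per-frame class activation maps (CAMs) aggregated over time.
# The max-based aggregation is an illustrative choice, not the paper's rule.
import torch
import torch.nn.functional as F
import torchvision.models as models

def frame_cams(frames: torch.Tensor, class_idx: int) -> torch.Tensor:
    """frames: (T, 3, H, W) video frames; returns (T, H, W) CAMs for one class."""
    backbone = models.resnet50(weights=None)
    backbone.eval()
    # Feature extractor up to the last conv stage (drop global pooling and fc).
    features = torch.nn.Sequential(*list(backbone.children())[:-2])
    fc_weight = backbone.fc.weight  # (num_classes, C)
    with torch.no_grad():
        feats = features(frames)                                  # (T, C, h, w)
        cams = torch.einsum("c,tchw->thw", fc_weight[class_idx], feats)
        cams = F.relu(cams)
        cams = F.interpolate(cams.unsqueeze(1), size=frames.shape[-2:],
                             mode="bilinear", align_corners=False).squeeze(1)
    # Normalize each frame's CAM to [0, 1].
    return cams / (cams.amax(dim=(1, 2), keepdim=True) + 1e-6)

def aggregate_over_time(cams: torch.Tensor) -> torch.Tensor:
    """Union of per-frame activations via an element-wise maximum."""
    return cams.amax(dim=0)  # (H, W)

# Usage: 8 frames of a 224x224 clip, hypothetical class index 12.
video = torch.rand(8, 3, 224, 224)
region = aggregate_over_time(frame_cams(video, class_idx=12))
```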

Cited by 47 publications (62 citation statements)
References 47 publications (125 reference statements)
“…Previous works on weakly-supervised semantic segmentation have used image-level annotations [17,20,26,27,42,50,52], points/clicks [4], scribbles [29,47,48,49], bounding box annotations [11,19,37,41,46,55] and adversarial training [3,21]. We take a closer look at some of these methods and categorize them based on the labels required and their methodology.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…Our segmentation network architecture is similar to UPerNet [51], where the encoder backbone is ResNet-101 [16] and each decoder consists of 2 convolutional layers. We employ the ResNet-101 backbone to ensure a fair comparison with the three most recent SOTA works, SDI [23], Li et al. [28], and BCM [46], as well as 4 other recent methods [27,47,48,49] in Table 1. We have three decoders, one each for the y, α, and β branches.…”
Section: Implementation Details (citation type: mentioning)
confidence: 99%
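The quoted implementation details name a UPerNet-style layout with a ResNet-101 encoder and three lightweight decoders of two convolutional layers each, one per branch (y, α, β). The sketch below only mirrors that description; the hidden width, the output channel counts, and the meaning of the α and β outputs are assumptions, and the real UPerNet decoder with its feature-pyramid fusion is not reproduced here.

```python
# Sketch of the described layout: ResNet-101 encoder + three two-conv decoders.
# Channel widths and output dimensionalities are illustrative assumptions.
import torch
import torch.nn as nn
import torchvision.models as models

class TwoConvDecoder(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, hidden, 3, padding=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, out_ch, 1),
        )

    def forward(self, x):
        return self.net(x)

class ThreeBranchSegNet(nn.Module):
    def __init__(self, num_classes: int = 21):
        super().__init__()
        resnet = models.resnet101(weights=None)
        # Keep everything up to the final residual stage as the encoder.
        self.encoder = nn.Sequential(*list(resnet.children())[:-2])
        feat_ch = 2048
        self.dec_y = TwoConvDecoder(feat_ch, num_classes)      # segmentation logits
        self.dec_alpha = TwoConvDecoder(feat_ch, num_classes)  # assumed auxiliary branch
        self.dec_beta = TwoConvDecoder(feat_ch, num_classes)   # assumed auxiliary branch

    def forward(self, x):
        f = self.encoder(x)
        up = lambda t: nn.functional.interpolate(
            t, size=x.shape[-2:], mode="bilinear", align_corners=False)
        return up(self.dec_y(f)), up(self.dec_alpha(f)), up(self.dec_beta(f))

# Usage: one 512x512 image yields three full-resolution outputs.
model = ThreeBranchSegNet().eval()
y, alpha, beta = model(torch.rand(1, 3, 512, 512))
```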
“…Various weak annotations have been adopted in this research field, such as bounding boxes [17][18][19], scribbles [20], points [21], and image-level labels [5,11,13]. Moreover, some research studies, e.g., [22,23], improve performance with additional, unlabeled data. Usually, the data are obtained from the Internet and are therefore called web data.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…Usually, the data are obtained from the Internet and are therefore called web data, so these are also called webly-supervised segmentation methods [22]. In this paper, we utilize image-level labels, which are very cheap to obtain and do not provide any localization information about the object in the image.…”
Section: Related Work (citation type: mentioning)
confidence: 99%