2013 IEEE Conference on Computer Vision and Pattern Recognition
DOI: 10.1109/cvpr.2013.321

Discriminative Segment Annotation in Weakly Labeled Video

Abstract: This paper tackles the problem of segment annotation in complex Internet videos. Given a weakly labeled video, we automatically generate spatiotemporal masks for each of the concepts with which it is labeled. This is a particularly relevant problem in the video domain, as large numbers of Internet videos are now available, tagged with the visual concepts that they contain. Given such weakly labeled videos, we focus on the problem of spatiotemporal segment classification. We propose a straightforward algorithm,…
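As a rough illustration of the weakly labeled setting the abstract describes (not the authors' algorithm, which the truncated abstract does not specify), the sketch below trains a discriminative classifier on per-segment features where every segment simply inherits its video-level tag. All data, dimensions, and names are synthetic placeholders; the point is only that background features occur on both sides of the weak split, so a discriminative model can still rank true concept segments above background.

```python
# Hypothetical sketch of weakly supervised segment classification.
# Synthetic features stand in for spatiotemporal segment descriptors;
# this is NOT the paper's algorithm, just the weak-label setup.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Segments from videos weakly tagged with the concept: a mix of true
# concept segments and background. Segments from untagged videos are
# all background, drawn from the same background distribution.
pos_video_segments = np.vstack([
    rng.normal(+1.0, 1.0, size=(50, 16)),   # concept segments
    rng.normal(-1.0, 1.0, size=(50, 16)),   # background in tagged videos
])
neg_video_segments = rng.normal(-1.0, 1.0, size=(100, 16))

# Weak labels: every segment inherits its video-level tag.
X = np.vstack([pos_video_segments, neg_video_segments])
y = np.concatenate([np.ones(len(pos_video_segments)),
                    np.zeros(len(neg_video_segments))])

# Train on the noisy labels, then score segments in the tagged videos.
clf = LogisticRegression(max_iter=1000).fit(X, y)
scores = clf.predict_proba(pos_video_segments)[:, 1]
print("mean score, concept segments:    %.2f" % scores[:50].mean())
print("mean score, background segments: %.2f" % scores[50:].mean())
```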

Cited by 120 publications (164 citation statements: 0 supporting, 164 mentioning, 0 contrasting; citing publications span 2014–2020). References 29 publications. The citation statements below are ordered by relevance.
“…The co-localization problem is similar to co-segmentation [21-24, 38, 39, 50] and weakly supervised localization (WSL) [13, 17, 31, 32, 43-46, 49]. In contrast to co-segmentation, we seek to localize objects with bounding boxes rather than segmentations, which allows us to greatly decrease the number of variables in our problem.…”
Section: Related Work (mentioning)
confidence: 99%
“…The dataset is built using the YouTube-Objects dataset [17], which consists of videos collected for 10 different object classes. We use this dataset because all the frames of a video have the object of interest segmented [10]. Therefore, these videos can be used as ground truth for evaluation.…”
Section: Dataset and Setup (mentioning)
confidence: 99%
“…We use the subset of the dataset described in Tang et al. [10]. The dataset is built using the YouTube-Objects dataset [17], which consists of videos collected for 10 different object classes.…”
Section: Dataset and Setup (mentioning)
confidence: 99%
“…Tang et al. [38] proposed a method to automatically annotate discriminative objects in weakly labeled videos. Jain et al. [39] represent discriminative video objects at the patch level.…”
Section: Related Work (mentioning)
confidence: 99%