2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019
DOI: 10.1109/cvpr.2019.00135

Less Is More: Learning Highlight Detection From Video Duration

Abstract: Highlight detection has the potential to significantly ease video browsing, but existing methods often suffer from expensive supervision requirements, where human viewers must manually identify highlights in training videos. We propose a scalable unsupervised solution that exploits video duration as an implicit supervision signal. Our key insight is that video segments from shorter user-generated videos are more likely to be highlights than those from longer videos, since users tend to be more selective about …

Citations: cited by 87 publications (68 citation statements)
References: 34 publications (117 reference statements)
“…An AVAC could automatically keep a running tally of information the spectator may find interesting based on their reaction and the current state of play. To enhance the spectator experience, the AVAC may automatically generate highlight reels that effectively reflect the flow of the game or summarize the most exciting segments (Mahasseni, Lam, & Todorovic, 2017; Merler et al, 2018; Xiong, Kalantidis, Ghadiyaram, & Grauman, 2019; Yang et al, 2015; K. Zhang, Chao, Sha, & Grauman, 2016); moreover, the AVAC might suggest related games predicted to engage or interest the spectator.…”
Section: Crowd-sourced Data (citation type: mentioning)
confidence: 99%
“…As for domain-agnostic approach, Mendi et al propose motion strength (Mendi, Clemente, and Bayrak 2013) that operates uniformly on any video. Domain-specific approaches tailor highlights to the topic domain, and leverage video duration (Xiong et al 2019) and visual co-occurrence (Chu, Song, and Jaimes 2015) as the weak supervision signal, or leverage category-aware reconstruction loss (Yang et al 2015a). However, without human-guided signals, the results are not satisfying enough.…”
Section: Related Work Video Highlight Detection (citation type: mentioning)
confidence: 99%
“…Video highlight detection algorithms are generally categorized as either unsupervised or supervised methods. Unsupervised techniques create video highlights by employing heuristics, such as video duration (Xiong et al 2019) and visual co-occurrence (Chu, Song, and Jaimes 2015), to achieve desired characteristics. Without human-guided signals, however, the results are not satisfying enough.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…As another example, when it comes to semantics at a higher level than what is captured by visual appearance, closeup of a player in soccer can be considered important if it is immediately followed by a goal, while not so important when it occurs elsewhere. These and other issues discussed below make it an interesting research problem with several papers pushing the state-of-the-art for newer algorithms and model architectures [6,10,19,32,34,36-38,40] and datasets [8,26,29]. However, as noted by a few recent works several fundamental issues remain to be addressed.…”
Section: Introduction (citation type: mentioning)
confidence: 99%