2016
DOI: 10.1007/978-3-319-46475-6_38

Title Generation for User Generated Videos

Abstract: A great video title describes the most salient event compactly and captures the viewer's attention. In contrast, video captioning tends to generate sentences that describe the video as a whole. Although generating a video title automatically is a very useful task, it is much less addressed than video captioning. We address video title generation for the first time by proposing two methods that extend state-of-the-art video captioners to this new task. First, we make video captioners highlight sensiti…

Cited by 68 publications (49 citation statements)
References 40 publications (106 reference statements)
“…Recently, a few video description datasets have been proposed, namely MSR-VTT (Xu et al 2016), TGIF (Li et al 2016) and VTW (Zeng et al 2016). Similar to the MSVD dataset (Chen and Dolan 2011), MSR-VTT is based on YouTube clips.…”
Section: Comparison To Other Video Description Datasets
confidence: 99%
“…Since labels are not required, their method can be fully unsupervised. We train their model on the four datasets they used, and additionally expand the training with the VTW dataset [46].…”
Section: Highlightness Score
confidence: 99%
“…The second dataset, i.e., VTW, was originally proposed for the task of video captioning and contains 18,100 videos in total [34]. Fortunately, 2,000 of them are labeled with subshot-level highlight scores that indicate the confidence of each subshot to be selected into the summary, so they are employed in this paper.…”
Section: Setup
confidence: 99%