2019
DOI: 10.1007/978-3-030-37731-1_40

Unsupervised Video Summarization via Attention-Driven Adversarial Learning

Abstract: This paper presents a new video summarization approach that integrates an attention mechanism to identify the significant parts of the video, and is trained in an unsupervised manner via generative adversarial learning. Starting from the SUM-GAN model, we first develop an improved version of it (called SUM-GAN-sl) that has a significantly reduced number of learned parameters, performs incremental training of the model's components, and applies a stepwise, label-based strategy for updating the adversarial part. Subsequent…
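The incremental, stepwise and label-based adversarial update mentioned in the abstract can be pictured with a minimal sketch. The module names (`Summarizer`, `Discriminator`), dimensions, loss terms and weights below are illustrative assumptions only, not the SUM-GAN-sl implementation:

```python
# Hypothetical sketch of a stepwise, label-based adversarial update.
# All shapes, names and loss weights are assumptions for illustration.
import torch
import torch.nn as nn

class Summarizer(nn.Module):
    """Scores frames and reconstructs the video from the score-weighted features."""
    def __init__(self, feat_dim=1024, hidden=128):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, 1), nn.Sigmoid())
        self.decoder = nn.Linear(feat_dim, feat_dim)

    def forward(self, x):                      # x: (T, feat_dim)
        scores = self.scorer(x)                # (T, 1) frame importance
        recon = self.decoder(x * scores)       # summary-based reconstruction
        return scores, recon

class Discriminator(nn.Module):
    """Judges whether a feature sequence is the original or a summary-based reconstruction."""
    def __init__(self, feat_dim=1024, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.out = nn.Sequential(nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, x):                      # x: (T, feat_dim)
        _, h = self.rnn(x.unsqueeze(0))
        return self.out(h[-1]).squeeze()

feats = torch.randn(120, 1024)                 # stand-in for CNN frame features
S, D = Summarizer(), Discriminator()
opt_s = torch.optim.Adam(S.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCELoss()
real, fake = torch.tensor(1.0), torch.tensor(0.0)

for _ in range(3):                             # toy number of training steps
    # Step 1: update the summarizer to reconstruct well and fool the discriminator
    scores, recon = S(feats)
    g_loss = (bce(D(recon), real)
              + nn.functional.mse_loss(recon, feats)
              + 0.01 * scores.mean())          # sparsity term; weight is illustrative
    opt_s.zero_grad(); g_loss.backward(); opt_s.step()

    # Step 2: separate, label-based discriminator update on explicit real/fake labels
    d_loss = bce(D(feats), real) + bce(D(recon.detach()), fake)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
```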

Cited by 60 publications (73 citation statements)
References 29 publications
“…Most of them utilize Long Short-Term Memory (LSTM) units [3] to learn how to assess the importance of each video frame. However, experimentation with some of these methods (dppLSTM [4], DR-DSN [5], SUM-GAN-sl [6], SUM-GAN-AAE [7]) resulted in findings that are consistent with the claims in [8] about the low variation of the computed frame-level importance scores by LSTMs. As a consequence, the selections made by the trained LSTM seem to have a limited impact in summarization; the latter is mainly affected by factors such as the video fragmentation, or the approach used for fragment selection given a target summary length (such as the Knapsack algorithm).…”
Section: Introduction
confidence: 72%
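The excerpt above notes that the produced summary is largely shaped by the fragment-selection step, commonly a 0/1 Knapsack over shot-level importance scores under a target summary length. A minimal sketch of that selection step follows; the 15% length budget and the shot lengths/scores are illustrative assumptions, not values from the cited works:

```python
# Hypothetical sketch of Knapsack-based fragment selection for video summarization:
# pick shots that maximize total importance without exceeding a length budget.

def knapsack_select(shot_lengths, shot_scores, budget):
    """0/1 Knapsack: returns indices of the selected shots (lengths in frames)."""
    n = len(shot_lengths)
    # dp[i][c] = best total score using the first i shots within capacity c
    dp = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        length, score = shot_lengths[i - 1], shot_scores[i - 1]
        for c in range(budget + 1):
            dp[i][c] = dp[i - 1][c]
            if length <= c:
                dp[i][c] = max(dp[i][c], dp[i - 1][c - length] + score)
    # Backtrack to recover which shots were chosen
    selected, c = [], budget
    for i in range(n, 0, -1):
        if dp[i][c] != dp[i - 1][c]:
            selected.append(i - 1)
            c -= shot_lengths[i - 1]
    return sorted(selected)

# Toy example: 6 shots, summary budget = 15% of the total video length
lengths = [40, 55, 30, 70, 25, 60]           # shot lengths in frames (illustrative)
scores = [0.8, 0.3, 0.9, 0.5, 0.7, 0.4]      # mean frame importance per shot (illustrative)
budget = int(0.15 * sum(lengths))
print(knapsack_select(lengths, scores, budget))
```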
“…The utilized evaluation protocols in these methods targeted the assessment of the created keyframe-based summaries. A typical approach, applied in [8], involved independent users that assess both the relevance of each individual keyframe (using a 1-5 scale) and the quality of the entire summary w.r.t. redundant or missing information.…”
Section: Relevant Literature 2.1 Evaluating Video Storyboards
confidence: 99%
“…Our aim is to assess the representativeness of results when the evaluation is based on a small set of randomly-created splits and the reliability of performance comparisons that use different data splits for each algorithm. In this context we evaluate five publicly-available video summarization algorithms (two supervised: dppLSTM [41], VASNet [14]; and three unsupervised: DR-DSN [46], SUM-GAN-sl [3], SUM-GAN-AAE [2]) using the established protocol and a fixed set of 5 randomly-generated data splits of the SumMe and TVSum datasets (that simulates the evaluation conditions of most SoA works). These methods are, to our knowledge, the only ones Table 1: Comparison (F-Score (%)) of five publicly-available video summarization approaches in SumMe and TVSum datasets, using 5 and 50 randomly-generated splits.…”
Section: A Study on the Established Evaluation Protocol
confidence: 99%
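The split-based protocol discussed in this excerpt, training/testing on several randomly generated splits and averaging F-Scores, can be sketched as below. The 80/20 ratio and the placeholder evaluator are assumptions; a real run would train each summarizer and compare its summaries against ground truth:

```python
# Hypothetical sketch of evaluating a summarizer on randomly generated data splits
# and reporting the mean/std F-Score over 5 vs. 50 splits.
import random
import statistics

def make_splits(video_ids, num_splits=5, train_ratio=0.8, seed=0):
    """Return a list of (train_ids, test_ids) random splits."""
    rng = random.Random(seed)
    splits = []
    for _ in range(num_splits):
        ids = video_ids[:]
        rng.shuffle(ids)
        cut = int(train_ratio * len(ids))
        splits.append((ids[:cut], ids[cut:]))
    return splits

def evaluate_on_split(train_ids, test_ids):
    """Placeholder: stands in for training a summarizer on train_ids and
    computing its F-Score (%) against ground-truth summaries on test_ids."""
    return 40.0 + random.random() * 10.0   # illustrative value only

videos = [f"video_{i}" for i in range(50)]   # e.g., a TVSum-sized dataset
for num_splits in (5, 50):
    f_scores = [evaluate_on_split(tr, te)
                for tr, te in make_splits(videos, num_splits)]
    print(f"{num_splits} splits: F-Score = {statistics.mean(f_scores):.1f} "
          f"+/- {statistics.stdev(f_scores):.1f}")
```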