Survey of Compressed Domain Video Summarization Techniques

Basavarajaiah, Madhushree; Sharma, Priyanka

doi:10.1145/3355398

Cited by 42 publications

(17 citation statements)

References 116 publications


“…The key to this approach is the vectorization of content because it matches based on the similarities in vectorized content. Examples of vectorization methods include video and description summarization [31,32] and image captioning [33][34][35]. In particular, the authors of [36] matched poetry and images through captioning, the authors of [33] presented an automatic caption generation method for Impressionist artworks for people with visual impairments, and the authors of [28] used emotional features in music recommendations.…”

Section: Sensing Technology (mentioning)

confidence: 99%


Construction of a Soundscape-Based Media Art Exhibition to Improve User Appreciation Experience by Using Deep Neural Networks

et al. 2021


The objective of this study was to improve user experience when appreciating visual artworks with soundscape music chosen by a deep neural network based on weakly supervised learning. We also propose a multi-faceted approach to measuring ambiguous concepts, such as the subjective fitness, implicit senses, immersion, and availability. We showed improvements in appreciation experience, such as the metaphorical and psychological transferability, time distortion, and cognitive absorption, with in-depth experiments involving 70 participants. Our test results were similar to those of “Bunker de Lumières: van Gogh”, which is an immersive media artwork directed by Gianfranco Iannuzzi; the fitness scores of our system and “Bunker de Lumières: van Gogh” were 3.68/5 and 3.81/5, respectively. Moreover, the concordance of implicit senses between artworks and classical music was measured to be 0.88%, and the time distortion and cognitive absorption improved during the immersion. Finally, the proposed method obtained a subjective satisfaction score of 3.53/5 in the evaluation of its usability. Our proposed method can also help spread soundscape-based media art by supporting traditional soundscape design. Furthermore, we hope that our proposed method will help people with visual impairments to appreciate artworks through its application to a multi-modal media art guide platform.


“…The hyper-parameters of training were obtained via a greedy search. The ranges of the greedy searches were as follows: The batch sizes were [8, 16, 32], and the initial learning rates were [1 × 10⁻⁴, 2 × 10⁻⁴, …”

Section: Audio Feature Extraction Via the Multi-time-scale Transform (mentioning)

confidence: 99%

Construction of a Soundscape-Based Media Art Exhibition to Improve User Appreciation Experience by Using Deep Neural Networks

et al. 2021


“…Kind of text summary (KTS): The most widely discussed distinction for text summarization works is the distinction of extractive vs abstractive. Similarly, depending on the nature of an output text summary, we can also classify the works in MMS tasks (containing text in the output) into extractive MMS [13,44,45,46,58] and abstractive MMS [12,57,133,134].…”

Section: Content Intensity (Ci) (mentioning)

confidence: 99%

“…Although quite a few survey papers were written for uni-modal summarization tasks including surveys on text summarization [31,32,81,112,124] and video summarization [6,41,52,76,102], and a few survey papers covering multi-modal research [3,4,43,90,103,107]. However, to the best of our knowledge, we are the first to present a survey on multi-modal summarization.…”

Section: Introduction (mentioning)

confidence: 99%

A Survey on Multi-modal Summarization

Jangra, Mukherjee, Jatowt et al. 2021

Preprint


The new era of technology has brought us to the point where it is convenient for people to share their opinions over an abundance of platforms. These platforms have a provision for the users to express themselves in multiple forms of representations, including text, images, videos, and audio. This, however, makes it difficult for users to obtain all the key information about a topic, making the task of automatic multi-modal summarization (MMS) essential. In this paper, we present a comprehensive survey of the existing research in the area of MMS.


“…The more recent study of Molino et al [8] focuses on egocentric video summarization and discusses the specifications and the challenges of this task. In another recent work, Basavarajaiah et al [9] provide a classification of various summarization approaches, including some recently-proposed deep-learning-based methods; however, it mainly focuses on summarization algorithms that are directly applicable on the compressed domain. Finally, another recent survey of Vivekraj et al [10] presents the relevant bibliography based on a two-way categorization, that relates to the utilized data modalities during the analysis and the incorporation of human aspects.…”

Section: Introduction (mentioning)

confidence: 99%

Video Summarization Using Deep Neural Networks: A Survey

Apostolidis, Adamantidou, Metsai et al. 2021

Preprint


Video summarization technologies aim to create a concise and complete synopsis by selecting the most informative parts of the video content. Several approaches have been developed over the last couple of decades and the current state of the art is represented by methods that rely on modern deep neural network architectures. This work focuses on the recent advances in the area and provides a comprehensive survey of the existing deep-learning-based methods for generic video summarization. After presenting the motivation behind the development of technologies for video summarization, we formulate the video summarization task and discuss the main characteristics of a typical deep-learning-based analysis pipeline. Then, we suggest a taxonomy of the existing algorithms and provide a systematic review of the relevant literature that shows the evolution of the deep-learning-based video summarization technologies and leads to suggestions for future developments. We then report on protocols for the objective evaluation of video summarization algorithms and we compare the performance of several deep-learning-based approaches. Based on the outcomes of these comparisons, as well as some documented considerations about the suitability of evaluation protocols, we indicate potential future research directions.
