2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) 2017
DOI: 10.1109/fskd.2017.8393364
Synchronous prediction of arousal and valence using LSTM network for affective video content analysis

Abstract: The affect embedded in video data conveys high-level semantic information about the content and has a direct impact on viewers' understanding and perception, as well as their emotional responses. Affective Video Content Analysis (AVCA) attempts to generate a direct mapping between video content and the corresponding affective states, such as the arousal and valence dimensions. Most existing studies establish the mapping for each dimension separately using knowledge-based rules or traditional classifiers such …
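The "synchronous prediction" in the title refers to estimating arousal and valence jointly rather than training one model per dimension. A minimal NumPy sketch of the idea is below: a single LSTM cell consumes a sequence of frame-level feature vectors, and one shared two-unit linear head reads the final hidden state to produce both dimensions at once. All dimensions, weights, and the random stand-in features are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step; gates are stacked as [input, forget, cell, output]."""
    n = h.shape[0]
    z = W @ x + U @ h + b
    i = sigmoid(z[:n])          # input gate
    f = sigmoid(z[n:2 * n])     # forget gate
    g = np.tanh(z[2 * n:3 * n]) # candidate cell state
    o = sigmoid(z[3 * n:])      # output gate
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

d_in, d_hid, T = 8, 16, 5  # feature size, hidden size, frame count (all assumed)
W = rng.standard_normal((4 * d_hid, d_in)) * 0.1
U = rng.standard_normal((4 * d_hid, d_hid)) * 0.1
b = np.zeros(4 * d_hid)
W_out = rng.standard_normal((2, d_hid)) * 0.1  # shared head: [arousal, valence]

h = np.zeros(d_hid)
c = np.zeros(d_hid)
for t in range(T):
    x_t = rng.standard_normal(d_in)  # stand-in for a per-frame feature vector
    h, c = lstm_step(x_t, h, c, W, U, b)

arousal, valence = W_out @ h  # both dimensions predicted from the same state
print(float(arousal), float(valence))
```

Because the two outputs share the recurrent state, correlations between arousal and valence can be exploited during training, which is the advantage a joint formulation has over two independent per-dimension regressors.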

Cited by 13 publications (12 citation statements); references 17 publications.
“…Note that our proposed method provides better performance than the methods of Baveye et al [20] and Gan et al [21], even though we use only visual information. Also, our method shows performance comparable to or better than the method of Zhang and Zhang [19], which uses carefully designed and selected handcrafted features. The results show that the proposed model is promising for the emotional evaluation of video clips.…”
Section: Results (mentioning)
Confidence: 79%
“…In a recent study, LSTM was used to estimate the emotion of a video clip in the same Thayer's emotion space. However, hand-crafted audio and video features were extracted, and only the selected features were fed to the LSTM to estimate the degrees of arousal and valence [19]. The role of the LSTM in that work is similar to ours, in that it characterizes the long-term dynamic behavior of video clips.…”
Section: LSTM With MLP-Type Regression Network (mentioning)
Confidence: 99%
“…The study [8] identifies 27 distinct possible categories of human emotion, but in the case of music video it is convenient to organize them into coarse semantic groups so that an end-user can easily retrieve the required music video from large video banks or online music video stores. Following [41,52,67], we categorize the adjectives of music video emotion classification into six basic emotion categories: Exciting, Fear, Neutral, Relaxation, Sad, and Tension. Three samples from each emotion class are shown (from left to right) in Fig.…”
Section: Music Video Emotion Dataset (mentioning)
Confidence: 99%
“…Images usually contain textual descriptions such as street names, road signs, building numbers and product descriptions, which often provide key clues for information perception. Thus, scene text understanding in natural images is extremely useful in fields such as direct perception for autonomous driving [7], image captioning for image retrieval [8, 9], text recognition for automatic translation [10, 11], and text location and recognition for video content analysis [12, 13], etc.…”
Section: Introduction (mentioning)
Confidence: 99%