2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2011.5946961
A supervised approach to movie emotion tracking

Cited by 81 publications (67 citation statements)
References 11 publications
“…Cui et al. [5] address affective content analysis of music videos, where they employ audio-visual features for the construction of arousal and valence models. Intended emotion tracking of movies is a subject addressed by Malandrakis et al. [13], where audio-visual features are extracted for the affective representation of movies. In [19], a combined analysis of low-level audio and visual representations based on early feature fusion is presented for facial emotion recognition in videos.…”
Section: Related Work
confidence: 99%
“…Therefore, one key issue in designing video affective content analysis algorithms is the representation of video content as in any pattern recognition task. The common approach for video content representation is either to use low-level audio-visual features or to build hand-crafted higher level representations based on the low-level ones (e.g., [5,8,13,21]). Low-level features have the disadvantage of losing global relations or structure in data, whereas creating hand-crafted higher level representations is time consuming, problem-dependent, and requires domain knowledge.…”
Section: Introduction
confidence: 99%
“…There has been little prior work toward emotion recognition using both audio and visual cues in multimedia content [16,18,19]. The authors in [16] performed continuous-scale emotion tracking in movies, fusing features from audio, music, and video modalities.…”
Section: Related Work
confidence: 99%
“…(c) Affective information: both intended emotions and experienced emotions have been annotated. More details on the affective annotation and the associated emotion tracking task are provided in [73].…”
Section: Database
confidence: 99%