Proceedings of the 22nd ACM International Conference on Multimedia 2014
DOI: 10.1145/2647868.2654904
Music Emotion Recognition by Multi-label Multi-layer Multi-instance Multi-view Learning

Cited by 62 publications (25 citation statements)
References 15 publications
“…Most related to our work are the papers [38]–[45] that proposed the application of MIL for capturing the time ambiguity of pain [38]–[42], affective music response [43], behavioural expressions [44] and vocal interaction [45]. As discussed below, the main differences with our work lie in the (i) multiple instance algorithms we propose, (ii) the nature of the employed predictors (e.g.…”
Section: Related Work (mentioning)
confidence: 99%
“…MIL was used in [43] to automatically recognise the affective content of a piece of music using a generative approach based on a hierarchical Bayesian model. Each song is associated with a bag, and the temporal audio segments form the corresponding instances.…”
Section: Related Work (mentioning)
confidence: 99%
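The bag/instance framing in this excerpt can be made concrete with a short sketch. The segment length, the helper names (`song_to_bag`, `bag_score`), and the max-pooling aggregation below are illustrative assumptions only; the cited paper's hierarchical Bayesian generative model is not reproduced here.

```python
import numpy as np

def song_to_bag(waveform, sr, segment_seconds=3.0):
    """Split a mono waveform into fixed-length temporal segments.
    In MIL terms, the whole song is the bag and each segment is an instance.
    The 3-second segment length is an illustrative choice, not a value from the paper."""
    seg_len = int(segment_seconds * sr)
    n_segments = len(waveform) // seg_len
    return [waveform[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]

def bag_score(instance_scores):
    """Aggregate instance-level emotion scores into a bag-level score.
    Max pooling reflects the standard MIL assumption that a bag is positive
    if at least one of its instances is; the cited work instead infers this
    with a generative model."""
    return float(np.max(instance_scores))
```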
“…We then employ Librosa to extract widely-used acoustic features, such as Mel-Frequency Cepstral Coefficients (MFCC) [46], Zero Crossing Rate [47], etc. Finally, we obtain a 512-dimensional feature vector from each audio clip.…”
Section: B. Features in Acoustic Modality (mentioning)
confidence: 99%
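For readers unfamiliar with Librosa, here is a minimal sketch of extracting MFCC and zero-crossing-rate features from a clip and pooling them into a clip-level vector. The function name, the mean/std pooling, and the number of MFCC coefficients are assumptions for illustration; the exact 512-dimensional pipeline of the citing work is not specified in the excerpt and is not reproduced.

```python
import numpy as np
import librosa

def clip_features(path, n_mfcc=20):
    """Load an audio clip, extract frame-level MFCC and zero-crossing-rate
    features with Librosa, then mean/std-pool them into one clip-level vector.
    Pooling scheme and n_mfcc are illustrative choices, not the cited setup."""
    y, sr = librosa.load(path, mono=True)                     # default sr=22050
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)    # shape (n_mfcc, frames)
    zcr = librosa.feature.zero_crossing_rate(y)               # shape (1, frames)
    return np.concatenate([
        mfcc.mean(axis=1), mfcc.std(axis=1),
        zcr.mean(axis=1), zcr.std(axis=1),
    ])                                                        # shape (2*n_mfcc + 2,)

# Hypothetical usage:
# vec = clip_features("some_clip.wav")
```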
“…Each of us tried to describe the excerpt in a single one-word adjective, and the 14 words we used most frequently were selected as the 14 categories. It turns out that about half of the 14 words were used in our previous related studies [79]–[96] and most of the others were used in studies by other researchers [9,20,21,35,47]. All the categories included in the 4-quadrant model in Figure 1 appear in Figure 4 except Angry.…”
Section: Second Test: Best Word From 14 Categories (mentioning)
confidence: 99%