2019
DOI: 10.1109/tcyb.2017.2787717
Multiscale Amplitude Feature and Significance of Enhanced Vocal Tract Information for Emotion Classification

Abstract: In this paper, a novel multiscale amplitude feature is proposed using multiresolution analysis (MRA), and the significance of the vocal tract is investigated for emotion classification from the speech signal. MRA decomposes the speech signal into a number of sub-band signals. The proposed feature is computed by applying a sinusoidal model to each sub-band signal. Different emotions have different impacts on the vocal tract; as a result, the vocal tract responds in a unique way to each emotion. The vocal tract information…
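The pipeline sketched in the abstract (MRA into sub-bands, then per-band amplitude features) can be illustrated with a short Python sketch. This is a minimal sketch under assumptions: it uses a PyWavelets decomposition for the MRA and crude spectral peak-picking as a stand-in for the paper's sinusoidal-model amplitude estimation; the function name, wavelet choice, and decomposition depth are illustrative, not the authors' settings.

```python
# Minimal sketch of a multiscale amplitude feature (illustrative only).
import numpy as np
import pywt
from scipy.signal import find_peaks

def multiscale_amplitude(signal, wavelet="db4", level=4, n_peaks=5):
    """Per-sub-band amplitude statistics from a wavelet MRA."""
    # wavedec yields [cA_level, cD_level, ..., cD_1] sub-band signals
    subbands = pywt.wavedec(signal, wavelet, level=level)
    features = []
    for band in subbands:
        spectrum = np.abs(np.fft.rfft(band))
        # Crude stand-in for sinusoidal-model amplitudes:
        # the largest spectral peaks of the sub-band.
        peaks, _ = find_peaks(spectrum)
        if len(peaks) == 0:
            amps = np.zeros(n_peaks)
        else:
            amps = np.sort(spectrum[peaks])[::-1][:n_peaks]
            amps = np.pad(amps, (0, n_peaks - len(amps)))
        features.extend([amps.mean(), amps.std()])
    return np.asarray(features)

# Usage on a synthetic frame (e.g. 32 ms at 16 kHz)
x = np.random.randn(512)
print(multiscale_amplitude(x).shape)
```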

Cited by 46 publications (18 citation statements)
References 52 publications
“…Current methods of emotion recognition mainly involve facial expression recognition [3][4][5][6], speech emotion recognition [7][8][9], gesture expression recognition [10], text recognition [11], physiological pattern recognition, and multimodal emotion recognition [12][13][14][15]. In practical applications, the non-contact method of extracting physiological parameters from facial imaging has attracted special attention.…”
Section: Literature Review
confidence: 99%
“…It is evident from the literature that the combination of speech features, i.e. feature fusion, increases the classification accuracy of the SER system [6,23,28], and it has hence become the most common practice in this field.…”
Section: Continuous Features
confidence: 99%
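The "feature fusion" this statement refers to is, in its simplest form, concatenation of independently computed feature vectors before classification. A minimal sketch follows; the feature names and dimensions are illustrative assumptions, not taken from the cited systems.

```python
# Minimal sketch of feature fusion by concatenation (illustrative).
import numpy as np

def fuse_features(*feature_vectors):
    """Concatenate per-utterance feature vectors into one fused vector."""
    return np.concatenate([np.ravel(f) for f in feature_vectors])

mfcc_stats = np.random.randn(26)      # e.g. mean/std of 13 MFCCs
prosodic = np.random.randn(6)         # e.g. pitch/energy statistics
amplitude_feat = np.random.randn(10)  # e.g. a multiscale amplitude feature
fused = fuse_features(mfcc_stats, prosodic, amplitude_feat)
print(fused.shape)  # (42,)
```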
“…Mel-frequency cepstral coefficients (MFCCs) [11,21,22], linear prediction coefficients (LPCs) [23], relative spectral perceptual linear prediction (RASTA-PLP) [16], and variants of these features such as modified MFCC (M-MFCC) [13] and the fusion of MFCC and short-time energy features with velocity (Δ) and acceleration (Δ + ΔΔ) [23] are some of the well-known spectral features used for speech emotion recognition. Apart from these, log frequency power coefficients (LFPCs) [24], Fourier parameter features [25], time-frequency features with an AMS-GMM mask [26], modulation spectral features [27], and amplitude-based features [28] are some of the variants of spectral features now used in SER analysis.…”
Section: Continuous Features
confidence: 99%
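One of the variants named above, MFCCs fused with their velocity (Δ) and acceleration (ΔΔ) coefficients, is a standard computation. Below is a hedged sketch using librosa's public API on a synthetic stand-in signal; the coefficient count and sample rate are assumptions, not the cited papers' settings.

```python
# Minimal sketch: MFCCs with delta (velocity) and delta-delta
# (acceleration) coefficients, stacked into one feature matrix.
import numpy as np
import librosa

sr = 16000
y = np.random.randn(sr).astype(np.float32)     # stand-in 1 s signal
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
delta = librosa.feature.delta(mfcc)            # velocity (Δ)
delta2 = librosa.feature.delta(mfcc, order=2)  # acceleration (ΔΔ)
features = np.vstack([mfcc, delta, delta2])    # shape (39, n_frames)
print(features.shape)
```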
“…The multiscale amplitude feature (abbreviated as Mul-Amp) is a recently proposed multi-resolution time-domain feature, introduced in 2018. 14 Its multi-resolution is achieved by wavelet packet transformation and subband partitioning.…”
Section: Experiments and Evaluation
confidence: 99%
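The wavelet packet transformation and subband partition mentioned here can be sketched briefly with PyWavelets; the wavelet family and decomposition depth below are assumptions for illustration, not the cited paper's configuration.

```python
# Minimal sketch of a wavelet packet subband partition (illustrative).
import numpy as np
import pywt

x = np.random.randn(1024)  # stand-in speech frame
wp = pywt.WaveletPacket(data=x, wavelet="db4", maxlevel=3)
# At level 3 the packet tree splits the band into 2**3 = 8 equal-width
# sub-bands, ordered from low to high frequency ("freq" ordering).
nodes = wp.get_level(3, order="freq")
subband_energy = [np.sum(node.data ** 2) for node in nodes]
print(len(nodes), subband_energy[:3])
```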
“…Deb and Dandapat extracted a subband amplitude feature by decomposing the speech signal into multiscale frequency bands and applying the Fourier transform. This feature showed good discriminative performance in experiments. 14 However, it used a uniform partition of the frequency bands and therefore cannot reflect the nonlinear frequency resolution required by the psychoacoustic model.…”
Section: Introduction
confidence: 99%
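The contrast this statement draws, between a uniform partition and a psychoacoustically motivated (nonlinear) one, can be made concrete with band-edge computations. The sketch below compares equal-width edges against Mel-scaled edges using the standard Hz-Mel conversion; the band count and maximum frequency are illustrative assumptions.

```python
# Sketch: uniform vs Mel-scaled frequency band edges (illustrative).
import numpy as np

def uniform_band_edges(fmax, n_bands):
    # Equal-width bands, as in a uniform partition.
    return np.linspace(0.0, fmax, n_bands + 1)

def mel_band_edges(fmax, n_bands):
    # Standard Hz <-> Mel conversion; low bands come out narrower.
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mels = np.linspace(0.0, hz_to_mel(fmax), n_bands + 1)
    return mel_to_hz(mels)

print(uniform_band_edges(8000, 8).round(0))  # equal-width bands
print(mel_band_edges(8000, 8).round(0))      # narrow low, wide high
```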