2019
DOI: 10.1016/j.knosys.2019.104943

Posterior-thresholding feature extraction for paralinguistic speech classification

Abstract: The standard approach for handling computational paralinguistic speech tasks is to extract several thousand utterance-level features from the speech excerpts, and use machine learning methods such as Support Vector Machines and Deep Neural Networks (DNNs) for the actual classification task. In contrast, Automatic Speech Recognition handles the speech signal in small, equal-sized parts called frames. Although the speech community has developed techniques for efficient frame classification, these efforts have mo…
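The contrast the abstract draws — variable-length sequences of frame-level features versus one fixed-size utterance-level vector fed to a classifier such as an SVM — can be illustrated with a minimal sketch. This is not the paper's method; it is a generic illustration in plain NumPy, using hypothetical feature dimensions and simple statistical functionals (mean, standard deviation, min, max) as the pooling step:

```python
import numpy as np

def utterance_level_features(frames):
    """Collapse a variable-length sequence of frame-level feature
    vectors (shape: n_frames x n_dims) into one fixed-size
    utterance-level vector by applying statistical functionals
    to each feature dimension."""
    frames = np.asarray(frames, dtype=float)
    functionals = [np.mean, np.std, np.min, np.max]
    return np.concatenate([f(frames, axis=0) for f in functionals])

# Two utterances of different lengths (120 vs. 300 frames, 3
# hypothetical frame-level features each) map to vectors of equal
# length -- the fixed-size input an SVM or DNN classifier expects.
short_utt = utterance_level_features(np.random.randn(120, 3))
long_utt = utterance_level_features(np.random.randn(300, 3))
assert short_utt.shape == long_utt.shape == (12,)
```

Real paralinguistic pipelines apply thousands of such functional/feature combinations, but the pooling principle is the same: the utterance length disappears from the representation.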

Cited by 15 publications (10 citation statements)
References 42 publications (64 reference statements)
“…From the perspective of the speech recognition model, the application of speech signal can be roughly divided into three categories, including vocal print recognition, speech recognition, and emotion recognition [8]. The classifiers for speech recognition tasks include traditional classifiers and deep learning algorithms, involving HMM, Gaussian Mixture Model (GMM), support vector machine (SVM), and extreme learning machine (ELM) [9][10][11].…”
Section: Introduction
confidence: 99%
“…Human emotions can be automatically detected through these smarter systems by using the electroencephalography (EEG) signals, facial expression, gesture recognition, and speech signals. One of the more effective uses of speech signals is speech emotion recognition (SER), which is used to determine the emotional state of his/her speech [1–3]. Speech signals show the physiological expression that conveys the message or information about human emotions and provide an efficient communication platform among the human–computer interaction (HCI) [4].…”
Section: Introduction
confidence: 99%
“…Digital speech processing aims at using digital computing technology to process speech signals for better understanding and increased efficiency of interaction and productive connection with speech activities. Let's look at the history of the various technologies in recent decades for speech recognition [11][12][13][14].…”
Section: Introduction
confidence: 99%
“…A conversation is being prepared to become an evolving technology that enables individuals to communicate with machines. From this point forward, it was possible to develop speech recognition software [12][13][14][15][16][17]. In 1922, the major production began.…”
Section: Introduction
confidence: 99%