Abstract: Speech enhancement is used in almost all modern communication systems. When speech is transmitted, its quality may degrade due to interference in the environment it passes through. Interference that may affect speech quality in transit includes acoustic additive noise, acoustic reverberation, and white Gaussian noise. This paper surveys the techniques that have appeared in the literature to enhance the speech signal. Methods used include the Wiener filter, st…
“…It also involves the computation of short-time Fourier transform (STFT). The technique minimizes the MSE between the approximated signal magnitude spectrum D^(w) and the original signal magnitude spectrum D(w) [37], [38]. The sample signal wave plots for anger emotion can be visualized from the following graphs.…”
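The quantity minimized in the quoted passage — the MSE between an approximated magnitude spectrum D^(w) and the original magnitude spectrum D(w), computed over the STFT — can be sketched as follows. The sine-plus-noise signals and STFT parameters here are stand-ins for illustration, not the cited paper's data.

```python
import numpy as np
from scipy.signal import stft

fs = 16000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440 * t)            # stand-in "original" signal
rng = np.random.default_rng(0)
noisy = clean + 0.1 * rng.standard_normal(fs)  # stand-in "approximated" source

# Short-time Fourier transform of both signals
_, _, D = stft(clean, fs=fs, nperseg=512)
_, _, D_hat = stft(noisy, fs=fs, nperseg=512)

# MSE between the magnitude spectra |D^(w)| and |D(w)|
mse = np.mean((np.abs(D_hat) - np.abs(D)) ** 2)
print(f"magnitude-spectrum MSE: {mse:.6f}")
```

In practice the enhancement algorithm searches for the estimate D^(w) that makes this error as small as possible frame by frame.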
Driven by the challenge of making human-machine interaction more natural and productive, speech emotion recognition has become a prominent area of research. The reliability and success of such emotion recognition depend heavily on the feature extraction and selection processes. The feature extraction phase plays an important role in exploring and distinguishing audio content. The extracted features should also be robust to a range of disturbances and reliable enough for an adequate classification system. This paper focuses on three main components of a Speech Emotion Recognition (SER) process. The first is an optimal feature extraction method for a Punjabi SER system. The second is an appropriate feature selection method that selects effective features from those extracted in the first step and removes redundant ones, to improve emotion recognition performance. The third is the classification model used for emotion recognition. The scope of this paper is therefore to explain the three main steps of a Punjabi SER system: feature extraction, feature selection, and emotion recognition with a classifier. Results are calculated and compared for a number of feature-set combinations, with and without the feature selection process. A total of 10 experiments are carried out, and performance metrics such as precision, recall, F1-score, and accuracy are used to report the results.
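The redundancy-removal step described in this abstract can be sketched as a simple correlation-based filter: drop any feature that is highly correlated with one already kept. The threshold and the tiny synthetic feature matrix below are illustrative assumptions, not the paper's actual selection method.

```python
import numpy as np

def drop_redundant(X, threshold=0.9):
    """Greedy filter: drop any feature column whose absolute Pearson
    correlation with an already-kept column exceeds `threshold`."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return keep

rng = np.random.default_rng(4)
a = rng.standard_normal(200)
b = rng.standard_normal(200)
# Column 1 is a near-copy of column 0, so it is redundant
X = np.column_stack([a, a + 0.01 * rng.standard_normal(200), b])
print(drop_redundant(X))
```

Filter-style selectors like this are cheap but ignore the classifier; wrapper methods that score feature subsets by recognition accuracy are the usual heavier alternative.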
“…With the help of an adaptation algorithm, ANC minimizes the mean square error value of the output. It generates an output which is the best approximation of the anticipated signal in the minimum mean square error sense (Taha et al., 2018). ANC removes or suppresses a noisy signal by using adaptive filters and adjusting their parameters according to an optimization algorithm, as in Fig.…”
Section: Noise Cancellation Using Adaptive Filters
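A minimal sketch of the adaptive noise cancellation idea in the snippet above — an adaptive filter whose coefficients are updated (here by LMS) to minimize the mean-square error between the primary input and the filter's noise estimate. The signals, channel taps, filter length, and step size are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
speech = np.sin(2 * np.pi * 0.01 * np.arange(n))   # stand-in for clean speech
noise_ref = rng.standard_normal(n)                 # reference noise pickup
# Primary channel: speech plus a filtered version of the reference noise
noise_in_primary = np.convolve(noise_ref, [0.6, 0.3, 0.1], mode="full")[:n]
primary = speech + noise_in_primary

# LMS adaptive filter: w is adapted to minimize the mean-square error
taps, mu = 8, 0.01
w = np.zeros(taps)
enhanced = np.zeros(n)
for i in range(taps, n):
    x = noise_ref[i - taps:i][::-1]  # most recent reference samples first
    y = w @ x                        # filter output: estimate of the noise
    e = primary[i] - y               # error signal = enhanced speech sample
    w += 2 * mu * e * x              # LMS coefficient update
    enhanced[i] = e

err_before = np.mean((primary - speech) ** 2)
err_after = np.mean((enhanced[taps:] - speech[taps:]) ** 2)
print(err_before, err_after)
```

Because the reference input is correlated with the noise but not with the speech, driving the error power down removes the noise while leaving the speech in the error signal.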
Speech enhancement is used in almost all modern communication systems, because the quality of speech is degraded by environmental interference such as acoustic additive noise, acoustic reverberation, and white Gaussian noise. This paper explores the potential of different benchmark optimization techniques for enhancing the speech signal, by fine-tuning the coefficients of a diverse set of adaptive filters for noise suppression in speech signals. We consider Particle Swarm Optimization (PSO) and its variants in conjunction with the Adaptive Noise Cancellation (ANC) approach for delivering dual speech enhancement. Comparative simulation results demonstrate the advantage of an optimized-coefficient ANC over a fixed one. Experiments are performed at different signal-to-noise ratios (SNRs), using two benchmark datasets: the NOIZEUS and Arabic datasets. The performance of the proposed algorithms is evaluated by maximising the perceptual evaluation of speech quality (PESQ) and comparing against the audio-only Wiener Filter (AW) and the Adaptive PSO for dual channel (APSOforDual) algorithms.
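The coefficient-tuning idea in this abstract — searching filter coefficients with PSO — can be sketched as below. The paper maximises PESQ, which requires a reference implementation; this sketch substitutes negative MSE against a clean reference as a stand-in fitness, and the signals, swarm size, and PSO parameters are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, taps = 2000, 4
clean = np.sin(2 * np.pi * 0.02 * np.arange(n))
noisy = clean + 0.3 * rng.standard_normal(n)

def fitness(w):
    """Stand-in fitness: the paper maximises PESQ; here we use -MSE
    against the clean reference as a simple proxy."""
    filtered = np.convolve(noisy, w, mode="full")[:n]
    return -np.mean((filtered - clean) ** 2)

# Minimal PSO: each particle is a candidate FIR coefficient vector
particles = rng.uniform(-1, 1, (20, taps))
vel = np.zeros_like(particles)
pbest = particles.copy()
pbest_f = np.array([fitness(p) for p in particles])
gbest = pbest[np.argmax(pbest_f)].copy()

for _ in range(60):
    r1, r2 = rng.random((2, 20, taps))
    # Inertia 0.7, cognitive/social weights 1.5 (common default choices)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - particles) + 1.5 * r2 * (gbest - particles)
    particles += vel
    f = np.array([fitness(p) for p in particles])
    improved = f > pbest_f
    pbest[improved], pbest_f[improved] = particles[improved], f[improved]
    gbest = pbest[np.argmax(pbest_f)].copy()

print("best fitness:", fitness(gbest))
```

Swapping the proxy fitness for a real PESQ score turns this into the kind of perceptually guided coefficient search the abstract describes.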
“…Therefore, for this environment, speech enhancement or noise removal is an essential module. Speech enhancement, or de-noising speech, is closely related to speech restoration, because it reconstructs and restores the signal after degradation of the original clean signal [4].…”
Speech is one of the most natural and fundamental means of human-computer interaction, and the state of human emotion is important in various domains. Recognition of human emotion has become essential in real-world applications, but the speech signal is corrupted by various noises from real-world environments, and recognition performance is reduced by these additional noise signals. This paper therefore focuses on developing an emotion recognition system for noisy signals in real-world environments. Minimum Mean Square Error (MMSE) is used as the enhancement technique, Mel-frequency Cepstral Coefficient (MFCC) features are extracted from the speech signals, and state-of-the-art classifiers are used to recognize the emotional state of the signals. To show the robustness of the proposed system, experiments are carried out on the standard speech emotion database IEMOCAP, under SNR levels from 0 dB to 15 dB of real-world background noise. Results are evaluated for seven emotions, and comparisons are presented and discussed across classifiers and emotions. The results indicate which classifier is best for which emotion in real-world environments, especially in the noisiest conditions, such as sporting events.
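Evaluating under SNR levels from 0 dB to 15 dB, as this abstract describes, requires mixing noise into clean speech at a controlled target SNR. A minimal sketch, with stand-in signals in place of IEMOCAP audio:

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean/noise power ratio equals snr_db, then mix."""
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    target_p_noise = p_clean / (10 ** (snr_db / 10))
    return clean + noise * np.sqrt(target_p_noise / p_noise)

rng = np.random.default_rng(3)
clean = np.sin(2 * np.pi * 0.01 * np.arange(16000))  # stand-in utterance
noise = rng.standard_normal(16000)                   # stand-in background noise

for snr in (0, 5, 10, 15):
    noisy = mix_at_snr(clean, noise, snr)
    achieved = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
    print(f"target {snr} dB -> achieved {achieved:.2f} dB")
```

The same noisy copies would then be passed through the MMSE enhancer and MFCC extraction before classification.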