“…In this section, we present the forward blind source separation (BSS) structure and we give its full formulation and optimal solutions in the time domain. This structure is intensively used in acoustic noise cancellation [10,16–19]. The two-channel forward BSS structure is presented in Figure 2. At the output of this structure, the…”
Recently, the acoustic noise reduction problem has been treated by two-channel forward blind source separation (BSS) techniques combined with the normalized least mean square algorithm (T-FNLMS). The T-FNLMS algorithm shows good performance on two-channel convolutive, dispersive mixtures. In this paper, we propose a new BSS structure based on the two-channel sparse normalized least mean square algorithm (TS-NLMS). The TS-NLMS algorithm is designed specifically for the case where the convolutive mixing system is characterized by sparse impulse responses. To confirm the good performance of the proposed algorithm, extensive acoustic noise reduction experiments are carried out.
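The abstract above builds on the normalized least mean square (NLMS) update. As a point of reference, here is a minimal sketch of the standard single-filter NLMS recursion, not the two-channel T-FNLMS or sparse TS-NLMS variants of the paper; the 16-tap sparse filter `h`, the step size, and the signal length are all illustrative assumptions.

```python
import numpy as np

def nlms_step(w, x, d, mu=0.5, eps=1e-8):
    """One normalized LMS update: adapt filter w so that w @ x tracks d."""
    e = d - w @ x                        # a-priori estimation error
    w = w + mu * e * x / (x @ x + eps)   # step size normalized by input energy
    return w, e

# Identify a sparse 16-tap impulse response, as in a convolutive mixing path.
rng = np.random.default_rng(0)
h = np.zeros(16)
h[0], h[5] = 1.0, -0.5                   # sparse "mixing" filter (illustrative)
x_sig = rng.standard_normal(4000)
w = np.zeros(16)
for n in range(16, len(x_sig)):
    x = x_sig[n - 16:n][::-1]            # most recent sample first
    w, e = nlms_step(w, x, h @ x)        # desired signal = filtered input
```

The energy normalization `x @ x` is what distinguishes NLMS from plain LMS: it makes the effective step size independent of the input power, which matters for nonstationary signals such as speech. Sparse variants such as the TS-NLMS described above additionally exploit the fact that most taps of `h` are (near) zero.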
“…Nowadays, speech enhancement has been widely used in the fields of speech analysis, speech recognition, speech communication, and so forth. The aim of speech enhancement is to recover speech and improve its quality and intelligibility via different techniques and algorithms, such as unsupervised methods including spectral subtraction [1,2], Wiener filtering [3], statistical model-based estimation [4,5], the subband forward algorithm [6], and the subspace method [5,7]. Generally, these unsupervised methods are based on statistical signal processing and typically work in the frequency domain.…”
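Of the unsupervised methods listed above, spectral subtraction is the simplest to illustrate. The sketch below is a textbook magnitude-subtraction variant with Hann-windowed overlap-add, not the exact algorithm of any cited reference; the frame size, spectral floor, and oracle noise-magnitude estimate are illustrative assumptions.

```python
import numpy as np

def spectral_subtract(noisy, noise_mag, frame=256, hop=128, floor=0.05):
    """Magnitude spectral subtraction: subtract a noise magnitude estimate
    per frequency bin, keep the noisy phase, overlap-add the frames back."""
    win = np.hanning(frame)
    out = np.zeros(len(noisy))
    norm = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame + 1, hop):
        spec = np.fft.rfft(noisy[start:start + frame] * win)
        mag, phase = np.abs(spec), np.angle(spec)
        clean_mag = np.maximum(mag - noise_mag, floor * mag)  # spectral floor
        seg = np.fft.irfft(clean_mag * np.exp(1j * phase), frame)
        out[start:start + frame] += seg * win
        norm[start:start + frame] += win ** 2
    return out / np.maximum(norm, 1e-8)

# Illustrative use: a sinusoid in white noise, oracle noise magnitude estimate.
rng = np.random.default_rng(1)
clean = np.sin(2 * np.pi * 0.05 * np.arange(8192))
noise = 0.3 * rng.standard_normal(8192)
win = np.hanning(256)
noise_mag = np.mean([np.abs(np.fft.rfft(noise[s:s + 256] * win))
                     for s in range(0, 8192 - 256, 128)], axis=0)
enhanced = spectral_subtract(clean + noise, noise_mag)
```

Keeping the noisy phase while denoising only the magnitude is the classic design choice here, and it is exactly the limitation that the phase-aware methods discussed below try to address.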
Recently, supervised learning methods, especially deep neural network (DNN)-based methods, have shown promising performance in single-channel speech enhancement. Generally, these approaches extract acoustic features directly from the noisy speech to train a magnitude-aware target. In this paper, we propose to extract acoustic features not only from the noisy speech but also from the pre-estimated speech, noise, and phase separately, and then fuse them into a new complementary feature in order to obtain a more discriminative acoustic representation. In addition to learning a magnitude-aware target, we also utilize the fused feature to learn a phase-aware target, thereby further improving the accuracy of the recovered speech. We conduct extensive experiments, including performance comparisons with typical existing methods, generalization evaluation on unseen noise, an ablation study, and subjective tests with human listeners, to demonstrate the feasibility and effectiveness of the proposed method. Experimental results show that the proposed method improves the quality and intelligibility of the reconstructed speech.
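The abstract does not specify how the complementary feature is assembled, so the sketch below is a hypothetical fusion: per-frame concatenation of log-magnitude features from the noisy input and the pre-estimates, with the pre-estimated phase encoded as a cosine/sine pair. The function name, feature choices, and dimensions are all assumptions for illustration.

```python
import numpy as np

def fuse_features(noisy_mag, est_speech_mag, est_noise_mag, est_phase, eps=1e-8):
    """Hypothetical fusion: concatenate per-frame features along the last axis."""
    return np.concatenate([
        np.log(noisy_mag + eps),        # features from the noisy speech
        np.log(est_speech_mag + eps),   # from the pre-estimated speech
        np.log(est_noise_mag + eps),    # from the pre-estimated noise
        np.cos(est_phase),              # phase encoded on the unit circle,
        np.sin(est_phase),              # avoiding the 2*pi wrap discontinuity
    ], axis=-1)

frames, bins = 100, 129                 # e.g. a 256-point STFT gives 129 bins
rng = np.random.default_rng(2)
feat = fuse_features(rng.random((frames, bins)) + 0.1,
                     rng.random((frames, bins)) + 0.1,
                     rng.random((frames, bins)) + 0.1,
                     rng.uniform(-np.pi, np.pi, (frames, bins)))
```

Encoding phase as (cos, sin) rather than the raw angle is a common trick when a network must regress a phase-aware target, since the raw angle is discontinuous at ±π.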
“…The performances are evaluated by considering an IEEE corpus, the GRID audio-visual corpus, and different types of noises. The proposed approach significantly improves objective speech quality and intelligibility and outperforms the conventional STFT-NMF, DWPT-NMF, and DNN-IRM methods. Keywords: Dual-tree complex wavelet transform (DTCWT); discrete wavelet packet transform (DWPT); stationary wavelet transform (SWT); speech enhancement (SE) … estimation [5], sparseness and temporal gradient regularization method [6], Wiener filtering [7], subband forward algorithm [8], and subspace method [9]. These methods consist of two parts: noise tracking and signal gain estimation.…”
In this paper, we propose a novel speech enhancement method based on the dual-tree complex wavelet transform (DTCWT) and nonnegative matrix factorization (NMF) that exploits a subband smooth ratio mask (ssRM) through a joint learning process. The discrete wavelet packet transform (DWPT) suffers from a lack of shift invariance, due to downsampling after the filtering process, which results in a reconstructed signal with significant noise. The redundant stationary wavelet transform (SWT) can solve this shift-invariance problem. In this respect, we use the efficient DTCWT, which offers shift invariance with limited redundancy, and calculate the ratio masks (RMs) between the clean training speech and the noisy speech (i.e., training noise mixed with clean speech). We also compute RMs between the noise and the noisy speech, and then learn both RMs together with the corresponding clean training speech and noise. An auto-regressive moving average (ARMA) filtering process is applied to the previously generated matrices before NMF for smooth decomposition. The ssRM is proposed to exploit the advantage of jointly using the standard ratio mask (sRM) and the square-root ratio mask (srRM). In short, the DTCWT produces a set of subband signals from the time-domain signal. Subsequently, a framing scheme is applied to each subband signal to form matrices, and the RMs are calculated before concatenation with the previously generated matrices. The ARMA filter is applied to the nonnegative matrix, which is formed by taking absolute values. Through the ssRM, speech components are detected using NMF in each newly formed matrix. Finally, the enhanced speech signal is obtained via the inverse DTCWT (IDTCWT). The performances are evaluated by considering an IEEE corpus, the GRID audio-visual corpus, and different types of noises. The proposed approach significantly improves objective speech quality and intelligibility and outperforms the conventional STFT-NMF, DWPT-NMF, and DNN-IRM methods.
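The sRM and srRM mentioned above have standard definitions, sketched below. The excerpt does not give the paper's exact ssRM combination rule, so the simple average shown at the end is an illustrative placeholder, not the authors' formula.

```python
import numpy as np

def ratio_masks(speech_mag, noise_mag, eps=1e-8):
    """Standard ratio mask (sRM) and square-root ratio mask (srRM),
    computed per time-frequency (or time-subband) coefficient."""
    srm = speech_mag / (speech_mag + noise_mag + eps)
    srrm = np.sqrt(speech_mag ** 2 / (speech_mag ** 2 + noise_mag ** 2 + eps))
    return srm, srrm

# Three coefficients: speech-dominated, balanced, noise-dominated.
speech = np.array([5.0, 1.0, 0.1])
noise = np.array([0.1, 1.0, 5.0])
srm, srrm = ratio_masks(speech, noise)
blend = 0.5 * (srm + srrm)   # placeholder combination, NOT the paper's ssRM
```

Both masks lie in [0, 1] and are applied multiplicatively to the noisy coefficients; the srRM preserves signal energy rather than amplitude, which is why it is less aggressive than the sRM in speech-dominated regions.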