Stereophonic channel decorrelation using a binaural masking model

Yang, Hefei; Wang, Jie; Zheng, Chengshi; Li, Xiaodong

doi:10.1016/j.apacoust.2016.03.016

Cited by 3 publications

(6 citation statements)

References 34 publications


“…The RIRs are generated using the image method [23]. Three sizes of near-end and far-end rooms are selected: [4, 3, 3] m, [6, 4, 3] m, and [8, 7, 3] m. The reverberation time is set to 0.3 s, 0.6 s, and 0.9 s. The distances between the loudspeakers and between the microphones are set to 2.0 m and 0.4 m, respectively. The distance between each speaker position and the center of the microphones is set to be [0.3, 0.7, 1.1]0.7 m. The near-end speech is mixed with the echo signals at a signal-to-echo ratio (SER) randomly chosen from [0, 5, 10, 15] dB.…”

Section: Methods (mentioning)

confidence: 99%
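The SER mixing step described in the statement above can be sketched as follows. This is a minimal illustration, not the citing authors' code: the function names `mean_power` and `mix_at_ser` and the toy signals are assumptions, and SER is taken to mean the ratio of near-end speech power to echo power in dB.

```python
import math
import random


def mean_power(signal):
    """Average power of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_ser(near_end, echo, ser_db):
    """Scale the echo so that 10*log10(P_near / P_echo) equals ser_db, then add it.

    Hypothetical helper: returns the mixture and the scaled echo.
    """
    target_echo_power = mean_power(near_end) / (10.0 ** (ser_db / 10.0))
    gain = math.sqrt(target_echo_power / mean_power(echo))
    scaled_echo = [gain * e for e in echo]
    mixture = [n + e for n, e in zip(near_end, scaled_echo)]
    return mixture, scaled_echo


# Toy signals standing in for near-end speech and the echo (assumption).
rng = random.Random(0)
near_end = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
echo = [rng.uniform(-0.2, 0.2) for _ in range(1000)]

# SER drawn from the set quoted in the statement.
ser_db = rng.choice([0, 5, 10, 15])
mixture, scaled_echo = mix_at_ser(near_end, echo, ser_db)
achieved_ser_db = 10.0 * math.log10(mean_power(near_end) / mean_power(scaled_echo))
```

Scaling the echo rather than the speech keeps the near-end level fixed across conditions, which is one common choice; the quoted statement does not say which signal is scaled.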

“…Then the 320-point STFT is applied, leading to a 161-dimensional spectral feature in each frame. The encoders of the three modules have five layers, with [8, 16, 32, 64, 128] channels for each layer in turn. Accordingly, the numbers of channels of the decoders are [64, 32, 16, 8, 1].…”

Section: Methods (mentioning)

confidence: 99%
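The numbers in the statement above follow from two simple relations, sketched here under stated assumptions: a real-valued frame of length N has N/2 + 1 non-redundant frequency bins, and the quoted decoder widths happen to equal the encoder widths mirrored with a single-channel output. The helper names are hypothetical, and the mirroring rule is only an observation about the quoted lists, not a description of the network itself.

```python
def one_sided_bins(n_fft):
    """Number of non-redundant frequency bins for a real-valued n_fft-point frame."""
    return n_fft // 2 + 1


def mirrored_decoder_channels(encoder_channels, out_channels=1):
    """Reverse the encoder widths, drop the bottleneck, and end with out_channels."""
    return encoder_channels[-2::-1] + [out_channels]


feature_dim = one_sided_bins(320)
decoder = mirrored_decoder_channels([8, 16, 32, 64, 128])
```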

“…Romoli et al. [7] utilized the missing fundamental phenomenon to decorrelate the stereophonic channels by suppressing the fundamental frequency component of one far-end signal frame by frame. In order to achieve better performance, a hybrid decorrelation method has been proposed in [8], where an improved sinusoidal phase modulation was applied in the high-frequency band and a pitch-driven sinusoidal injection scheme with a simplified binaural masking model was adopted in the low-frequency band. Although these decorrelation methods can mitigate the nonuniqueness to a certain extent, almost all of them will degrade the audio quality and stereophonic spatial perception to some degree.…”

Section: Introduction (mentioning)

confidence: 99%


A deep complex multi-frame filtering network for stereophonic acoustic echo cancellation

Cheng, Zheng, Li, et al. 2022

Preprint

Self Cite


In hands-free communication systems, the coupling between the loudspeaker and the microphone generates an echo signal, which can severely impair the quality of communication. Meanwhile, various types of noise in the communication environment further degrade speech quality and intelligibility. It is hard to extract the near-end signal from the microphone input signal within one step, especially at low signal-to-noise ratios. In this paper, we propose a multi-stage approach to address this issue. On the one hand, we decompose the echo cancellation into two stages: a linear echo cancellation module and a residual echo suppression module. A multi-frame filtering strategy is introduced to improve linear echo estimation by utilizing more inter-frame information. On the other hand, we decouple the complex spectral mapping into magnitude estimation and complex spectrum refinement. Experimental results demonstrate that our proposed approach achieves state-of-the-art performance over previous advanced algorithms under various conditions.


“…Such a phenomenon is called acoustic masking [45]. A high-SPL sound, called the masker, makes a low-SPL sound, called the maskee, inaudible.…”

Section: Principles Of Psychoacoustic Masking (mentioning)

confidence: 99%

“…ITD is more useful for low-frequency sounds. On average, the maximum ITD is about 690 μs, when the sound source is close to one ear [45]. However, humans' heads are not the same size.…”

Section: Binaural Algorithms (mentioning)

confidence: 99%
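The dependence of the interaural time difference (ITD) on head size, noted in the statement above, can be illustrated with the Woodworth spherical-head approximation, ITD = (a / c)(θ + sin θ). This formula is a common textbook model swapped in for illustration and is not taken from the cited work; the default head radius of 0.0875 m and sound speed of 343 m/s are assumed values.

```python
import math


def woodworth_itd(azimuth_rad, head_radius_m=0.0875, speed_of_sound=343.0):
    """ITD in seconds for a rigid spherical head (Woodworth approximation).

    Valid for azimuths between 0 (straight ahead) and pi/2 (directly to one side).
    """
    return (head_radius_m / speed_of_sound) * (azimuth_rad + math.sin(azimuth_rad))


# ITD for a source directly to one side, for an average and a larger head radius.
itd_average_head = woodworth_itd(math.pi / 2)
itd_larger_head = woodworth_itd(math.pi / 2, head_radius_m=0.095)
```

Under this model the ITD scales linearly with head radius, which is why a single maximum value cannot fit every listener.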

Integrated active noise control and sound quality enhancement system for hearing devices

Belyi


Deep learning-based stereophonic acoustic echo suppression without decorrelation

Cheng, Peng, et al. 2021

The Journal of the Acoustical Society of America


Traditional stereophonic acoustic echo cancellation algorithms need to estimate acoustic echo paths from stereo loudspeakers to a microphone, which often suffers from the nonuniqueness problem caused by a high correlation between the two far-end signals of these stereo loudspeakers. Many decorrelation methods have already been proposed to mitigate this problem. However, these methods may reduce the audio quality and/or stereophonic spatial perception. This paper proposes to use a convolutional recurrent network (CRN) to suppress the stereophonic echo components by estimating a nonlinear gain, which is then multiplied by the complex spectrum of the microphone signal to obtain the estimated near-end speech without a decorrelation procedure. The CRN includes an encoder-decoder module and a two-layer gated recurrent network module, which can take advantage of the feature extraction capability of convolutional neural networks and the temporal modeling capability of recurrent neural networks simultaneously. The magnitude spectra of the two far-end signals are used as input features directly without any decorrelation preprocessing and, thus, both the audio quality and stereophonic spatial perception can be maintained. The experimental results in both simulated and real acoustic environments show that the proposed algorithm outperforms traditional algorithms such as the normalized least-mean-square and Wiener algorithms, especially in situations of low signal-to-echo ratio and high reverberation time RT60.
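The final step described in the abstract, multiplying an estimated gain by the complex spectrum of the microphone signal, can be sketched as below. Only the masking step is shown: the gain values are made-up placeholders standing in for the network output, and `apply_gain` is a hypothetical helper, not the authors' implementation.

```python
import cmath


def apply_gain(mic_spectrum, gain):
    """Multiply each complex bin by its real-valued gain.

    A real, non-negative gain scales the magnitude and leaves the phase unchanged.
    """
    return [g * y for g, y in zip(gain, mic_spectrum)]


# One frame of a microphone spectrum (toy values) and a placeholder gain per bin.
mic_frame = [3 + 4j, 1 - 1j, -2 + 0.5j]
gain = [0.5, 1.0, 0.1]

estimated_near_end = apply_gain(mic_frame, gain)
magnitudes = [abs(s) for s in estimated_near_end]
phases = [cmath.phase(s) for s in estimated_near_end]
```

Because the gain is real-valued, the estimate reuses the microphone phase; that is a property of this kind of masking in general.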
