Joint Mixing Vector and Binaural Model Based Stereo Source Separation

Alinaghi, Atiyeh; Jackson, Philip J. B.; Liu, Qingju; Wang, Wenwu

doi:10.1109/taslp.2014.2320637

Cited by 34 publications

(87 citation statements)

References 33 publications

“…For binaural recordings, the state-of-the-art blind source separation (BSS) methods [11,12,13] using interaural level difference (ILD) and interaural phase difference (IPD) have demonstrated good performance for two-channel source separation. These BSS methods can largely preserve binaural cues, as well as maintain the energy of each sound source, which is vital for speech intelligibility in noise.…”

Section: Introduction (mentioning)

confidence: 99%
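
The two cues named in the statement above can be illustrated with a minimal sketch. It assumes each channel has already been transformed to the time-frequency domain and works on one complex bin at a time. The function name interaural_cues and the eps guard are hypothetical and are not taken from the cited papers. ILD is computed as the level ratio in decibels and IPD as the phase of the left bin relative to the right.

```python
import cmath
import math


def interaural_cues(left_bin, right_bin, eps=1e-12):
    """Return (ILD in dB, IPD in radians) for one time-frequency bin.

    left_bin, right_bin: complex spectral values of the two channels.
    eps: small constant (an assumption of this sketch) that avoids
         division by zero and log of zero in silent bins.
    """
    left_bin = complex(left_bin)
    right_bin = complex(right_bin)
    # Level difference: ratio of magnitudes, expressed in decibels.
    ild_db = 20.0 * math.log10((abs(left_bin) + eps) / (abs(right_bin) + eps))
    # Phase difference: angle of left relative to right, wrapped to (-pi, pi].
    ipd_rad = cmath.phase(left_bin * right_bin.conjugate())
    return ild_db, ipd_rad


# Toy bin: left channel twice as strong and leading by 0.3 rad.
left = 2.0 * cmath.exp(1j * 0.5)
right = 1.0 * cmath.exp(1j * 0.2)
ild, ipd = interaural_cues(left, right)
print(round(ild, 2), round(ipd, 2))
```

Because the IPD is wrapped, it becomes ambiguous at higher frequencies, which is one reason separation methods of this kind typically model the cues per frequency band rather than globally.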

Predicting Binaural Speech Intelligibility from Signals Estimated by a Blind Source Separation Algorithm

Liu, Tang, Jackson et al. 2016

Interspeech 2016

State-of-the-art binaural objective intelligibility measures (OIMs) require individual source signals for making intelligibility predictions, limiting their usability in real-time online operations. This limitation may be addressed by a blind source separation (BSS) process, which is able to extract the underlying sources from a mixture. In this study, a speech source is presented with either a stationary noise masker or a fluctuating noise masker whose azimuth varies in a horizontal plane, at two speech-to-noise ratios (SNRs). Three binaural OIMs are used to predict speech intelligibility from the signals separated by a BSS algorithm. The model predictions are compared with listeners' word identification rate in a perceptual listening experiment. The results suggest that with SNR compensation to the BSS-separated speech signal, the OIMs can maintain their predictive power for individual maskers compared to their performance measured from the direct signals. It also reveals that the errors in SNR between the estimated signals are not the only factors that decrease the predictive accuracy of the OIMs with the separated signals. Artefacts or distortions on the estimated signals caused by the BSS algorithm may also be concerns.

“…1 illustrates the framework of the proposed system. A BSS algorithm [11,12,13] is applied to extract both the target and masker signals. To implement real-time source separation and intelligibility prediction, the separation model is trained offline.…”

Section: Introduction (mentioning)

confidence: 99%

“…Model-based blind source separation for exactly determined and underdetermined speech mixtures such as [1], [7], [8], [9], are more recent examples of applications in speech analysis involving frequency-specific GMMs. These methods have gained significant popularity due to their simple model-based approach for integration of cues.…”

mentioning

confidence: 99%

“…More generally, this is due to the absence of an explicit model for reverberation. In addition to this, the frequency domain GMM in these algorithms, [1], [7], [8], relies on the assumption of the cues being independent. As noted in [8, sec.…”

mentioning

confidence: 99%

Bootstrap Averaging for Model-Based Source Separation in Reverberant Conditions

Chandna, Wang 2018

IEEE/ACM Trans. Audio Speech Lang. Process.

Recently proposed model-based methods use time-frequency (T-F) masking for source separation, where the T-F masks are derived from various cues described by a frequency domain Gaussian Mixture Model (GMM). These methods work well for separating mixtures recorded in low-to-medium level of reverberation, however, their performance degrades as the level of reverberation is increased. We note that the relatively poor performance of these methods under reverberant conditions can be attributed to the high variance of the frequency-dependent GMM parameter estimates. To address this limitation, a novel bootstrap-based approach is proposed to improve the accuracy of expectation maximization (EM) estimates of a frequency-dependent GMM based on an a priori chosen initialization scheme. It is shown how the proposed technique allows us to construct time-frequency masks which lead to improved model-based source separation for reverberant speech mixtures. Experiments and analysis are performed on speech mixtures formed using real room-recorded impulse responses.

A New Sparse Blind Source Separation Method for Determined Linear Convolutive Mixtures in Time-Frequency Domain

Bella, Saylani 2020

Lecture Notes in Computer Science

This paper presents a new Blind Source Separation method for linear convolutive mixtures, which exploits the sparsity of source signals in the time-frequency domain. This method especially brings a solution to the artifacts problem that affects the quality of signals separated by existing time-frequency methods. These artifacts are in fact introduced by a time-frequency masking operation, used by all these methods. Indeed, by focusing on the case of determined mixtures, we show that this problem can be solved with much less restrictive sparsity assumptions than those of existing methods. Test results show the superiority of our new proposed method over existing ones based on time-frequency masking.
