2012
DOI: 10.1016/j.sigpro.2011.12.022

A tractable framework for estimating and combining spectral source models for audio source separation

Abstract: The underdetermined blind audio source separation (BSS) problem is often addressed in the time-frequency (TF) domain assuming that each TF point is modeled as an independent random variable with sparse distribution. On the other hand, methods based on structured spectral models, such as the Spectral Gaussian Scaled Mixture Models (Spectral-GSMMs) or Spectral Non-negative Matrix Factorization models, perform better because they exploit the statistical diversity of audio source spectrograms,…

citations
Cited by 8 publications
(8 citation statements)
references
References 39 publications
(57 reference statements)
“…The results support the emergence of source separation systems exploiting advanced source models accounting for the source spectra in the case of audio source separation [22,23,24,30] or for signaling pathway information in the case of biomedical source separation [34]. Nevertheless, more conventional methods based on frequency-domain ICA or SCA still perform best on live audio recordings of many sources and/or background noise [25,27,28,29].…”
Section: Remaining Challenges (citation type: supporting)
confidence: 54%
“…• Performance drops on 4-channel mixtures of 4 sources, for which the best 4-channel 2-source separation method [27] achieves a SIR of 3 dB only, and on professionally produced music recordings, for which the best method [30] based on the aforementioned variance modeling framework provided a SIR of 9 dB. This suggests that performance does not depend so much whether the mixture is determined or overdetermined but rather on the number of sources itself, since a larger number of sources makes it more difficult to achieve accurate source localization, which is a prerequisite in most source separation methods.…”
Section: Current Performance On the Other Audio Datasets (citation type: mentioning)
confidence: 99%
“…In this case, the separation task is underdetermined, and can only be solved by making some assumptions about the sources. These may include sparsity, non-negativity and independence, or may take the form of structured spectral models like NMF models [60], PLCA models [6], spectral Gaussian scaled mixture models (Spectral-GSMMs) [2] or the source-filter model for sound production [62]. Further constraints such as temporal continuity or harmonicity can be employed together with spectral models.…”
Section: Joint Transcription and Source Separation (citation type: mentioning)
confidence: 99%
“…Also, the latter is crucial so as to account for the well-separated regions where artifacts and interferences still remain from the previous separation step. To fulfil this challenge, we exploit uncertainty-based learning [12,13] where source models can be learned from the source estimates, while taking into account separation errors described by their variances. The principle is summarized as follow.…”
Section: Interactive Parameter Update Exploiting T-F Annotations (citation type: mentioning)
confidence: 99%
“…The principle is summarized as follow. Within the above-mentioned Gaussian assumption, the posterior of j-th source writes [12] $p(s_{j,fn} \mid x_{fn}; \theta) = \mathcal{N}_c(s_{j,fn};\, \hat{s}_{j,fn},\, v$…”
Section: Interactive Parameter Update Exploiting T-F Annotations (citation type: mentioning)
confidence: 99%