The scope of this work is the separation of N sources from M linear mixtures when the underlying system is underdetermined, that is, when M < N. If the input distribution is sparse, the mixing matrix can be estimated either by external optimization or by clustering and, given the mixing matrix, a minimal ℓ1-norm representation of the sources can be obtained by solving a low-dimensional linear programming problem for each of the data points. Yet, when the signals per se do not satisfy this assumption, sparsity can still be achieved by realizing the separation in a sparser transformed domain. The approach is illustrated here for M = 2. In this case we estimate both the number of sources and the mixing matrix by the maxima of a potential function along the circle of unit length, and we obtain the minimal ℓ1-norm representation of each data point by a linear combination of the pair of basis vectors that enclose it. Several experiments with music and speech signals show that their time-domain representation is not sparse enough. Yet, excellent results were obtained using their short-time Fourier transform, including the separation of up to six sources from two mixtures.
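The minimal ℓ1-norm step for M = 2 can be illustrated with a small sketch. It is not the authors' implementation: the function name `min_l1_two_mixtures`, the tolerance, and the example mixing matrix are assumptions made here for illustration. Rather than locating the enclosing pair by angle, the sketch enumerates every pair of columns and keeps the exact two-column solution with the smallest ℓ1 norm. This is valid because a linear program attains its optimum at a basic solution, which here has at most two nonzero coefficients.

```python
import itertools

import numpy as np


def min_l1_two_mixtures(A, x, tol=1e-12):
    """Return s minimizing ||s||_1 subject to A @ s = x, for A of shape (2, N).

    Hypothetical helper: brute-force search over all pairs of columns of A.
    """
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    n_sources = A.shape[1]
    best = None
    for i, j in itertools.combinations(range(n_sources), 2):
        B = A[:, [i, j]]
        if abs(np.linalg.det(B)) < tol:
            continue  # skip (nearly) parallel basis vectors
        coeffs = np.linalg.solve(B, x)  # exact representation with this pair
        if best is None or np.abs(coeffs).sum() < np.abs(best).sum():
            best = np.zeros(n_sources)
            best[[i, j]] = coeffs
    if best is None:
        raise ValueError("no pair of linearly independent columns found")
    return best


# Assumed example: three unit-length basis vectors at 0, 60 and 120 degrees.
angles = np.deg2rad([0.0, 60.0, 120.0])
A = np.vstack([np.cos(angles), np.sin(angles)])
x = A @ np.array([2.0, 1.0, 0.0])  # data point lying between the first two
s = min_l1_two_mixtures(A, x)      # approximately [2, 1, 0]
```

In this example the selected pair is the one whose directions enclose the data point, in line with the rule stated in the abstract. The enumeration costs O(N^2) per data point, which is acceptable for the small N considered here (up to six sources).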
The blind source separation problem is to extract the underlying source signals from a set of linear mixtures, where the mixing matrix is unknown. This situation is common in acoustics, radio, medical signal and image processing, hyperspectral imaging, and other areas. We suggest a two-stage separation process: a priori selection of a possibly overcomplete signal dictionary (for instance, a wavelet frame or a learned dictionary) in which the sources are assumed to be sparsely representable, followed by unmixing the sources by exploiting their sparse representability. We consider the general case of more sources than mixtures, but also derive a more efficient algorithm in the case of a non-overcomplete dictionary and an equal number of sources and mixtures. Experiments with artificial signals and musical sounds demonstrate significantly better separation than other known techniques.
Abstract. This article provides an overview of the first stereo audio source separation evaluation campaign, organized by the authors. Fifteen underdetermined stereo source separation algorithms have been applied to various audio data, including instantaneous, convolutive and real mixtures of speech or music sources. The data and the algorithms are presented and the estimated source signals are compared to reference signals using several objective performance criteria.
We present the outcomes of three recent evaluation campaigns in the field of audio and biomedical source separation. These campaigns have witnessed a boom in the range of applications of source separation systems in the last few years, as shown by the increasing number of datasets from 1 to 9 and the increasing number of submissions from 15 to 34. We first discuss their impact on the definition of a reference evaluation methodology, together with shared datasets and software. We then present the key results obtained over almost all datasets. We conclude by proposing directions for future research and evaluation, based in particular on the ideas raised during the related panel discussion at the Ninth International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA 2010).