A Deep-Learning Based Framework for Source Separation, Analysis, and Synthesis of Choral Ensembles

Chandna, Pankaj; Cuesta, Helena; Petermann, Darius

doi:10.3389/frsip.2022.808594

Cited by 8 publications

(7 citation statements)

References 32 publications

(41 reference statements)


“…Homogeneous audio sources are not easily distinguishable in the time-frequency domain and pose a permutation problem [20], [21]. While permutation-invariant training is used for supervised speech separation [21], [22], methods for musical homogeneous source separation exploit side-information such as F0 estimates [23], [11] or a musical score [24], [9], [10] to guide the separation.…”

Section: Related Work (mentioning)

confidence: 99%

“…In this context, a choir is composed of four homogeneous sources: a soprano, an alto, a tenor, and a bass singer. Petermann et al. [23] modified the conditioned U-Net [25] so that the target source can be selected and separated using its F0 information. Results show that this leads to improved objective separation quality compared to using non-informed source-specific models.…”

Section: Related Work (mentioning)

confidence: 99%

“…The method proposed in this paper uses F0 information to separate the (possibly homogeneous) sources like the learning-free methods of [9], [11] and the supervised methods of [23], [24]. It provides better performance than learning-free methods and does not require expensive labeled data like supervised methods.…”

Section: Related Work (mentioning)

confidence: 99%

“…Furthermore, we train the F0-informed supervised deep learning approach for vocal ensemble separation proposed by Petermann et al. [23] on our data. They use a classical U-Net architecture with a control mechanism [25].…”

Section: Baselines (mentioning)

confidence: 99%


Unsupervised Music Source Separation Using Differentiable Parametric Source Models

Schulze-Forster, Richard, Kelley, et al. 2023

IEEE/ACM Trans. Audio Speech Lang. Process.


Supervised deep learning approaches to underdetermined audio source separation achieve state-of-the-art performance but require a dataset of mixtures along with their corresponding isolated source signals. Such datasets can be extremely costly to obtain for musical mixtures. This raises a need for unsupervised methods. We propose a novel unsupervised model-based deep learning approach to musical source separation. Each source is modelled with a differentiable parametric source-filter model. A neural network is trained to reconstruct the observed mixture as a sum of the sources by estimating the source models' parameters given their fundamental frequencies. At test time, soft masks are obtained from the synthesized source signals. The experimental evaluation on a vocal ensemble separation task shows that the proposed method outperforms learning-free methods based on nonnegative matrix factorization and a supervised deep learning baseline. Integrating domain knowledge in the form of source models into a data-driven method leads to high data efficiency: the proposed approach achieves good separation quality even when trained on less than three minutes of audio. This work makes powerful deep-learning-based separation usable in scenarios where training data with ground truth is expensive or nonexistent.
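The soft-masking step mentioned in the abstract above can be illustrated with a minimal NumPy sketch. It assumes the synthesized sources are available as nonnegative magnitude spectrograms and uses a plain ratio mask. The function names `soft_masks` and `apply_masks`, the `eps` constant, and the exact mask formula are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def soft_masks(source_mags, eps=1e-8):
    """Ratio masks from synthesized source magnitudes (assumed formula).

    source_mags: array of shape (n_sources, n_freq, n_frames), nonnegative.
    Returns masks of the same shape that sum to about 1 in every bin.
    """
    total = source_mags.sum(axis=0, keepdims=True)
    return source_mags / (total + eps)

def apply_masks(mixture_stft, masks):
    """Multiply each mask with the complex mixture STFT, one estimate per source."""
    return masks * mixture_stft[np.newaxis, ...]

# Toy data standing in for four synthesized voices and a mixture STFT.
rng = np.random.default_rng(0)
mags = rng.random((4, 5, 6)) + 0.1
mixture = rng.standard_normal((5, 6)) + 1j * rng.standard_normal((5, 6))

masks = soft_masks(mags)
estimates = apply_masks(mixture, masks)
```

Because the masks sum to roughly one in each time-frequency bin, the source estimates add back up to the mixture, which is the usual motivation for masking the mixture rather than using the synthesized signals directly.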


“…Our models consistently outperform Petermann et al.'s [21] U-Net model in the SI-SDR metric. The best results were achieved […] [Caption: Evaluation of proposed approaches on CSD (test dataset) using source separation and pitch accuracy metrics, for the BC1Song (a) and BCBSQ (b) training datasets.]…”

mentioning

confidence: 57%
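For readers unfamiliar with the SI-SDR metric named in the statement above, here is a minimal NumPy sketch of the commonly used definition (scale-invariant signal-to-distortion ratio, in dB). The function name `si_sdr`, the mean removal, and the `eps` constant are assumptions made for illustration. The citing paper's exact evaluation code is not shown in this listing.

```python
import numpy as np

def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB for two 1-D signals of equal length."""
    # Remove the mean (a common convention, assumed here).
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to get the scaled target.
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    noise = estimate - target
    return 10.0 * np.log10((np.dot(target, target) + eps) / (np.dot(noise, noise) + eps))

# Toy example: a reference signal and a slightly noisy estimate of it.
rng = np.random.default_rng(0)
ref = rng.standard_normal(1000)
est = ref + 0.1 * rng.standard_normal(1000)
score = si_sdr(est, ref)
```

Rescaling the estimate leaves the score essentially unchanged, which is the property that distinguishes SI-SDR from plain SDR.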

A Fully Differentiable Model for Unsupervised Singing Voice Separation

Richard, Chouteau, Torres 2024

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)


A novel model was recently proposed for unsupervised music source separation. This model addresses some of the major shortcomings of existing source separation frameworks. Specifically, it eliminates the need for isolated sources during training, performs efficiently with limited data, and can handle homogeneous sources (such as singing voice). However, this model relies on an external multipitch estimator and incorporates an ad hoc voice assignment procedure. In this paper, we propose to extend this framework and build a fully differentiable model by integrating a multipitch estimator and a novel differentiable assignment module within the core model. We show the merits of our approach through a set of experiments, and we highlight in particular its potential for processing diverse and unseen data.


DLVS4Audio2Sheet: Deep Learning-Based Vocal Separation for Audio into Music Sheet Conversion

Teo, Wang, Ghe, et al. 2024

Lecture Notes in Computer Science
