In this paper, unsupervised learning is used to separate percussive and harmonic sounds from monaural non-vocal polyphonic signals. Our algorithm is based on a modified non-negative matrix factorization (NMF) procedure in which no labeled data are required to distinguish between percussive and harmonic bases, because prior information about both classes of sounds is integrated into the decomposition process. NMF is performed under the assumption that harmonic sounds exhibit spectral sparseness (narrowband sounds) and temporal smoothness (steady sounds), whereas percussive sounds exhibit spectral smoothness (broadband sounds) and temporal sparseness (transient sounds). The evaluation is performed using several real-world excerpts from different musical genres. A comparison of the developed approach with three current state-of-the-art separation systems yields promising results.
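A minimal sketch of this idea follows, assuming a magnitude spectrogram and standard Euclidean multiplicative updates. The function name hp_nmf, the penalty weights alpha and beta, and the simplified regularizers (temporal smoothness on harmonic activations, L1 temporal sparseness on percussive activations; the paper additionally constrains the spectral bases) are illustrative assumptions, not the paper's exact update rules.

```python
import numpy as np

def hp_nmf(V, n_harm=20, n_perc=20, n_iter=200, alpha=0.1, beta=0.1, eps=1e-12):
    """Toy NMF-based harmonic/percussive separation (illustrative sketch).

    V: magnitude spectrogram, shape (freq, time). The regularizers only
    loosely mirror the paper's constraints: temporal smoothness for
    harmonic activations, temporal sparseness for percussive ones.
    """
    F, T = V.shape
    K = n_harm + n_perc
    rng = np.random.default_rng(0)
    W = rng.random((F, K)) + eps            # spectral bases
    H = rng.random((K, T)) + eps            # temporal activations
    harm, perc = slice(0, n_harm), slice(n_harm, K)
    for _ in range(n_iter):
        # Standard Euclidean multiplicative update for W (Lee & Seung)
        WH = W @ H + eps
        W *= (V @ H.T) / (WH @ H.T + eps)
        WH = W @ H + eps
        num = W.T @ V
        den = W.T @ WH
        # Temporal smoothness penalty sum_t ||h_t - h_{t-1}||^2 on the
        # harmonic rows: gradient split into negative (numerator) and
        # positive (denominator) parts, constants absorbed into alpha.
        Hpad = np.pad(H[harm], ((0, 0), (1, 1)), mode="edge")
        num[harm] += alpha * (Hpad[:, :-2] + Hpad[:, 2:])
        den[harm] += 2 * alpha * H[harm]
        # L1 temporal sparseness penalty on the percussive rows.
        den[perc] += beta
        H *= num / (den + eps)
    V_h = W[:, harm] @ H[harm]              # harmonic spectrogram estimate
    V_p = W[:, perc] @ H[perc]              # percussive spectrogram estimate
    return V_h, V_p
```

Separated waveforms would then typically be recovered by Wiener-style masking of the mixture spectrogram with V_h and V_p followed by inverse STFT.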
This paper presents a real-time audio-to-score alignment system for musical applications. The aim of such systems is to synchronize a live musical performance with its symbolic representation in a music sheet. We take our previous real-time alignment system as a base and enhance it with a traceback stage, a stage used in offline alignment to improve the accuracy of the aligned notes. This stage introduces some delay, which forces a trade-off between output delay and alignment accuracy that must be considered in the design of this type of hybrid technique. We have also improved our former system to execute faster in order to minimize this delay. Other improvements, such as the identification of silence frames, have also been incorporated into the proposed system.
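The delay/accuracy trade-off can be illustrated with a minimal sketch: an online dynamic-programming alignment that keeps backpointers and only reports the score position for a frame after `delay` further frames have been observed, refining it by traceback. The cost function cost_fn, the step set, and the delay parameter are assumptions for illustration, not the authors' actual features or system.

```python
import numpy as np

def online_align_with_traceback(cost_fn, n_score, n_perf, delay=8):
    """Illustrative online alignment with a bounded traceback stage.

    cost_fn(i, j): local mismatch between performance frame i and score
    position j, assumed available frame by frame. `delay` is how long the
    output is held back so the path can be refined by traceback.
    """
    INF = np.inf
    D = np.full((n_perf + 1, n_score + 1), INF)
    D[0, 0] = 0.0
    back = np.zeros((n_perf + 1, n_score + 1), dtype=np.int8)
    steps = [(1, 0), (0, 1), (1, 1)]         # skip-in-score, skip-in-perf, match
    reported = []
    for i in range(1, n_perf + 1):           # one iteration per incoming frame
        for j in range(1, n_score + 1):
            cands = [D[i - di, j - dj] for di, dj in steps]
            k = int(np.argmin(cands))
            D[i, j] = cands[k] + cost_fn(i - 1, j - 1)
            back[i, j] = k
        # Greedy online estimate for the current frame.
        j_now = int(np.argmin(D[i, 1:])) + 1
        # Traceback stage: refine the estimate for frame i - delay by
        # following backpointers from the current best cell.
        if i > delay:
            ii, jj = i, j_now
            while ii > i - delay:
                di, dj = steps[back[ii, jj]]
                ii, jj = ii - di, jj - dj
            reported.append((ii, jj))        # (performance frame, score position)
    return reported
```

Larger `delay` lets the traceback correct more of the greedy path (higher accuracy) at the cost of later output, which is exactly the trade-off the abstract describes.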
In this paper we present a harmonic-constrained Multichannel Non-negative Matrix Factorization (MNMF) method for the task of blind music source separation. In this model, the mixing filter encodes the spatial information in terms of magnitude and phase differences between channels, whereas the source variances are modeled using a harmonic-constrained NMF structure. The spatial covariance matrix is obtained from the constant-Q transform (CQT) to account for the logarithmic frequency scale inherent in music signals and to reduce the dimensionality of the parameters. Moreover, to mitigate the strong sensitivity to parameter initialization, we propose to initialize the spatial weights with the output of the steered response power (SRP) with phase transform (PHAT) algorithm. The proposed method has been evaluated for the task of music source separation using a multichannel classical chamber music dataset with several polyphony and reverberation setups. Furthermore, comparisons with other state-of-the-art signal decomposition methods have been carried out, showing reliable results in terms of BSS_EVAL metrics.
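For reference, below is a self-contained sketch of GCC-PHAT, the pairwise building block of SRP-PHAT: the cross-power spectrum is whitened so that only phase information remains, and the resulting generalized cross-correlation peaks at the inter-channel time delay. In full SRP-PHAT these correlations are summed over all microphone pairs across a grid of candidate source locations; the framing and delay grid here are assumptions.

```python
import numpy as np

def gcc_phat_tdoa(x1, x2, fs, max_tau, n_fft=4096):
    """GCC-PHAT between two channels, scanned over candidate delays.

    Returns the estimated time difference of arrival (seconds) and the
    phase-transform-weighted cross-correlation over [-max_tau, max_tau].
    """
    X1 = np.fft.rfft(x1, n=n_fft)
    X2 = np.fft.rfft(x2, n=n_fft)
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12           # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n_fft)        # generalized cross-correlation
    max_shift = int(fs * max_tau)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    tau = (np.argmax(np.abs(cc)) - max_shift) / fs
    return tau, cc
```

Per the abstract, an SRP-PHAT power map built from such pairwise correlations is used only to initialize the spatial weights; the MNMF updates then refine both the spatial and spectral parameters.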
In this study, the authors present a novel voicing detection algorithm that employs the well-known aperiodicity measure to detect voiced speech in signals contaminated with non-stationary noise. The method computes a signal-adaptive decision threshold that takes into account the current noise level, enabling voicing detection by direct comparison with the extracted aperiodicity. This adaptive threshold is updated at each frame from a simple estimate of the current noise power, and is thus adapted to fluctuating noise conditions. Once the aperiodicity has been computed, the method requires only a small number of operations, which enables its implementation on resource-constrained devices (such as hearing aids) when an efficient approximation of the difference function is used to extract the aperiodicity. Evaluation over a database of speech sentences degraded by several types of noise reveals that the proposed voicing classifier is robust against different noise types and signal-to-noise ratios. In addition, to evaluate the applicability of the method to speech enhancement, a simple F0-based speech enhancement algorithm integrating the proposed classifier is implemented. The system is shown to achieve competitive results, in terms of objective measures, when compared with other well-known speech enhancement approaches.
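A minimal sketch of the overall scheme follows, assuming a YIN-style aperiodicity (the minimum of the cumulative-mean-normalized difference function) and a crude recursive noise-power tracker. The threshold rule thr = base + k * (noise power / frame power) and the tracker are illustrative assumptions standing in for the paper's exact estimator.

```python
import numpy as np

def aperiodicity(frame, tau_min=32, tau_max=400):
    """YIN-style aperiodicity: minimum of the cumulative-mean-normalized
    difference function. Near 0 => strongly periodic (voiced); requires
    len(frame) > tau_max."""
    d = np.array([np.sum((frame[:-t] - frame[t:]) ** 2)
                  for t in range(1, tau_max)])
    dprime = d * np.arange(1, tau_max) / (np.cumsum(d) + 1e-12)
    return dprime[tau_min:].min()

def voicing_decisions(frames, base_thr=0.35, k=0.5, thr_max=0.85, alpha=0.95):
    """Per-frame voiced/unvoiced decisions with an adaptive threshold.

    Higher noise inflates the aperiodicity of voiced frames, so the
    threshold is raised with the estimated noise-to-signal power ratio.
    """
    noise_pow = None
    decisions = []
    for frame in frames:
        p = np.mean(frame ** 2)
        # Crude minimum-statistics noise tracker: drops fast in quiet
        # frames, rises slowly otherwise (an assumption of this sketch).
        noise_pow = p if noise_pow is None else min(alpha * noise_pow + (1 - alpha) * p, p)
        nsr = noise_pow / (p + 1e-12)        # noise-to-signal power ratio
        thr = np.clip(base_thr + k * nsr, base_thr, thr_max)
        decisions.append(aperiodicity(frame) < thr)
    return decisions
```

The per-frame cost after the aperiodicity extraction is a handful of scalar operations, which is consistent with the low-complexity claim for hearing-aid deployment.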