The aim of this paper is to provide an overview of Sparse Linear Prediction, a set of speech processing tools created by introducing sparsity constraints into the linear prediction framework. These tools have been shown to be effective for several problems in the modeling and coding of speech signals. For speech analysis, we provide predictors that model the speech production process accurately and overcome problems related to traditional linear prediction. In particular, the predictors obtained offer a more effective decoupling of the vocal tract transfer function and its underlying excitation, which makes them well suited to the analysis of voiced speech. For speech coding, we provide predictors that shape the residual according to the characteristics of the sparse encoding techniques, resulting in more straightforward coding strategies. Furthermore, encouraged by the promising application of compressed sensing in signal compression, we investigate its formulation and application to sparse linear predictive coding. The proposed estimators are all solutions to convex optimization problems, which can be solved efficiently and reliably using, e.g., interior-point methods. Extensive experimental results are provided to support the effectiveness of the proposed methods, showing the improvements over traditional linear prediction in both speech analysis and coding.
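As a rough illustration of the idea, the sketch below fits a predictor by minimizing the 1-norm of the prediction residual instead of the usual 2-norm. The abstract does not state the exact criterion, so the 1-norm residual is an assumption here, and the solver is iteratively reweighted least squares swapped in for the interior-point methods the abstract mentions. The function names, the synthetic signal, and all parameter values are hypothetical.

```python
import numpy as np

def sparse_residual_predictor(x, order, num_iter=50, eps=1e-8):
    """Approximate argmin_a ||x - X a||_1 by iteratively reweighted least squares."""
    x = np.asarray(x, dtype=float)
    target = x[order:]
    # Column k holds the delayed samples x(n - k - 1) for n = order .. N-1.
    X = np.column_stack(
        [x[order - k - 1 : len(x) - k - 1] for k in range(order)]
    )
    # Start from the ordinary least-squares (2-norm) predictor.
    a = np.linalg.lstsq(X, target, rcond=None)[0]
    for _ in range(num_iter):
        r = target - X @ a
        # Weights 1/|r| turn the weighted 2-norm into a 1-norm surrogate.
        w = 1.0 / np.maximum(np.abs(r), eps)
        Xw = X * w[:, None]
        a = np.linalg.solve(X.T @ Xw, Xw.T @ target)
    return a, target - X @ a

def synth_voiced(a_true, n_samples, period):
    """All-pole filter driven by a unit impulse train (toy voiced-speech model)."""
    e = np.zeros(n_samples)
    e[::period] = 1.0
    x = np.zeros(n_samples)
    for n in range(n_samples):
        past = sum(a_true[k] * x[n - k - 1]
                   for k in range(len(a_true)) if n - k - 1 >= 0)
        x[n] = e[n] + past
    return x, e

# Hypothetical second-order vocal tract model with poles of radius 0.95.
a_true = np.array([1.1168, -0.9025])
x, e = synth_voiced(a_true, n_samples=200, period=25)
a_l1, residual_l1 = sparse_residual_predictor(x, order=2)
a_l2 = np.linalg.lstsq(
    np.column_stack([x[1:-1], x[:-2]]), x[2:], rcond=None
)[0]
print("1-norm predictor:", a_l1)
print("2-norm predictor:", a_l2)
```

On this toy signal the excitation is an impulse train, so a predictor that tolerates a few large residual samples can leave the pitch pulses in the residual rather than absorbing them into the filter estimate, which is the decoupling the abstract describes.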
In this paper, we present a novel method for joint estimation of the fundamental frequency and order of a set of harmonically related sinusoids based on the MUSIC estimation criterion. The presented method, termed HMUSIC, is shown to have an efficient implementation using FFTs. Furthermore, refined estimates can be obtained using a gradient-based method. Illustrative examples of the application of the algorithm to real-life speech and audio signals are given, and the statistical performance of the estimator is evaluated using synthetic signals, demonstrating its good statistical properties.
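A minimal sketch of a MUSIC-style pitch search is given below, under simplifying assumptions: the number of harmonics is fixed and known (the paper estimates it jointly), the search is a plain grid over candidate fundamentals rather than the FFT-based implementation, and there is no gradient refinement. The function name, the snapshot length, and the test signal are all hypothetical.

```python
import numpy as np

def harmonic_music_pitch(x, fs, num_harmonics, f0_grid_hz, snapshot_len=None):
    """Grid search for the f0 whose harmonics are most orthogonal to the noise subspace."""
    x = np.asarray(x, dtype=float)
    M = snapshot_len or len(x) // 2
    # Sample covariance matrix from overlapping length-M snapshots.
    snaps = np.array([x[i : i + M] for i in range(len(x) - M + 1)]).T
    R = snaps @ snaps.T / snaps.shape[1]
    # eigh returns eigenvalues in ascending order, so the noise subspace
    # is spanned by the leading columns of the eigenvector matrix.
    _, vecs = np.linalg.eigh(R)
    signal_dim = 2 * num_harmonics  # a real sinusoid spans two dimensions
    noise_basis = vecs[:, : M - signal_dim]
    m = np.arange(M)
    harmonic_idx = np.arange(1, num_harmonics + 1)
    cost = np.empty(len(f0_grid_hz))
    for i, f0 in enumerate(f0_grid_hz):
        w = 2.0 * np.pi * f0 / fs * harmonic_idx
        # Complex exponentials at plus and minus each harmonic frequency.
        Z = np.exp(1j * np.outer(m, np.concatenate([w, -w])))
        cost[i] = np.linalg.norm(noise_basis.T @ Z, "fro") ** 2
    return f0_grid_hz[np.argmin(cost)], cost

# Hypothetical test signal: three harmonics of 200 Hz in light white noise.
rng = np.random.default_rng(0)
fs = 8000.0
n = np.arange(400)
x = sum(np.cos(2 * np.pi * 200.0 * l * n / fs + 0.3 * l) / l for l in (1, 2, 3))
x = x + 0.01 * rng.standard_normal(len(n))
grid = np.arange(100.0, 401.0, 1.0)
f0_hat, cost = harmonic_music_pitch(x, fs, num_harmonics=3, f0_grid_hz=grid)
print("estimated fundamental (Hz):", f0_hat)
```

The cost is small only when every harmonic of the candidate lies in the signal subspace, which is what keeps candidates that match a single partial from producing a comparably low cost.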
We investigate the conditions under which nonnegative matrix factorization (NMF) is unique and introduce several theorems that determine whether a given decomposition is in fact unique. The theorems are illustrated by several examples showing their use and their limitations. We show that corrupting a matrix that has a unique NMF with additive noise yields a noisy estimate of the noise-free unique solution. Finally, we use a stochastic view of NMF to analyze which characterizations of the underlying model result in an NMF with small estimation errors.
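To make the uniqueness question concrete, the sketch below shows a standard case of non-uniqueness that goes beyond permuting and rescaling the factors: when both factors are strictly positive, a small invertible mixing matrix can be absorbed into them without breaking nonnegativity. This is a generic illustration, not one of the paper's theorems, and the matrices and the value of `eps` are hypothetical.

```python
import numpy as np

# Hypothetical strictly positive factors, chosen only for illustration.
W = np.array([[1.0, 2.0],
              [3.0, 1.0],
              [2.0, 2.0]])
H = np.array([[1.0, 2.0, 3.0],
              [2.0, 1.0, 1.0]])

# A mixing matrix close to the identity; it is neither a permutation
# nor a diagonal scaling.
eps = 0.1
Q = np.array([[1.0, eps],
              [0.0, 1.0]])

# Absorb Q into the factors: (W Q)(Q^{-1} H) = W H.
W_alt = W @ Q
H_alt = np.linalg.inv(Q) @ H

same_product = bool(np.allclose(W @ H, W_alt @ H_alt))
still_nonnegative = bool((W_alt >= 0).all() and (H_alt >= 0).all())
print("same product:", same_product)
print("alternative factors nonnegative:", still_nonnegative)
```

Because the second column of `W_alt` is a genuine mixture of both columns of `W`, the two factorizations describe the same data with different parts. Zeros in the factors are what prevent such a mixing step, since any mixing would push a zero entry negative.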
In this paper, we consider the problem of separating and enhancing periodic signals from single-channel noisy mixtures. More specifically, the problem of designing filters for such tasks is treated. We propose a number of novel filter designs that 1) are specifically aimed at periodic signals, 2) are optimal given the observed signal and thus signal adaptive, 3) offer full parametrizations of periodic signals, and 4) reduce to well-known designs in special cases. The resulting filters can be used for a multitude of applications, including processing of speech and audio signals. Some illustrative signal examples demonstrating their superior properties compared to other related filters are given, and the properties of the various designs are analyzed using synthetic signals in Monte Carlo simulations.
In this paper, we consider the problem of joint direction-of-arrival (DOA) and fundamental frequency estimation. Joint estimation enables robust estimation of these parameters in multi-source scenarios where separate estimators may fail. First, we derive the exact and asymptotic Cramér-Rao bounds for the joint estimation problem. Then, we propose a nonlinear least squares (NLS) and an approximate NLS (aNLS) estimator for joint DOA and fundamental frequency estimation. The proposed estimators are maximum likelihood estimators when: 1) the noise is white Gaussian, 2) the environment is anechoic, and 3) the source of interest is in the far-field. Otherwise, the methods still approximately yield maximum likelihood estimates. Simulations on synthetic data show that the proposed methods have similar or better performance than state-of-the-art methods for DOA and fundamental frequency estimation. Moreover, simulations on real-life data indicate that the NLS and aNLS methods are applicable even when reverberation is present and the noise is not white Gaussian.