Speech spectral modeling and enhancement based on autoregressive conditional heteroscedasticity models

Cohen, Israel

doi:10.1016/j.sigpro.2005.06.005

Cited by 52 publications

(37 citation statements)

References 25 publications

“…The independency assumption in the complex domain is also inconsistent with the data. We think further improvements in speech enhancement performance are still possible by considering more sophisticated pdf models and better spectral variance estimators [15], [31], [32] simultaneously. For certain types of distortions, other methods may be more appropriate.…”

Section: Discussion (mentioning)

confidence: 99%

“…We see that for small arguments, is approximated well by only a few terms. Substituting (15) into (14) (15) converges and because changing the order of integration and summation as is used in the derivation of (16) is allowed for according to Fubini's theorem [22].…”

Section: Change of Variable (mentioning)

confidence: 99%

“…Further, the Wiener estimator (x) provides the weakest SNR-S versus SNR-N tradeoff; as discussed in Section II, this suggests that the speech distribution conditional on the estimated a priori SNR is not well described by a Gaussian model. The Gaussian model and thus the Wiener estimator may perform better for a different a priori SNR estimator [15]. Also, rather simple modifications of the Wiener estimator have been proposed which significantly boost its performance (see, e.g., [29] and [30,Ch.…”

Section: Complex DFT Estimators (mentioning)

confidence: 99%

“…Hence, it is important to notice that the appropriate distributional assumption is related to the speech variance estimator used. For example, Cohen [15] suggests that for a different a priori SNR estimator based on GARCH models, the Gaussian speech model is superior. A slight preference for complex Gaussian distributions has also been found for the DFT-coefficients from short analysis frames of individual speech sound classes (vowels, plosives, fricatives, etc.)…”

mentioning

confidence: 99%

Minimum Mean-Square Error Estimation of Discrete Fourier Coefficients With Generalized Gamma Priors

Erkelens, Hendriks, Heusdens, et al. (2007), IEEE Trans. Audio Speech Lang. Process.

“…The a priori SNR, which is the ratio of the speech and noise power, is widely used in speech enhancement algorithms and is typically estimated using the decision-directed approach of Ephraim and Malah [6]. Alternative techniques are based on GARCH models [4] and cepstro-temporal smoothing [3].…”

Section: Prior Work (mentioning)

confidence: 99%

A CASA-Based System for Long-Term SNR Estimation

Narayanan, Wang (2012), IEEE Trans. Audio Speech Lang. Process.
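The citation statement above names the decision-directed approach of Ephraim and Malah as the typical a priori SNR estimator. As a rough illustration only, the sketch below applies the commonly stated decision-directed recursion to a single frequency bin. Several things here are assumptions and do not come from the quoted papers: the function name `decision_directed_snr`, the smoothing constant `alpha = 0.98`, the floor `xi_min`, a known and constant noise power, and the use of a Wiener gain to form the previous clean-speech estimate (the original method uses a different gain function).

```python
def decision_directed_snr(noisy_power, noise_power, alpha=0.98, xi_min=1e-3):
    """Sketch of a decision-directed a priori SNR estimate for one frequency bin.

    noisy_power: per-frame noisy periodogram values |Y(l)|^2
    noise_power: noise variance, assumed known and constant (an assumption)
    alpha, xi_min: illustrative smoothing constant and lower floor (assumptions)
    """
    xi_estimates = []
    prev_clean_power = 0.0  # estimate of |A(l-1)|^2; zero before the first frame
    for y2 in noisy_power:
        gamma = y2 / noise_power  # a posteriori SNR of the current frame
        # Weighted mix of the previous frame's estimate and the current ML estimate.
        xi = (alpha * prev_clean_power / noise_power
              + (1.0 - alpha) * max(gamma - 1.0, 0.0))
        xi = max(xi, xi_min)  # floor the estimate
        gain = xi / (1.0 + xi)  # Wiener gain, used here as a simplification
        prev_clean_power = (gain ** 2) * y2
        xi_estimates.append(xi)
    return xi_estimates
```

The recursion trades tracking speed for smoothness: a value of `alpha` close to 1 leans on the previous frame's estimate, while a smaller value follows the instantaneous a posteriori SNR more closely.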

Abstract: We present a system for robust signal-to-noise ratio (SNR) estimation based on computational auditory scene analysis (CASA). The proposed algorithm uses an estimate of the ideal binary mask to segregate a time-frequency representation of the noisy signal into speech dominated and noise dominated regions. Energy within each of these regions is summated to derive the filtered global SNR. An SNR transform is introduced to convert the estimated filtered SNR to the true broadband SNR of the noisy signal. The algorithm is further extended to estimate subband SNRs. Evaluations are done using the TIMIT speech corpus and the NOISEX92 noise database. Results indicate that both global and subband SNR estimates are superior to those of existing methods, especially at low SNR conditions. Index Terms: Computational auditory scene analysis (CASA), broadband SNR, ideal binary mask (IBM), signal-to-noise ratio (SNR), subband SNR.
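The masking step described in the abstract can be sketched as follows, assuming the time-frequency energies and a binary mask are already available as nested lists. The function name `filtered_snr_db` and the data layout are hypothetical. The sketch covers only the energy summation that yields the filtered SNR. It does not cover the mask estimation or the SNR transform to broadband SNR, which the abstract mentions but does not specify.

```python
import math


def filtered_snr_db(energy, mask):
    """Ratio (in dB) of energy in speech-dominated vs. noise-dominated T-F units.

    energy: 2-D list of non-negative energies, indexed [channel][frame]
    mask:   2-D list of the same shape; truthy marks a speech-dominated unit
    """
    speech_energy = 0.0
    noise_energy = 0.0
    for energy_row, mask_row in zip(energy, mask):
        for unit_energy, is_speech in zip(energy_row, mask_row):
            if is_speech:
                speech_energy += unit_energy
            else:
                noise_energy += unit_energy
    if speech_energy <= 0.0 or noise_energy <= 0.0:
        # The ratio is undefined if either region carries no energy.
        raise ValueError("both regions must contain energy")
    return 10.0 * math.log10(speech_energy / noise_energy)
```

For example, with `energy = [[10.0, 1.0], [90.0, 9.0]]` and `mask = [[1, 0], [1, 0]]`, the speech-dominated units sum to 100 and the noise-dominated units to 10, giving a filtered SNR of 10 dB.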
