Abraham Alcaim scite author profile

This paper presents several strategies to improve the performance of very low bit rate speech coders and describes a speech codec that incorporates these strategies and operates at an average bit rate of 1.2 kb/s. The encoding algorithm is based on several improvements in a mixed multiband excitation (MMBE) linear predictive coding (LPC) structure. A switched-predictive vector quantiser technique that outperforms previously reported schemes is adopted to encode the LSF parameters. Spectral and sound specific low rate models are used in order to achieve high quality speech at low rates. An MMBE approach with three sub-bands is employed to encode voiced frames, while fricatives and stops modelling and synthesis techniques are used for unvoiced frames. This strategy is shown to provide good quality synthesised speech at a bit rate of only 0.4 kb/s for unvoiced frames. To reduce coding noise and improve decoded speech, a spectral envelope restoration combined with noise reduction (SERNR) postfilter is used. The contributions of the techniques described in this paper are separately assessed and then combined in the design of a low bit rate codec that is evaluated against the North American Mixed Excitation Linear Prediction (MELP) coder. The performance assessment is carried out in terms of the spectral distortion of LSF quantisation, mean opinion score (MOS), A/B comparison tests and the ITU-T P.862 perceptual evaluation of speech quality (PESQ) standard. Assessment results show that the improved methods for LSF quantisation, sound specific modelling and synthesis and the new postfiltering approach can significantly outperform previously reported techniques. Further results also indicate that a system combining the proposed improvements and operating at 1.2 kb/s is comparable to, and slightly outperforms, a MELP coder operating at 2.4 kb/s. For tandem connection situations, the proposed system is clearly superior to the MELP coder. R.C. de Lamare and A.
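The abstract names spectral distortion as one of its measures of LSF quantisation quality but does not give a formula. As a rough illustration only, the sketch below uses the common definition of spectral distortion as the RMS difference, in dB, between two LPC log-spectral envelopes. The function name, the FFT size and the choice to average over the full band are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def spectral_distortion_db(a_ref, a_quant, n_fft=512):
    """RMS log-spectral difference in dB between two LPC envelopes.

    a_ref, a_quant: LPC polynomial coefficients [1, a1, ..., ap] of the
    unquantised and quantised filters (hypothetical inputs). The envelope
    is taken as 1/|A(e^jw)|^2, i.e. -20*log10|A| in dB.
    """
    A_ref = np.fft.rfft(a_ref, n_fft)    # zero-padded frequency response of A(z)
    A_q = np.fft.rfft(a_quant, n_fft)
    log_ref = -20.0 * np.log10(np.abs(A_ref))
    log_q = -20.0 * np.log10(np.abs(A_q))
    # Assumption: average over the full band; some evaluations use a sub-band.
    return float(np.sqrt(np.mean((log_ref - log_q) ** 2)))

# Toy first-order filters standing in for a real LPC analysis.
a_ref = np.array([1.0, -0.90])
a_quant = np.array([1.0, -0.85])
sd = spectral_distortion_db(a_ref, a_quant)
```

In practice the two coefficient sets would come from converting the original and quantised LSF vectors back to LPC form, and the result would be averaged over many frames.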


Analysis of LSF switched-predictive vector quantisers

Lamare

Alcaim


Switched-predictive vector quantisation (SPVQ) has proven to be more efficient than conventional predictive vector quantisation (PVQ) to reduce the bit rate of LSF parameters in speech coding applications. In this paper, we perform a comparison between different SPVQ schemes and we propose an extension of the SPVQ technique which outperforms previously reported PVQ and SPVQ systems. The proposed method consists of 4 vector quantisers (VQ) which are switched on the basis of the best performance criterion. 0-7803-6703-0/01/$10.00 © 2001 IEEE
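The switching idea in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' scheme: the abstract only says that 4 vector quantisers are switched on a best-performance criterion. The first-order prediction from the previous quantised vector, the predictor coefficients, the random codebooks, the squared-error criterion and the name `spvq_encode_frame` are all hypothetical choices made for this example.

```python
import numpy as np

def spvq_encode_frame(x, prev_q, mean, predictors, codebooks):
    """Encode one LSF vector by trying every predictor/codebook branch.

    Returns (switch, index, reconstruction, squared_error) for the branch
    with the smallest squared reconstruction error (assumed criterion).
    """
    best = None
    for s, (a, cb) in enumerate(zip(predictors, codebooks)):
        pred = mean + a * (prev_q - mean)   # assumed first-order prediction
        resid = x - pred                    # prediction residual to quantise
        idx = int(np.argmin(np.sum((cb - resid) ** 2, axis=1)))
        recon = pred + cb[idx]
        err = float(np.sum((x - recon) ** 2))
        if best is None or err < best[3]:
            best = (s, idx, recon, err)
    return best

# Hypothetical setup: 10 LSFs, 4 branches, 64-entry random codebooks.
rng = np.random.default_rng(0)
dim = 10
mean = np.linspace(0.1, 3.0, dim)
predictors = [np.full(dim, a) for a in (0.0, 0.3, 0.6, 0.9)]
codebooks = [rng.normal(scale=0.05, size=(64, dim)) for _ in predictors]
prev_q = mean + rng.normal(scale=0.05, size=dim)
x = prev_q + rng.normal(scale=0.02, size=dim)

switch, index, recon, err = spvq_encode_frame(x, prev_q, mean, predictors, codebooks)
```

The decoder would need both the switch and the codebook index, so with 4 branches the switch costs 2 bits per frame in this sketch. Real codebooks would be trained on speech data rather than drawn at random.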


Adaptive weighting of subband-classifier responses for robust text-independent speaker recognition

Vale

Alcaim

2008

Electron. Lett.


Sound specific modelling and synthesis with a new postfiltering in low bit rate speech coding

Lamare

Silva

Alcaim


In this paper, we investigate the use of fricatives and stops modelling and synthesis techniques, together with a spectral envelope reconstruction combined with noise reduction (SERNR) postfilter, in mixed voiced-unvoiced multiband excitation coders. We compare a noise excitation approach operating at 1.75 kb/s with a fricatives and stops excitation technique operating at 0.4 kb/s. A novel SERNR postfiltering technique that significantly enhances the decoded speech is proposed and compared with the well-known adaptive spectral enhancement (ASE) filter.



Abraham Alcaim

Text-independent speaker recognition based on the Hurst parameter and the multidimensional fractional Brownian motion model

Strategies to improve the performance of very low bit rate speech coders and application to a variable rate 1.2 kb/s codec

Analysis of LSF switched-predictive vector quantisers

Adaptive weighting of subband-classifier responses for robust text-independent speaker recognition

Sound specific modelling and synthesis with a new postfiltering in low bit rate speech coding
