This paper presents several strategies to improve the performance of very low bit rate speech coders and describes a speech codec that incorporates these strategies and operates at an average bit rate of 1.2 kb/s. The encoding algorithm is based on several improvements to a mixed multiband excitation (MMBE) linear predictive coding (LPC) structure. A switched-predictive vector quantiser technique that outperforms previously reported schemes is adopted to encode the LSF parameters. Spectral and sound-specific low rate models are used to achieve high quality speech at low rates. An MMBE approach with three sub-bands is employed to encode voiced frames, while fricative and stop modelling and synthesis techniques are used for unvoiced frames. This strategy is shown to provide good quality synthesised speech at a bit rate of only 0.4 kb/s for unvoiced frames. To reduce coding noise and improve the decoded speech, a spectral envelope restoration combined with noise reduction (SERNR) postfilter is used. The contributions of the techniques described in this paper are assessed separately and then combined in the design of a low bit rate codec that is evaluated against the North American Mixed Excitation Linear Prediction (MELP) coder. The performance assessment is carried out in terms of the spectral distortion of LSF quantisation, mean opinion score (MOS), A/B comparison tests and the ITU-T P.862 perceptual evaluation of speech quality (PESQ) standard. Assessment results show that the improved methods for LSF quantisation, sound-specific modelling and synthesis, and the new postfiltering approach can significantly outperform previously reported techniques. Further results indicate that a system combining the proposed improvements and operating at 1.2 kb/s is comparable to, and slightly outperforms, a MELP coder operating at 2.4 kb/s. For tandem connection situations, the proposed system is clearly superior to the MELP coder. R.C. de Lamare and A.
Switched-predictive vector quantisation (SPVQ) has proven to be more efficient than conventional predictive vector quantisation (PVQ) in reducing the bit rate of LSF parameters in speech coding applications. In this paper, we compare different SPVQ schemes and propose an extension of the SPVQ technique which outperforms previously reported PVQ and SPVQ systems. The proposed method consists of 4 vector quantisers (VQ) which are switched on the basis of the best performance criterion. 0-7803-6703-0/01/$10.00 ©2001 IEEE
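The switching idea described above can be illustrated with a minimal sketch. The abstract does not say how the four quantisers differ, so everything specific here is an assumption made for illustration: each branch is taken to be a first-order predictor with its own scalar coefficient and its own residual codebook, the "best performance criterion" is taken to be unweighted squared error, and the names `spvq_encode`, `spvq_decode` and `nearest_codeword`, along with all numeric values, are hypothetical. It is not the authors' design.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
# Hypothetical branch layout: (prediction coefficient, residual codebook).
Branch = Tuple[float, Sequence[Vector]]


def nearest_codeword(codebook: Sequence[Vector], target: Vector) -> Tuple[int, float]:
    """Return (index, squared error) of the codeword closest to target."""
    best_index, best_error = 0, float("inf")
    for index, codeword in enumerate(codebook):
        error = sum((c - t) ** 2 for c, t in zip(codeword, target))
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error


def predict(previous: Vector, mean: Vector, coeff: float) -> Vector:
    """First-order prediction of the current LSF vector (an assumed predictor form)."""
    return [m + coeff * (p - m) for m, p in zip(mean, previous)]


def spvq_encode(lsf: Vector, previous: Vector, mean: Vector,
                branches: Sequence[Branch]) -> Tuple[int, int]:
    """Try every branch and keep the one with the smallest quantisation error."""
    best_switch, best_index, best_error = 0, 0, float("inf")
    for switch, (coeff, codebook) in enumerate(branches):
        predicted = predict(previous, mean, coeff)
        residual = [x - q for x, q in zip(lsf, predicted)]
        index, error = nearest_codeword(codebook, residual)
        if error < best_error:
            best_switch, best_index, best_error = switch, index, error
    return best_switch, best_index


def spvq_decode(switch: int, index: int, previous: Vector, mean: Vector,
                branches: Sequence[Branch]) -> Vector:
    """Rebuild the LSF vector from the transmitted switch and codebook index."""
    coeff, codebook = branches[switch]
    predicted = predict(previous, mean, coeff)
    return [q + c for q, c in zip(predicted, codebook[index])]


if __name__ == "__main__":
    # Toy two-dimensional data; real LSF vectors are typically of order 10.
    mean = [0.30, 0.60]
    previous = [0.35, 0.65]  # previously decoded LSF vector
    branches = [
        (0.0, [[0.00, 0.00], [0.05, 0.05]]),  # coefficient 0: memoryless VQ
        (0.3, [[0.00, 0.00], [0.04, 0.04]]),
        (0.6, [[0.00, 0.00], [0.03, 0.03]]),
        (0.8, [[0.00, 0.00], [0.02, 0.02]]),
    ]
    switch, index = spvq_encode([0.36, 0.66], previous, mean, branches)
    print(switch, index, spvq_decode(switch, index, previous, mean, branches))
```

With four branches the switch costs two bits per frame on top of the codebook index. Setting one coefficient to zero gives a non-predictive fallback, a common choice for limiting error propagation, although whether the paper does this is not stated in the abstract.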
In this paper we investigate the use of fricative and stop modelling and synthesis techniques with a spectral envelope reconstruction combined with noise reduction (SERNR) postfilter in mixed voiced-unvoiced multiband excitation coders. We perform a comparative analysis of a noise excitation approach operating at 1.75 kb/s and a fricatives and stops excitation technique operating at 0.4 kb/s. A novel SERNR postfiltering technique that significantly enhances the decoded speech is proposed and compared with the well-known adaptive spectral enhancement (ASE) filter.