In the present work, we survey several recently proposed speech parameterization methods based on the discrete Fourier transform (DFT) and the discrete wavelet packet transform (DWPT), and evaluate their performance on the speech recognition task. Specifically, to assess the practical value of these less-studied methods, we evaluate them in a common experimental setup and compare their performance against traditional techniques, such as the Mel-frequency cepstral coefficients (MFCC) and perceptual linear predictive (PLP) cepstral coefficients, which presently dominate the speech recognition field. In particular, using the well-established TIMIT speech corpus and the Sphinx-III speech recognizer, we present comparative results for 8 different speech parameterization techniques.
This paper presents a multi-sensor fusion strategy for a novel road-matching method designed to support real-time navigational features within advanced driving-assistance systems. Managing multiple hypotheses is a useful strategy for the road-matching problem. The multi-sensor fusion and multi-modal estimation are realized using a Dynamic Bayesian Network. Experimental results, using data from Antilock Braking System (ABS) sensors, a differential Global Positioning System (GPS) receiver and an accurate digital roadmap, illustrate the performance of this approach, especially in ambiguous situations.
Todor Ganchev, Mihalis Siafarikas, Iosif Mporas, and Tsenka Stoyanova, 'Wavelet basis selection for enhanced speech parametrization in speaker verification', International Journal of Speech Technology, vol. 17 (1): 27-36, June 2013, doi: https://doi.org/10.1007/s10772-013-9202-8. Published by Springer US. We study the inherent properties of nine wavelet functions and subsequently evaluate their applicability as basis functions in a speech parametrization scheme that is advantageous for speaker verification. Particularly, the inherent properties of nine candidate basis functions are initially analysed and their advantages and disadvantages are discussed. Subsequently, all candidates are employed in a well-proven speech parametrization scheme, and the resulting speech features are computed. Finally, these speech features are evaluated in a common experimental set-up on the speaker verification task. The experimental results, obtained on two well-known speaker recognition databases, show that the Battle-Lemarié wavelet function is the most advantageous one, among all other functions evaluated here, since it leads to the most beneficial speech descriptors. When compared to the baseline Mel-frequency cepstral coefficients (MFCC), a relative reduction of the equal error rate by 4.2 % was observed on the 2001 NIST speaker recognition evaluation database, and by 2.3 % on the Polycost speaker recognition database.
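For readers unfamiliar with the metric, the equal error rate (EER) and the "relative reduction" quoted above can be sketched as follows. This is only an illustrative sketch, not the evaluation code used by the authors: the function names, the threshold sweep, and the toy scores are all assumptions made for the example.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping a threshold over all observed scores.

    genuine  -- scores of true-speaker trials (higher means "accept")
    impostor -- scores of impostor trials
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)      # false rejection rate
        far = sum(s >= t for s in impostor) / len(impostor)   # false acceptance rate
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_reduction(baseline, proposed):
    """Relative reduction of an error rate, in percent of the baseline."""
    return 100.0 * (baseline - proposed) / baseline


# Toy scores, invented purely for illustration (not data from the paper).
toy_genuine = [0.9, 0.8, 0.7, 0.4]
toy_impostor = [0.6, 0.3, 0.2, 0.1]
toy_eer = equal_error_rate(toy_genuine, toy_impostor)
```

Note that a relative reduction is measured against the baseline error rate, so a 4.2 % relative reduction is a much smaller change in absolute percentage points.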
Exploiting the capabilities offered by the plethora of existing wavelets, together with the powerful set of orthonormal bases provided by wavelet packets, we construct a novel wavelet packet-based set of speech features that is optimized for the task of speaker verification. Our approach differs from previous wavelet-based work, primarily in the wavelet-packet tree design that follows the concept of critical bands, as well as in the particular wavelet basis function that has been used. In comparative experiments, we investigate several alternative speech parameterizations with respect to their usefulness for differentiating among human voices. The experimental results confirm that the proposed speech features outperform Mel-Frequency Cepstral Coefficients (MFCC) and previously used wavelet features on the task of speaker verification. A relative reduction of the equal error rate by 15%, 15% and 8% was observed for the proposed speech features, when compared to the wavelet packet features introduced by Farooq and Datta, the MFCC of Slaney, and the subband based cepstral coefficients of Sarikaya et al., respectively.
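The general idea of wavelet packet-based features can be illustrated with a minimal sketch. It makes two simplifying assumptions that differ from the work described above: it uses the Haar wavelet rather than the basis function chosen by the authors, and it builds a full uniform tree rather than a tree aligned with critical bands. All function names are hypothetical.

```python
import math


def haar_split(x):
    """One orthonormal Haar analysis step: (approximation, detail), each half length."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_leaves(x, depth):
    """Full wavelet packet tree: every node is split, not only the approximation.

    Assumes len(x) is divisible by 2**depth. Leaves are returned in natural
    (filter-bank) order, which is not the same as increasing frequency order.
    """
    nodes = [list(x)]
    for _ in range(depth):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def log_subband_energies(x, depth, floor=1e-12):
    """Log energy of each leaf subband; `floor` avoids log(0) on silent bands."""
    return [math.log(sum(c * c for c in leaf) + floor)
            for leaf in wavelet_packet_leaves(x, depth)]


# Toy frame, invented for illustration: 8 samples, depth 2 gives 4 subbands.
frame = [0.0, 1.0, 0.0, -1.0, 0.5, 0.5, -0.5, -0.5]
features = log_subband_energies(frame, depth=2)
```

In a cepstral-style front end, such log subband energies would typically be decorrelated afterwards (for example with a discrete cosine transform); a critical-band design would instead split only selected nodes so that subband widths approximate the auditory scale.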