An analytical model has been developed for the warped discrete Hartley transform cepstrum (WDHTC) in a recent work [1]. Along similar lines, the warped discrete cosine transform (WDCT) has since been modelled in a companion paper [2]. These were preceded by empirical studies of the WDCT cepstrum (WDCTC) as applied to speech feature extraction for vowel recognition and speaker identification [3]. In this paper, we derive the theoretical complex cepstrum (TCC) based on the warped discrete sine transform. We argue that the common recipe that has evolved through these papers may be used as a measure to compare analytically deducible frontend speech recognition schemes. In particular, we show that the WDCTC-based scheme outperforms both the present warped discrete sine transform cepstrum (WDSTC)-based scheme and the WDHTC-based scheme in terms of lower feature variance, owing to a reduced spectral dynamic range. The phoneme recognition performance of the WDCTC, WDHTC and WDSTC corroborates our analytical findings.
INTRODUCTION
The warped discrete cosine transform cepstrum (WDCTC) was proposed as a feature in a vowel recognition experiment [3]. Its improved performance over the mel-frequency cepstral coefficients (MFCC) in vowel recognition and speaker identification tasks was highlighted there. Several significant findings about the WDCTC have been reported [4,5]: (i) good vowel class separability, (ii) low variance, (iii) good codebook representation, (iv) robustness to noise, and (v) a closer approximation to a Gaussian distribution. Moreover, the MFCC has proved difficult to analyze [6], whereas the use of the WDCT in computing the WDCTC gives a significant advantage in analysis [2]. A preliminary comparison between the analytical models of the warped discrete Hartley transform cepstrum (WDHTC)-based and WDCTC-based schemes has been reported [1].
The strategy we adopt in this paper rests on the view that it is desirable to have a platform for comparing algorithms based on different transforms. Such a platform may be built by validating the analytically developed models against recognition experiments on the TIMIT database. This validation then lends credence to a rank ordering of the schemes based on different transforms, since we will be in a position to compare the analytical models of the schemes themselves. A warped discrete sine transform cepstrum (WDSTC)-based frontend extractor is first proposed and modelled analytically in this paper. As the next logical step, we compare the WDHTC-, WDCTC- and WDSTC-based analytical models.
Our approach here towards deriving the theoretical complex cepstrum (TCC) of the WDST is similar to the methodology in [1,2]. We sketch relevant details of the approach there, placing