Abstract-The asymptotic covariance matrix of the empirical cepstrum is analyzed. We show that for Gaussian processes, cepstral coefficients derived from smoothed periodograms are asymptotically uncorrelated and their variances multiplied by the sample size T tend to unity. For an autoregressive process and its autoregressive cepstrum estimate, somewhat weaker results hold.
Manuscript received September 4, 1991; revised June 15, 1992. The associate editor coordinating the review of this correspondence and approving it for publication was Prof. Georgios B. Giannakis.
N. Merhav is with the Department of Electrical Engineering, Technion-Israeli Institute of Technology, Haifa 32000, Israel.
C.-H. Lee is with the Speech Research Department, AT&T Bell Laboratories, Murray Hill, NJ 07974.
IEEE Log Number 9207542.

I. INTRODUCTION

Cepstral analysis is useful in the preprocessing of many speech recognition and speaker verification systems (see, e.g., [1]-[6]). This is based on strong experimental evidence that, among many types of feature vectors, the cepstrum provides the best performance in speech recognition [6] and speaker verification [2] applications. In light of this fact, it is of interest to investigate the asymptotic statistical properties of the empirical cepstral vector. We examine both analytically (Section II) and experimentally (Section III) the covariance matrix of this vector when it is extracted from a stationary random process, in two cases. In the first case, an underlying stationary Gaussian process is assumed and we confine interest to the cepstrum derived from the smoothed periodogram [7]. The cepstral components are shown to be asymptotically uncorrelated, and their variances, when multiplied by the sample size T, tend to unity as T → ∞. In the second case, an autoregressive (AR) process (not necessarily Gaussian) is assumed and we focus on the cepstrum derived from the empirical AR power spectral density (PSD), which is a parametric estimator of the PSD. Here the covariance matrix, when multiplied by T, tends to the identity matrix in the weak (Hilbert-Schmidt) norm sense, which is a weaker form of convergence than in the former case.
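The first case can be probed numerically with a small Monte Carlo experiment. The following is only a minimal sketch under assumptions that are not taken from the text: a Gaussian AR(1) test process with coefficient 0.5, a uniform (Daniell-type) circular moving average over 2m + 1 periodogram ordinates as the smoother, and the hypothetical function names `simulate_ar1`, `empirical_cepstrum`, and `scaled_cepstral_covariance`. It estimates the covariance matrix of the first few cepstral coefficients over many realizations and multiplies it by T; if the asymptotic statement applies at this sample size, the result should be roughly the identity matrix.

```python
import numpy as np


def simulate_ar1(rng, n_trials, T, a, burn_in=200):
    """Draw n_trials realizations of x[t] = a * x[t-1] + e[t], e ~ N(0, 1).

    The AR(1) model and its coefficient are illustrative assumptions.
    """
    e = rng.standard_normal((n_trials, T + burn_in))
    x = np.zeros_like(e)
    for t in range(1, T + burn_in):
        x[:, t] = a * x[:, t - 1] + e[:, t]
    return x[:, burn_in:]  # discard the start-up transient


def empirical_cepstrum(x, m, n_coef):
    """Cepstral coefficients c_1..c_n_coef from a smoothed periodogram.

    x has shape (n_trials, T). The uniform smoother is an assumed choice.
    """
    T = x.shape[-1]
    # Periodogram at the T Fourier frequencies.
    periodogram = np.abs(np.fft.fft(x, axis=-1)) ** 2 / T
    # Circular moving average over 2m + 1 neighbouring frequencies.
    smoothed = sum(
        np.roll(periodogram, k, axis=-1) for k in range(-m, m + 1)
    ) / (2 * m + 1)
    # Inverse DFT of the log-spectrum; it is even in frequency, so the
    # imaginary part is only rounding error and is dropped.
    cepstrum = np.fft.ifft(np.log(smoothed), axis=-1).real
    return cepstrum[..., 1:n_coef + 1]


def scaled_cepstral_covariance(n_trials=400, T=1024, a=0.5, m=8,
                               n_coef=5, seed=0):
    """Return T times the sample covariance of the cepstral vector."""
    rng = np.random.default_rng(seed)
    x = simulate_ar1(rng, n_trials, T, a)
    c = empirical_cepstrum(x, m, n_coef)
    return T * np.cov(c, rowvar=False)


cov_T = scaled_cepstral_covariance()
print(np.round(cov_T, 2))
```

Only low-order coefficients are examined because smoothing in frequency attenuates high-order cepstral fluctuations at finite T; the zeroth coefficient is excluded from the sketch for simplicity.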
Thus, in both cases the asymptotic covariance matrix is, in a sense, equivalent to the identity matrix independently of the underlying PSD.

This "orthonormality" property of the cepstral vector, which holds regardless of the PSD, is not shared by many other feature vectors commonly used in speech processing, e.g., the AR parameter vector, the vector of reflection coefficients, and the DFT coefficients. It is interesting to note, however, that the log-spectral energies (which are related to the cepstrum via a Fourier transform) do have the above-mentioned covariance orthonormality property under some conditions [10]. This will be discussed in more depth in Section II.

One implication of these results is that, essentially, only the cepstral means carry useful information regarding the PSD, while the cepstral variances are relatively insensitive to the PSD. This observation has also been supported experimentally b...