Abstract: A new signal classification approach is presented that is based upon modeling the dynamics of a system as they are captured in a reconstructed phase space. The modeling is done using full-covariance Gaussian mixture models of time-domain signatures, in contrast with current and previous work in signal classification, which typically focuses either on linear systems analysis using frequency content or on simple nonlinear machine learning models such as artificial neural networks. The proposed approach has strong theoretical foundations based on dynamical systems and topological theorems, resulting in a signal reconstruction that is asymptotically guaranteed to be a complete representation of the underlying system, given properly chosen parameters. The algorithm automatically calculates these parameters to form appropriate reconstructed phase spaces, requiring only the number of mixtures, the signals, and their class labels as input. Three separate data sets are used for validation: motor current simulations, electrocardiogram recordings, and speech waveforms. The results show that the proposed method is robust across these diverse domains, significantly outperforming the time delay neural network used as a baseline.
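As an illustration of the reconstruction step described in this abstract, the following is a minimal sketch of a time-delay embedding in Python with NumPy. The function name `embed` and the parameters `dim` and `lag` are hypothetical names chosen here, not taken from the paper, and the forward-lag indexing is one common convention; the authors' exact indexing and parameter selection are not given in the abstract.

```python
import numpy as np

def embed(signal, dim, lag):
    """Return the time-delay embedding of a 1-D signal.

    Row n is [x[n], x[n + lag], ..., x[n + (dim - 1) * lag]].
    (Assumed convention; a backward-lag form differs only in indexing.)
    """
    x = np.asarray(signal, dtype=float)
    n_points = len(x) - (dim - 1) * lag
    if n_points <= 0:
        raise ValueError("signal too short for this dim and lag")
    # Each column is the signal shifted by a multiple of the lag.
    return np.column_stack(
        [x[i * lag : i * lag + n_points] for i in range(dim)]
    )

points = embed(np.arange(10), dim=3, lag=2)
print(points.shape)  # (6, 3)
print(points[0])     # [0. 2. 4.]
```

Each row of `points` is one point in the reconstructed phase space; a statistical model such as a Gaussian mixture would then be fitted over these rows.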
This paper introduces a novel approach to the analysis and classification of time series signals using statistical models of reconstructed phase spaces. With sufficient dimension, such reconstructed phase spaces are, with probability one, guaranteed to be topologically equivalent to the state dynamics of the generating system, and, therefore, may contain information that is absent in analysis and classification methods rooted in linear assumptions. Parametric and nonparametric distributions are introduced as statistical representations over the multidimensional reconstructed phase space, with classification accomplished through methods such as Bayes maximum likelihood and artificial neural networks (ANNs). The technique is demonstrated on heart arrhythmia classification and speech recognition. This new approach is shown to be a viable and effective alternative to traditional signal classification approaches, particularly for signals with strong nonlinear characteristics.
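The maximum likelihood classification mentioned in this abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it assumes equal class priors, models each class with a single full-covariance Gaussian rather than a mixture or a nonparametric distribution, and uses synthetic signals (a sine wave and a logistic-map series) instead of the ECG or speech data. All function names are hypothetical.

```python
import numpy as np

def embed(signal, dim=2, lag=1):
    """Time-delay embedding; row n is [x[n], x[n+lag], ...]."""
    x = np.asarray(signal, dtype=float)
    n = len(x) - (dim - 1) * lag
    return np.column_stack([x[i * lag : i * lag + n] for i in range(dim)])

def fit_gaussian(points):
    """Fit one full-covariance Gaussian to the embedded points."""
    return points.mean(axis=0), np.cov(points, rowvar=False)

def log_likelihood(points, model):
    """Sum of Gaussian log-densities over all embedded points."""
    mean, cov = model
    diff = points - mean
    _, logdet = np.linalg.slogdet(cov)
    # Squared Mahalanobis distance of every point to the class mean.
    maha = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
    d = points.shape[1]
    return float(np.sum(-0.5 * (d * np.log(2 * np.pi) + logdet + maha)))

def classify(signal, models):
    """Pick the class whose model gives the highest log-likelihood."""
    points = embed(signal)
    return max(models, key=lambda label: log_likelihood(points, models[label]))

def sine(n, phase):
    """Synthetic class 1 (assumption): a sampled sine wave."""
    return np.sin(0.3 * np.arange(n) + phase)

def logistic(n, x0, r=3.99):
    """Synthetic class 2 (assumption): a logistic-map series."""
    out = np.empty(n)
    out[0] = x0
    for i in range(1, n):
        out[i] = r * out[i - 1] * (1.0 - out[i - 1])
    return out

models = {
    "sine": fit_gaussian(embed(sine(400, 0.0))),
    "logistic": fit_gaussian(embed(logistic(400, 0.3))),
}
print(classify(sine(400, 1.0), models))
print(classify(logistic(400, 0.61), models))
```

With equal priors, choosing the class with the largest summed log-likelihood is the Bayes decision under the fitted models. Replacing `fit_gaussian` with a multi-component mixture would bring the sketch closer to the parametric models the abstract describes.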
This paper introduces a novel time-domain approach to modeling and classifying speech phoneme waveforms. The approach is based on statistical models of reconstructed phase spaces, which offer significant theoretical benefits as representations that are known to be topologically equivalent to the state dynamics of the underlying production system. The lag and dimension parameters of the reconstruction process for speech are examined in detail, comparing common estimation heuristics for these parameters with corresponding maximum likelihood recognition accuracy over the TIMIT data set. Overall accuracies are compared with a Mel-frequency cepstral baseline system across five different phonetic classes within TIMIT, and a composite classifier using both cepstral and phase space features is developed. Results indicate that although the accuracy of the phase space approach by itself is still…
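The abstract does not say which estimation heuristics were compared. As one example of a heuristic often used to choose the lag, the sketch below picks the first lag at which the autocorrelation of the mean-removed signal drops to zero or below. Treat the choice of heuristic, the function name, and the `max_lag` parameter as assumptions made for illustration only.

```python
import numpy as np

def first_zero_autocorr_lag(signal, max_lag=100):
    """Return the first lag where the autocorrelation is <= 0.

    Returns None if no such lag exists within max_lag.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the mean before correlating
    for lag in range(1, min(max_lag, len(x) - 1) + 1):
        # Unnormalized autocorrelation at this lag.
        if np.dot(x[:-lag], x[lag:]) <= 0.0:
            return lag
    return None

# A sine with a 40-sample period decorrelates near a quarter period.
signal = np.sin(2 * np.pi * np.arange(400) / 40)
print(first_zero_autocorr_lag(signal))
```

For a periodic signal the returned lag falls close to a quarter of the period, where a sample and its delayed copy are least linearly related. The first minimum of the auto-mutual information is another heuristic that could be substituted here.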
This paper presents a novel method for speech recognition by utilizing nonlinear/chaotic signal processing techniques to extract time-domain based phase space features. By exploiting the theoretical results derived in nonlinear dynamics, a processing space called a reconstructed phase space can be generated where a salient model (the natural distribution of the attractor) can be extracted for speech recognition. To discover the discriminatory power of these features, isolated phoneme classification experiments were performed using the TIMIT corpus and compared to a baseline classifier that uses MFCC features. The results demonstrate that phase space features contain substantial discriminatory power, even though MFCC features outperformed the phase space features on direct comparisons. The authors conjecture that phase space and MFCC features used in combination within a classifier will yield increased accuracy for various speech recognition tasks.
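One simple way to approximate the natural distribution of an attractor is to count how often the embedded trajectory visits each cell of a grid laid over the reconstructed phase space. The sketch below does this for a two-dimensional embedding with NumPy's `histogram2d`; the bin count, the lag, and the synthetic sine signal are assumptions for illustration, and the paper's actual feature extraction may differ.

```python
import numpy as np

def occupancy_histogram(signal, lag=1, bins=10):
    """Estimate cell occupancy of a 2-D time-delay embedding.

    Returns the normalized counts (summing to 1) and the bin edges.
    """
    x = np.asarray(signal, dtype=float)
    # Points of the 2-D embedding are (x[n], x[n + lag]).
    counts, x_edges, y_edges = np.histogram2d(x[:-lag], x[lag:], bins=bins)
    return counts / counts.sum(), x_edges, y_edges

signal = np.sin(0.3 * np.arange(500))
probs, _, _ = occupancy_histogram(signal, lag=5, bins=8)
print(probs.shape)
print(round(float(probs.sum()), 6))
```

The flattened `probs` array could serve as a feature vector for a classifier. For the sine example the trajectory traces a closed loop, so the cells in the middle of the grid stay empty.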