Speech is a complex process that can break down in many different ways, leading to a variety of voice disorders. Dysarthria is a voice disorder in which individuals are unable to control one or more aspects of speech (articulation, breathing, voicing, or prosody), resulting in less intelligible speech. In this paper, we evaluate the accuracy of state-of-the-art automatic speech recognition systems (ASRs) on two dysarthric speech datasets and compare the results to ASR performance on control speech. The limits of ASR performance on different voices have not been explored since the field shifted from generative models of speech recognition to deep neural network architectures. To test how far the field has come in recognizing disordered speech, we evaluate two ASR systems: (1) Carnegie Mellon University's Sphinx Open Source Recognition, and (2) Google® Speech Recognition. While (1) uses generative models of speech recognition, (2) uses deep neural networks. As expected, (2) achieved lower word error rates (WER) on dysarthric speech than (1); even so, control speech had a WER 59% lower than dysarthric speech. Future studies should focus not only on making ASRs robust to environmental noise, but also on making them more robust to different voices.
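Word error rate, the metric reported above, is the word-level edit distance between a reference transcript and an ASR hypothesis, divided by the number of reference words. A minimal sketch (the function name and example strings are illustrative, not from the paper's evaluation):

```python
import numpy as np

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming table over word positions.
    d = np.zeros((len(ref) + 1, len(hyp) + 1), dtype=int)
    d[:, 0] = np.arange(len(ref) + 1)
    d[0, :] = np.arange(len(hyp) + 1)
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1,          # deletion
                          d[i, j - 1] + 1,          # insertion
                          d[i - 1, j - 1] + cost)   # substitution
    return d[len(ref), len(hyp)] / len(ref)

print(wer("the quick brown fox", "the quick brown fox"))  # 0.0
print(wer("the quick brown fox", "the quik brown"))       # 0.5 (1 sub + 1 del over 4 words)
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.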
Abstract: Mutual Information (MI) is often used for feature selection when developing classifier models, but estimating the MI for a subset of features is often intractable. We demonstrate that, under assumptions of conditional independence, the MI between a subset of features and the class can be expressed in terms of the Conditional Mutual Information (CMI) between pairs of features. Selecting features with the highest CMI, however, turns out to be a hard combinatorial problem. In this work, we apply two global methods, the Truncated Power Method (TPower) and Low Rank Bilinear Approximation (LowRank), to solve the feature selection problem. These algorithms provide good approximations to the NP-hard CMI-based feature selection problem. We experimentally demonstrate the effectiveness of these procedures across multiple datasets and compare them with existing MI-based global and iterative feature selection procedures.
I. INTRODUCTION

High dimensional data can pose a significant challenge to learning methods due to the curse of dimensionality [1]. Feature selection is a prominent dimensionality reduction technique that selects a small subset of features based on certain relevancy criteria. Apart from reducing data dimensionality, feature selection provides insights into the data, prevents overfitting, and reduces computational costs for learning, which ultimately results in better learned models. Depending on whether label information is available, feature selection falls into two categories: supervised and unsupervised. Supervised feature selection procedures are broadly classified into three groups: wrapper, filter, and embedded methods [2]. Wrapper procedures select features for a specific learning model. Filter methods, on the other hand, are classifier agnostic: feature selection and model learning are treated as two separate steps. These procedures rely on statistical characteristics of the data, such as correlation, distance, and information, to select the most important features. Embedded procedures incorporate feature selection as part of the learning model, as seen in neural nets.

We focus on the model-independent filter procedures for feature selection because of their classifier independence, simplicity, and computational efficiency [3]. Specifically, we consider Mutual Information (MI) based criteria for feature selection. MI is a probabilistic measure that captures the 'correlation' between random variables (see Figure (1)). Whereas standard correlation captures only linear relationships between variables, MI can capture non-linear dependencies between variables [4]. Since our aim is to develop better classifier models using feature selection, we select the best subset of features that together have the highest MI with the class variable. Estimating MI between a subset of features requires the estimation
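The claim that MI captures non-linear dependencies where correlation fails can be illustrated directly: for a symmetric discrete variable x and y = x², the Pearson correlation is near zero while the MI is strictly positive. A minimal sketch using a plug-in (empirical-counts) MI estimator, which is an illustrative choice and not the estimator used in the paper:

```python
import numpy as np

def mutual_information(x, y) -> float:
    """Plug-in MI estimate (in bits) between two discrete variables,
    computed from empirical joint and marginal counts."""
    n = len(x)
    joint, px, py = {}, {}, {}
    for xi, yi in zip(x, y):
        joint[(xi, yi)] = joint.get((xi, yi), 0) + 1
        px[xi] = px.get(xi, 0) + 1
        py[yi] = py.get(yi, 0) + 1
    mi = 0.0
    for (xi, yi), c in joint.items():
        pxy = c / n
        # log2( p(x,y) / (p(x) p(y)) )
        mi += pxy * np.log2(pxy * n * n / (px[xi] * py[yi]))
    return mi

# y = x^2 over symmetric x: essentially zero linear correlation,
# but clearly positive MI (for deterministic y, MI equals H(y)).
rng = np.random.default_rng(0)
x = rng.integers(-2, 3, size=10_000)   # uniform over {-2, ..., 2}
y = x ** 2
print(abs(np.corrcoef(x, y)[0, 1]) < 0.1)   # True: no linear signal
print(mutual_information(x, y) > 0.5)       # True: strong dependence
```

The same plug-in estimator extends to the conditional case I(x; y | z) by averaging MI estimates within each stratum of z, which is the pairwise CMI quantity the selection criterion above is built from.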