Perceptual measurement is still the most common method for assessing disordered speech in clinical practice. The subjectivity of such measures, which stems largely from human judgment, together with their inability to localize alterations at the level of individual speech units, strongly motivates an objective evaluation tool. Deep Neural Networks are of interest here because of their increasing performance in speech applications and, more importantly, because they are no longer regarded as black boxes. The work presented here is the first step in a long-term research project that aims to determine which linguistic units contribute most to the maintenance or loss of intelligibility in speech disorders. In this context, we study a CNN trained on normal speech for phone classification and tested on pathological speech. The aim of this first study is to analyze the response of the CNN model to disordered speech, so that its ability to provide relevant knowledge about speech severity or loss of intelligibility can be studied later. The classifier's performance scores correlate very strongly with perceptual severity and intelligibility measures, which is very promising for future work.
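The comparison between classifier scores and perceptual ratings can be illustrated with a rank correlation. The sketch below is only an illustration: the abstract does not say which correlation coefficient was used, so Spearman's rho is an assumption, and the per-speaker values, the severity scale, and the helper names (`ranks`, `pearson`, `spearman`) are all hypothetical.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-speaker values, invented for illustration only.
phone_accuracy = [0.82, 0.74, 0.61, 0.55, 0.40]
perceptual_severity = [0.5, 1.0, 1.5, 2.5, 3.0]  # assumed scale: higher = more severe

rho = spearman(phone_accuracy, perceptual_severity)
print(f"Spearman rho = {rho:.2f}")
```

With these invented values the classifier accuracy falls monotonically as perceived severity rises, so the coefficient is at its negative extreme. Real data would give a weaker value.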
Beyond the impressive performance deep learning has achieved in several tasks, one of the most important factors for its continued progress is the growing body of work on interpretability, especially in a medical context. In recent work, we presented the competitive performance of a CNN-based model trained on normal speech for French phone classification and showed that it correlates well with different perceptual measures when exposed to disordered speech. This paper extends that work by focusing on interpretability. The goal is to gain insight into how neural representations shape the final phone classification task, so that this insight can later be used to explain the loss of intelligibility in disordered speech. To this end, an original framework is proposed. It relies, first, on neural activity and a novel per-neuron representation defined with respect to phone classification and, second, on the identification of a set of neurons devoted to detecting specific phonetic traits in normal speech. When the model is faced with disordered speech, a degradation of this set of neurons is observed, which demonstrates a loss of specific phonetic traits in some of the patients involved and shows the potential of the proposed approach to inform about speech alteration.
We recently proposed a general analytical framework, called Neuro-based Concept Detector (NCD), to interpret the deep representations of a DNN. Based on the activation patterns of hidden neurons, this framework highlights the ability of neurons to detect a specific concept related to the final task. Its main strength is that it provides an interpretability tool for any type of DNN performing a classification task, whatever the application domain. Using NCD, we demonstrated the emergence of phonetic features in the classification layers of a CNN-based model for French phone classification. The emergence of this concept, which is of great interest in the field of clinical phonetics, was studied on healthy speech. Applied to Head and Neck Cancers, the framework automatically reflects the level of impairment of the phonetic features produced by a patient, as supported by strong correlations with perceptual assessments performed by clinical experts. The objective of the work presented here is to validate the proposed framework on new patient populations with very different pathologies (neurodegenerative diseases/dysarthria and vocal dysfunction/dysphonia). The robustness of the approach to variability in the phonetic content of the read text is also studied.
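The idea of scoring hidden neurons by how well their activation pattern detects a concept can be sketched as follows. This is not the NCD definition, which the abstract does not give. It is a simple stand-in that assumes a neuron "fires" when its activation exceeds a threshold and scores it by the difference between its firing rate on frames that carry the concept and on frames that do not. The function name, the threshold, the toy activations, and the choice of "nasal" as the concept are all hypothetical.

```python
def neuron_concept_scores(activations, has_concept, threshold=0.0):
    """Score each neuron as a detector of a binary concept.

    activations: list of frames, each a list of per-neuron activations.
    has_concept: list of 0/1 flags, one per frame.
    Returns one score per neuron: hit rate minus false-alarm rate, in [-1, 1].
    """
    pos = [row for row, c in zip(activations, has_concept) if c]
    neg = [row for row, c in zip(activations, has_concept) if not c]
    if not pos or not neg:
        raise ValueError("need frames both with and without the concept")
    scores = []
    for j in range(len(activations[0])):
        # Fraction of concept frames on which neuron j is active.
        hit_rate = sum(row[j] > threshold for row in pos) / len(pos)
        # Fraction of non-concept frames on which neuron j is active.
        false_alarm_rate = sum(row[j] > threshold for row in neg) / len(neg)
        scores.append(hit_rate - false_alarm_rate)
    return scores


# Toy data invented for illustration: 6 frames x 3 hidden neurons.
activations = [
    [0.9, 0.0, 0.3],
    [0.7, 0.1, 0.0],
    [0.8, 0.0, 0.4],
    [0.0, 0.6, 0.2],
    [0.0, 0.5, 0.0],
    [0.1, 0.0, 0.5],
]
has_nasal = [1, 1, 1, 0, 0, 0]  # hypothetical phonetic-feature labels

scores = neuron_concept_scores(activations, has_nasal)
best_neuron = max(range(len(scores)), key=lambda j: scores[j])
print(scores, best_neuron)
```

Under this stand-in, a neuron with a score near 1 behaves like a detector of the concept, a score near 0 means its activity is unrelated to it, and a drop in the score on a patient's speech would be one way to express the kind of degradation the abstract describes.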