Towards Interpreting Deep Learning Models to Understand Loss of Speech Intelligibility in Speech Disorders Step 2: Contribution of the Emergence of Phonetic Traits

Abderrazek, Sondes; Fredouille, Corinne; Ghio, Alain; Lalain, Muriel; Meunier, Christine; Woisard, Virginie

doi:10.1109/icassp43922.2022.9746198

Cited by 3 publications

(4 citation statements)

References 9 publications

“…To accomplish this, we calculated the Sørensen-Dice Index (SDI; Sørensen, 1948; Abderrazek et al, 2022), which is a similarity measure that takes into account both the shared elements and the size of the sets being compared to measure similarity between phoneme inventories of several language pairs. The SDI ranges from 0 to 1, where 0 indicates no similarity, and 1 indicates complete similarity.…”

Section: Linguistic Similarity Analysis (mentioning)

confidence: 99%

Exploring audiovisual speech perception in monolingual and bilingual children in Uzbekistan

Nematova, Zinszer, Jasińska

2023

Preprint

This study aimed to investigate the development of audiovisual speech perception in monolingual Uzbek and bilingual Uzbek-Russian-speaking children, focusing on the impact of language experience on audiovisual speech perception and the role of visual phonetic (i.e., mouth movements corresponding to phonetic/lexical information) and temporal (i.e., timing of speech signals) cues. Three hundred twenty-one children in Tashkent, Uzbekistan, between the ages of 4 and 10 years discriminated /ba/ and /da/ syllables across three conditions: auditory-only, audiovisual phonetic (i.e., the sound accompanied by mouth movements), and audiovisual temporal (i.e., sound onset/offset accompanied by mouth opening/closing). Effects of modality (audiovisual phonetic, audiovisual temporal, or audio-only cues), age, group (monolingual vs. bilingual), and their interactions were tested using a Bayesian regression model. Participants performed better in the audiovisual phonetic modality than in the auditory modality. However, no significant difference between monolingual and bilingual children was observed in any modality. This finding stands in contrast with earlier studies. We attribute the contrasting findings of our study and the existing literature to the cross-linguistic similarity of the language pairs involved. When the languages spoken by bilinguals exhibit substantial linguistic similarity, there may be an increased necessity to disambiguate speech signals, leading to a greater reliance on audiovisual cues. The limited phonological similarity between Uzbek and Russian might have minimized bilinguals’ need to rely on visual speech cues, contributing to the lack of group differences in our study.
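The citation statement above describes the Sørensen-Dice Index as a set-similarity measure ranging from 0 to 1. A minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), is given below. The function name, the toy inventories, and the handling of two empty sets are assumptions made for illustration; they are not taken from the citing study.

```python
def sorensen_dice_index(a, b):
    """Return 2|A ∩ B| / (|A| + |B|), treating both inputs as sets."""
    set_a, set_b = set(a), set(b)
    total = len(set_a) + len(set_b)
    if total == 0:
        # Assumed convention: two empty inventories count as identical.
        return 1.0
    return 2 * len(set_a & set_b) / total


# Hypothetical toy inventories, not real phoneme data from any language.
inventory_x = {"p", "b", "t", "d", "k"}
inventory_y = {"p", "b", "t", "q"}

# Three shared symbols out of 5 + 4 total gives 6/9.
print(sorensen_dice_index(inventory_x, inventory_y))
```

Identical inventories give 1 and disjoint inventories give 0, matching the range stated in the quoted passage.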

“…In this section, we briefly describe the general analytic framework, Neuro-Concept Detector (NCD), proposed in [20]. This framework was designed for the interpretability of the deep representations of a DNN performing a classification task.…”

Section: The NCD Framework (mentioning)

confidence: 99%

“…Conversely, if $S_{n,T_x} < -0.25$, then the neuron $n$ is considered as a detector of the opposite phonetic feature $T_x$, noted [-$T_x$]. Experiments conducted in [20] revealed interesting results. Indeed, it showed that interpretable neurons with phonetic feature encoding properties emerge in the fully connected layers of the CNN.…”

Section: The NCD Framework (mentioning)

confidence: 99%
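The thresholding rule quoted above can be sketched as follows. Only the negative branch (a score below −0.25 marks a detector of the opposite feature) appears in the quoted text. The symmetric positive branch, the "[+Tx]" label, the function name, and the example scores are assumptions inferred from the word "Conversely", and how the score itself is computed is not covered by the excerpt.

```python
def label_neuron(score, threshold=0.25):
    """Label a neuron from its score for a phonetic feature Tx.

    Only the negative rule is stated in the quoted passage; the positive
    rule and the "[+Tx]" notation are assumed by symmetry.
    """
    if score < -threshold:
        return "[-Tx]"  # detector of the opposite feature (from the quote)
    if score > threshold:
        return "[+Tx]"  # assumed mirror case, not stated in the excerpt
    return None  # not treated as a detector of either polarity


# Hypothetical scores for three neurons, for illustration only.
for s in (-0.40, 0.10, 0.31):
    print(s, label_neuron(s))
```

Strict inequalities are used, so a score exactly at the threshold is left unlabeled, in line with the "<" in the quoted rule.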

“…From our knowledge, we can mention [18], involving adults suffering from laryngeal cancers and children with cleft lip and palate, or [19] involving dysarthric and dysphonic speakers, children with cleft lip or palate, speakers with pathological speech secondary to hearing impairment, laryngectomized and glossectomized speakers. Recently, we have proposed an original framework, Neuro-Concept Detector (NCD), for the interpretability of Deep Neural Networks (DNNs) [20]. This framework highlights the ability of hidden neurons to detect a specific concept related to the final task.…”

Section: Introduction (mentioning)

confidence: 99%

Validation of the Neuro-Concept Detector framework for the characterization of speech disorders: A comparative study including Dysarthria and Dysphonia

Abderrazek, Fredouille, Ghio et al. 2022

Interspeech 2022

Recently, we have proposed a general analytical framework, called Neuro-based Concept Detector (NCD), to interpret the deep representations of a DNN. Based on the activation patterns of hidden neurons, this framework highlights the ability of neurons to detect a specific concept related to the final task. Its main strength is to provide an interpretability tool for any type of DNN performing a classification task, whatever the application domain. Thanks to NCD, we have demonstrated the emergence of phonetic features in the classification layers of a CNN-based model for French phone classification. The emergence of this concept, of great interest in the field of clinical phonetics, has been studied considering healthy speech. Applied to Head and Neck Cancers, we have shown that this framework automatically reflects the level of impairment of the phonetic features produced by a patient, which is supported by the strong correlations with perceptual assessments performed by clinical experts. The objective of the work presented here is to validate the proposed framework by confronting it with new patient populations that have very different pathologies (neurodegenerative diseases/Dysarthria and vocal dysfunction/Dysphonia). The robustness of the approach to the phonetic content variability of read text is also studied.

Interpreting Deep Representations of Phonetic Features via Neuro-Based Concept Detector: Application to Speech Disorders Due to Head and Neck Cancer

Abderrazek, Fredouille, Ghio et al. 2023

IEEE/ACM Trans. Audio Speech Lang. Process.
