The goal of this study is to investigate the correlation between different levels of speech organization, which reflect the physiological maturation of the vocal tract structures and of the brain regions associated with speech and language, and the basic electroencephalogram (EEG) rhythms, which reflect the age-related maturation of brain structures, in children aged 4-11 years. A complex method of analysis was used, comprising EEG recording with clinical and spectral analysis of the EEG; dichotic listening, identification of the profile of functional lateral asymmetry (PFLA), and assessment of the child's phonemic hearing; recording and linguistic and acoustic analysis of child speech; and identification of speech characteristics reflecting the formation of its different levels. Two complementary experimental series were conducted. The first examined the correlation between EEG parameters, speech features, dichotic listening, the PFLA, and phonemic hearing across ages 4-11 years; the second examined the specificity of EEG patterns in children at different stages of reading skill formation. The results showed a correlation between the acoustic and linguistic features of child speech and brain activity. The analysis of EEG and acoustic features of child speech revealed a correlation between pitch and pitch range values in spontaneous speech and theta rhythm intensity in the EEG. High values of pitch and its variation in younger children (4-6 years) are related to the intensity of the theta rhythm in the EEG pattern, as this rhythm is most pronounced in younger children. The alpha rhythm was found to be asymmetrically localized in children with clear pronunciation of words (which determines the intelligibility of their speech), as is typical for 6.5- to 11-year-old children.
The formation of reading skills in a child is associated with a change in the characteristics of the alpha rhythm: from irregular, unstable, low-frequency, and low-amplitude in children who are beginning to master reading to regular, asymmetrically localized, and of medium and low amplitude in children who read words and phrases. The specifics of the relation between brain activity and the different levels of speech formation at different ages in childhood are discussed.
With the rapid development of speech assistants, adapting server-oriented automatic speech recognition (ASR) solutions to run directly on the device has become crucial. For on-device speech recognition tasks, researchers and industry prefer end-to-end ASR systems, as they can be made resource-efficient while maintaining higher quality than hybrid systems. However, building end-to-end models requires a significant amount of speech data. Personalization, which mainly means handling out-of-vocabulary (OOV) words, is another challenging task associated with speech assistants. In this work, we consider building an effective end-to-end ASR system in low-resource setups with a high OOV rate, represented by the Babel Turkish and Babel Georgian tasks. We propose a method of dynamic acoustic unit augmentation based on the Byte Pair Encoding with dropout (BPE-dropout) technique. The method tokenizes utterances non-deterministically to extend the tokens' contexts and to regularize their distribution, which helps the model recognize unseen words. It also reduces the need to search for an optimal subword vocabulary size. The technique provides a steady improvement in regular and personalized (OOV-oriented) speech recognition tasks (at least 6% relative word error rate (WER) and 25% relative F-score) at no additional computational cost. Owing to the use of BPE-dropout, our monolingual Turkish Conformer has achieved a competitive result of 22.2% character error rate (CER) and 38.9% WER, which is close to the best published multilingual system.
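The non-deterministic tokenization behind BPE-dropout can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this work: the function name `bpe_dropout_tokenize`, the toy merge table `MERGES`, and the dropout value are all hypothetical and chosen only for illustration. At each step every applicable merge is skipped with probability `dropout`, and the highest-priority surviving merge is applied; segmentation stops when no merge survives.

```python
import random

# Hypothetical toy merge table, ordered from highest to lowest priority.
MERGES = [("e", "r"), ("l", "o"), ("lo", "w"), ("low", "er")]


def bpe_dropout_tokenize(word, merges, dropout=0.1, rng=None):
    """Segment `word` with BPE merges, skipping each candidate merge
    with probability `dropout` at every step."""
    rng = rng or random.Random()
    # Lower rank means the merge was learned earlier and has higher priority.
    rank = {pair: i for i, pair in enumerate(merges)}
    tokens = list(word)  # start from single characters
    while len(tokens) > 1:
        # Gather applicable merges over adjacent token pairs; drop each at random.
        candidates = [
            (rank[(a, b)], i)
            for i, (a, b) in enumerate(zip(tokens, tokens[1:]))
            if (a, b) in rank and rng.random() >= dropout
        ]
        if not candidates:
            break  # nothing survived dropout (or nothing applies): stop
        _, i = min(candidates)  # highest priority, leftmost position
        tokens[i:i + 2] = [tokens[i] + tokens[i + 1]]
    return tokens


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        print(bpe_dropout_tokenize("lower", MERGES, dropout=0.3, rng=rng))
```

With `dropout=0.0` the sketch reduces to ordinary deterministic BPE, and with `dropout=1.0` it falls back to characters. Intermediate values yield a different segmentation of the same word on different calls, which is what exposes the model to a wider range of token contexts during training.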