Investigating techniques for low resource conversational speech recognition

Laurent, Antoine; Fraga-Silva, Thiago; Lamel, Lori; Gauvain, Jean‐Luc

doi:10.1109/icassp.2016.7472824

Cited by 3 publications

(4 citation statements)

References 18 publications

“…iii. Can code-switching be detected using LID systems [17]? what is the minimum time span to be successful?…”

Section: Research Questions (mentioning)

confidence: 99%

“…First, the speech files were automatically segmented into acoustically homogeneous segments, which ideally correspond to speaker turns and/or to a given language or stable acoustic conditions (broad band/telephone band...). These segments were then automatically transcribed using different ASR systems [23,17] in parallel: a French system, a multi-dialect Arabic system (predominantly Lebanese) and an Algerian Arabic (dialect) system. The systems were trained on several hundreds of hours of speech from a large number of speakers.…”

Section: Speech Technologies For Code-switching (mentioning)

confidence: 99%

Addressing Code-Switching in French/Algerian Arabic Speech

2017

Self Cite

This study focuses on code-switching (CS) in French/Algerian Arabic bilingual communities and investigates how speech technologies, such as automatic data partitioning, language identification and automatic speech recognition (ASR) can serve to analyze and classify this type of bilingual speech. A preliminary study carried out using a corpus of Maghrebian broadcast data revealed a relatively high presence of CS Algerian Arabic as compared to the neighboring countries Morocco and Tunisia. Therefore this study focuses on code switching produced by bilingual Algerian speakers who can be considered native speakers of both Algerian Arabic and French. A specific corpus of four hours of speech from 8 bilingual French Algerian speakers was collected. This corpus contains read speech and conversational speech in both languages and includes stretches of code-switching. We provide a linguistic description of the code-switching stretches in terms of intra-sentential and intersentential switches, the speech duration in each language. We report on some initial studies to locate French, Arabic and the code-switched stretches, using ASR system word posteriors for this pair of languages.

“…All of these speech modifications drastically degrade the performance of automatic speech recognition (ASR) systems when the speaker wears an oxygen mask [11]. Using recent speech recognition systems trained with normal speech [12,13,14], the Word Error Rate (WER) obtained for speech with the oxygen mask doubles in comparison to that of normal speech from the same speaker. In order to build accurate ASR models for military aircraft pilots, the speech variations needs to be clearly identified and quantified.…”

Section: Introduction (mentioning)

confidence: 99%

Modeling the Effect of Military Oxygen Masks on Speech Characteristics

Elie, Gauvain, Gauvain et al. 2021

Interspeech 2021

Self Cite

Wearing an oxygen mask changes the speech production of speakers. It indeed modifies the vocal apparatus and perturbs the articulatory movements of the speaker. This paper studies the impact of the oxygen mask of military aircraft pilots on formant trajectories, both dynamically (variations of the formants at an utterance level) and globally (mean value at the utterance level) for 12 speakers. A comparative analysis of speech collected with and without an oxygen mask shows that the mask has a significant impact on the formant trajectories, both on the mean values and on the formant variations at the utterance level. This impact is strongly dependent on the speaker and also on the mask model. These observations suggest that the articulatory movements of the speaker are modified by the presence of the mask. These observations are validated via a preliminary ASR experiment that uses a data augmentation technique based on articulatory perturbations that are driven by our experimental observations.

“…To this end, we propose to compare French vowel production variation in Algerian Arabic-French bilinguals and in French (FR) native speakers. Furthermore, we will compare their French productions (FR-Alg) to their speech productions of Algerian Arabic (AA) in CS context [6,7]. The aim of the study is to shed some light on the pronunciation variation Algerian Arabic-French bilinguals produce in both languages and in CS speech.…”

Section: Introduction (mentioning)

confidence: 99%

Studying Vowel Variation in French-Algerian Arabic Code-switched Speech

Wottawa, Amazouz, Adda-Decker et al. 2018

Interspeech 2018

Self Cite

Algerian Arabic-French bilinguals show phonetic variation with respect to vowel timbre in both their languages. Our study aims to automatically identify vowel variants frequently produced by such bilinguals. To that end, the speech corpus FACST, containing French and Algerian Arabic code-switched speech, was analyzed. A second corpus with native French speakers (NCCFr) was used to provide a reference baseline and to compare vowel variants across the two speaker groups. Three experiments were carried out: first, the French speech of both corpora was aligned with a French acoustic model including parallel nearest-neighbor vowel variants in its pronunciation dictionary. Second, the Arabic speech was aligned using the same acoustic model with parallel vowel variants in its dictionary. Finally, we tested whether peripheral vowels in Algerian Arabic-French bilinguals are more often centralized than in French native speech by allowing schwa as a competing variant. The results show that French natives and Algerian Arabic-French bilinguals globally have a comparable amount of vowel variation in French. However, French natives have stable high vowels whereas bilinguals tend to produce stable low and back vowels. In the centralization experiment, Algerian bilinguals favor the centralization of mid, open and back vowels.
