Proceedings of the 1st International Workshop on AI for Smart TV Content Production, Access and Delivery 2019
DOI: 10.1145/3347449.3357480

Gender Representation in French Broadcast Corpora and Its Impact on ASR Performance

Abstract: This paper analyzes gender representation in four major corpora of French broadcast. Because these corpora are widely used within the speech processing community, they are primary material for training automatic speech recognition (ASR) systems. As gender bias has been highlighted in numerous natural language processing (NLP) applications, we study the impact of the gender imbalance in TV and radio broadcast on the performance of an ASR system. This analysis shows that women are under-represented in our data i…

Citations: cited by 25 publications (10 citation statements)
References: 19 publications
“…Pioneer work from (Adda-Decker and Lamel, 2005) found better performance on women's voices, while a preliminary research on YouTube automatic caption system found better recognition rate of male speech but no gender-difference in a follow-up study (Tatman and Kasten, 2017). Recent work on hybrid ASR systems observed that gender imbalance in data could lead to decreased ASR performance on the gender category least represented (Garnerin et al, 2019). This last study was conducted on French broadcast data in which women account for only 35% of the speakers.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Pioneer work from (Adda-Decker and Lamel, 2005) found better performance on women's voices, while a preliminary research on YouTube automatic caption system found better recognition rate of male speech (Tatman, 2017) but no gender-difference in a follow-up study (Tatman and Kasten, 2017). Recent work on hybrid ASR systems observed that gender imbalance in data could lead to decreased ASR performance on the gender category least represented (Garnerin et al, 2019). This last study was conducted on French broadcast data in which women account for only 35% of the speakers.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…For instance, ASR systems have been shown to struggle with speech variance due to gender, age, speech impairment, race, and accents. Several studies on different languages have found gender differences: although most studies report that female speech is recognised better than male speech (Arabic [1], English [2][3][4], and French [3]), the reverse pattern is also found (French [5], English [6]), although no difference in the recognition of male and female speech was found in a follow-up study of the latter study [7] nor was a difference found in [5]. [1] found that speakers younger than 30 years of age were better recognised than those older than 30 years.…”
Section: Introduction (citation type: mentioning)
confidence: 99%