Mahault Garnerin scite author profile

Mahault Garnerin

6 Publications

25 Citation Statements Received

57 Citation Statements Given

Affiliations

Grenoble Alpes University

Publications

Gender Representation in French Broadcast Corpora and Its Impact on ASR Performance

Garnerin, Rossato, Besacier. 2019

This paper analyzes the gender representation in four major corpora of French broadcast. Because these corpora are widely used within the speech processing community, they are primary material for training automatic speech recognition (ASR) systems. As gender bias has been highlighted in numerous natural language processing (NLP) applications, we study the impact of the gender imbalance in TV and radio broadcast on the performance of an ASR system. This analysis shows that women are under-represented in our data in terms of speakers and speech turns. We introduce the notion of speaker role to refine our analysis and find that women are even fewer within the Anchor category corresponding to prominent speakers. The disparity in available data between the two genders causes performance to decrease for women. However, this global trend can be counterbalanced for speakers who are used to speaking in the media when a sufficient amount of data is available.

Investigating the Impact of Gender Representation in ASR Training Data: a Case Study on Librispeech

Garnerin, Rossato, Besacier. 2021

In this paper we question the impact of gender representation in training data on the performance of an end-to-end ASR system. We create an experiment based on the Librispeech corpus and build 3 different training corpora varying only the proportion of data produced by each gender category. We observe that although our system is overall robust to gender balance or imbalance in the training data, it nonetheless depends on the match between the individuals present in the training and testing sets.

MaSS: A Large and Clean Multilingual Corpus of Sentence-aligned Spoken Utterances Extracted from the Bible

Boito, Havard, Garnerin et al. 2019. Preprint

Gender Representation in Open Source Speech Resources

Garnerin, Rossato, Besacier. 2020. Preprint

With the rise of artificial intelligence (AI) and the growing use of deep-learning architectures, the question of ethics, transparency and fairness of AI systems has become a central concern within the research community. We address transparency and fairness in spoken language systems by proposing a study of gender representation in speech resources available through the Open Speech and Language Resource platform. We show that finding gender information in open source corpora is not straightforward and that gender balance depends on other corpus characteristics (elicited/non-elicited speech, low/high resource language, speech task targeted). The paper ends with recommendations to researchers about metadata and gender information, in order to ensure better transparency of the speech systems built using such corpora.

MaSS - Multilingual corpus of Sentence-aligned Spoken utterances

Boito, Havard, Garnerin et al. 2019

Gender Representation in French Broadcast Corpora and Its Impact on ASR Performance

Garnerin, Rossato, Besacier. 2019. Preprint
