Raphaël Duroselle scite author profile

Raphaël Duroselle

4 Publications

19 Citation Statements Received

140 Citation Statements Given

Affiliations

École Polytechnique, Institut Polytechnique de Paris

Publications

Modeling and Training Strategies for Language Recognition Systems

Duroselle, Sahidullah, Jouvet et al. 2021

Automatic speech recognition is complementary to language recognition. Language recognition systems exploit this complementarity by using frame-level bottleneck features extracted from neural networks trained on a phone recognition task. Recent methods apply frame-level bottleneck features extracted from an end-to-end sequence-to-sequence speech recognition model. In this work, we study an integrated approach to training the speech recognition feature extractor and the language recognition modules. We show that for both classical phone recognition and end-to-end sequence-to-sequence features, sequential training of the two modules is not the optimal strategy. The feature extractor can be improved by supervision with the language identification loss, either in a fine-tuning step or in a multi-task training framework. Besides, we notice that end-to-end sequence-to-sequence bottleneck features are on par with classical phone recognition bottleneck features without requiring a forced alignment of the signal with target tokens. However, for sequence-to-sequence models, the architecture seems to play an important role: the Conformer architecture leads to much better results than the conventional stacked DNN approach, and can even be trained directly with the LID module in an end-to-end approach.

Unsupervised Regularization of the Embedding Extractor for Robust Language Identification

Duroselle, Jouvet, Illina 2020

State-of-the-art spoken language identification systems consist of three modules: a frame-level feature extractor, a segment-level embedding extractor and a final classifier. The performance of these systems degrades when there is a mismatch between training and testing data. Most domain adaptation methods focus on adaptation of the final classifier. In this article, we propose a model-based unsupervised domain adaptation of the segment-level embedding extractor. The approach consists of modifying the loss function used for training the embedding extractor. We introduce a regularization term based on the maximum mean discrepancy loss. Experiments were performed on the RATS corpus with transmission channel mismatch between telephone and radio channels. We obtained the same language identification performance as supervised training on the target domains but without using labeled data from these domains.
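
The regularization idea in this abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel, a biased estimate of the squared maximum mean discrepancy (MMD), and a simple weighted sum with the classification loss; the function names, the `bandwidth` and `weight` parameters, and the kernel choice are illustrative assumptions, not details taken from the paper.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two vectors given as lists of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sets of embeddings."""
    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
        return total / (len(xs) * len(ys))
    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))

def regularized_loss(classification_loss, source_emb, target_emb, weight=1.0):
    """Hypothetical combination: supervised loss on labeled source data plus
    an MMD penalty computed on unlabeled target-domain embeddings."""
    return classification_loss + weight * mmd_squared(source_emb, target_emb)

# Toy embeddings: the target set is the source set shifted by a constant,
# standing in for a channel mismatch (for example telephone versus radio).
source = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]]
target = [[v + 2.0 for v in x] for x in source]
penalty = mmd_squared(source, target)
```

Because the penalty only needs embeddings from both domains, not target labels, it is compatible with the unsupervised setting the abstract describes.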

Metric Learning Loss Functions to Reduce Domain Mismatch in the x-Vector Space for Language Recognition

Duroselle, Jouvet, Illina 2020

State-of-the-art language recognition systems are based on discriminative embeddings called x-vectors. Channel and gender distortions produce mismatch in the x-vector space, where embeddings corresponding to the same language are not grouped in a unique cluster. To control this mismatch, we propose to train the x-vector DNN with metric learning objective functions. Combining a classification loss with the metric learning n-pair loss improves language recognition performance. Such a system achieves a robustness comparable to a system trained with a domain adaptation loss function but without using the domain information. We also analyze the mismatch due to channel and gender, in comparison to language proximity, in the x-vector space. This is achieved using the Maximum Mean Discrepancy divergence measure between groups of x-vectors. Our analysis shows that using the metric learning loss function reduces gender and channel mismatch in the x-vector space, even for languages only observed on one channel in the training set.
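
For readers unfamiliar with the n-pair loss mentioned in this abstract, here is a minimal sketch of its commonly used form: each anchor embedding is paired with one positive from the same class, and the positives of the other classes act as negatives. The function names and the toy vectors are assumptions for illustration; the abstract does not specify the exact variant or how it is weighted against the classification loss.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as lists of floats."""
    return sum(a * b for a, b in zip(u, v))

def n_pair_loss(anchors, positives):
    """N-pair loss over N classes (for example languages).

    anchors[i] and positives[i] are embeddings of the same class; positives[j]
    for j != i serve as negatives for anchors[i].
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        pos_score = dot(anchors[i], positives[i])
        neg_sum = sum(
            math.exp(dot(anchors[i], positives[j]) - pos_score)
            for j in range(n) if j != i
        )
        total += math.log1p(neg_sum)  # log(1 + sum of exponentiated margins)
    return total / n

# Toy case with two classes: well-grouped pairs versus swapped pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
grouped = n_pair_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped = n_pair_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
```

The loss is lower when same-class embeddings score higher against each other than against other classes, which is the clustering behaviour the abstract aims for. Note that no domain label (channel or gender) appears anywhere in the computation.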

SERB, a nano-satellite dedicated to the Earth-Sun relationship

Meftah, Bamas, Cambournac et al. 2016

The Solar irradiance and Earth Radiation Budget (SERB) mission is an innovative proof-of-concept nano-satellite, with three ambitious scientific objectives. The nano-satellite aims at measuring on the same platform the absolute value of the total solar irradiance (TSI) and its variability, the ultraviolet (UV) solar spectral variability, and the different components of the Earth radiation budget. SERB is a joint project between CNES (Centre National d'Etudes Spatiales), Ecole polytechnique, and LATMOS (Laboratoire Atmospheres, Milieux, Observations Spatiales) scheduled for a launch in 2020–2021. It is a three-unit CubeSat (X-CubeSat II), developed by students from École polytechnique. Critical components of instrumental payloads of future large missions (coatings, UV filters, etc.) can acquire the technical maturity by flying in a CubeSat. Nano-satellites also represent an excellent alternative for instrumentation testing, allowing for longer flights than rockets. Moreover, specific scientific experiments can be performed by nano-satellites. This paper is intended to present the SERB mission and its scientific objectives. © (2016) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
