“…In the world of speech recognition, training a single recognizer for multiple languages is not a thematic stranger [3] from Hidden Markov Model (HMM) based models [17,18], hybrid models [19] to end-to-end neural based models with CTC [20,21] or sequence-to-sequence models [22,5,23,24,25,26], with the last approach being inspired by the success of multilingual machine translation [1,2]. The literature especially mentions the merits of disclosing the language identity (when the utterance is supposed to belong to a single language) to the model, whose architecture is designed to incorporate the language information.…”