The primary goal of FBK's systems submission to the IWSLT 2022 offline and simultaneous speech translation tasks is to reduce model training costs without sacrificing translation quality. As such, we first question the need for ASR pre-training, showing that it is not essential to achieve competitive results. Second, we focus on data filtering, showing that a simple method that looks at the ratio between source and target characters yields a quality improvement of 1 BLEU. Third, we compare different methods to reduce the detrimental effect of the audio segmentation mismatch between training data manually segmented at sentence level and inference data that is automatically segmented. Towards the same goal of training cost reduction, we participate in the simultaneous task with the same model trained for offline ST. The effectiveness of our lightweight training strategy is shown by the high score obtained on the MuST-C en-de corpus (26.7 BLEU) and is confirmed in high-resource data conditions by a 1.6 BLEU improvement on the IWSLT2020 test set over last year's winning system.
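The character-ratio filter mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the thresholds or the exact definition of the ratio, so the function names, the `min_ratio` and `max_ratio` values, and the sample pairs below are all hypothetical and are not taken from the FBK system.

```python
from typing import Iterable, List, Tuple


def char_ratio(source: str, target: str) -> float:
    """Return len(source) / len(target) in characters (inf if target is empty)."""
    src_len = len(source.strip())
    tgt_len = len(target.strip())
    if tgt_len == 0:
        return float("inf")
    return src_len / tgt_len


def filter_by_char_ratio(
    pairs: Iterable[Tuple[str, str]],
    min_ratio: float = 0.5,  # hypothetical lower bound, not from the paper
    max_ratio: float = 2.0,  # hypothetical upper bound, not from the paper
) -> List[Tuple[str, str]]:
    """Keep only pairs whose source/target character ratio lies within the bounds."""
    return [
        (src, tgt)
        for src, tgt in pairs
        if min_ratio <= char_ratio(src, tgt) <= max_ratio
    ]


if __name__ == "__main__":
    # Illustrative examples only, not drawn from any corpus.
    samples = [
        ("Hello world", "Hallo Welt"),
        ("Yes", "Ja, das ist absolut und vollkommen richtig"),
        ("text", ""),
    ]
    for src, tgt in filter_by_char_ratio(samples):
        print(src, "|", tgt)
```

Pairs whose lengths diverge strongly are often misaligned or noisy, which is the intuition behind a filter of this kind; in practice the bounds would be tuned on the training data.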
Gender inclusivity in language has become a central topic of debate and research. Its application in the cross-lingual contexts of human and machine translation (MT), however, remains largely unexplored. Here, we discuss Gender-Neutral Translation (GNT) as a form of gender inclusivity in translation and advocate for its adoption for MT models, which have been found to perpetuate gender bias and discrimination. To this aim, we review a selection of relevant institutional guidelines for Gender-Inclusive Language (GIL) to collect and systematize useful strategies of gender neutralization. Then, we discuss GNT and its scenarios of use, devising a list of desiderata. Finally, we identify the main technical challenges to the implementation of GNT in MT. Throughout these contributions we focus on translation from English into Italian, as representative of salient linguistic transfer problems, due to the different rules for gender marking in their grammar.