“…There are two main research paradigms for ST: the end-to-end model and the cascaded system (Sperber and Paulik, 2020; nie, 2019). End-to-end ST. Previous work (Bérard et al, 2016; Duong et al, 2016) has demonstrated the potential of end-to-end ST, which has attracted considerable attention (Vila et al, 2018; Salesky et al, 2018, 2019b; Di Gangi et al, 2019a; Bahar et al, 2019a; Di Gangi et al, 2019b; Inaguma et al, 2020). Pre-training (Weiss et al, 2017; Bérard et al, 2018; Bansal et al, 2018; Stoian et al, 2020) and multi-task learning (Vydana et al, 2020) have been shown to significantly improve performance.…”