“…For example, RoBERTa (Liu et al., 2019), CamemBERT (Martin et al., 2020) and GELECTRA (Chan et al., 2020) achieve near-human performance on SQuAD (Rajpurkar et al., 2018), FQuAD (d'Hoffschmidt et al., 2020; Heinrich et al., 2021) and GermanQuAD (Möller et al., 2021), respectively. However, for other low-resource languages, such as Vietnamese, the performance of pre-trained language models remains significantly below that of humans (Nguyen et al., 2022). These difficulties can be attributed to the underdevelopment of Vietnamese monolingual language models.…”