“…For NER, PhoBERT_large produces 1.1 points higher F1 than PhoBERT_base. In addition, PhoBERT_base obtains 2+ points higher F1 than the previous SOTA feature- and neural-network-based models VnCoreNLP-NER and BiLSTM-CNN-CRF (Ma and Hovy, 2016).

NER (F1):
    BiLSTM-CNN-CRF (Ma and Hovy, 2016)        88.6
    VNER (Nguyen et al., 2019b)               89.6
    BiLSTM-CNN-CRF + ETNLP [♠]                91.1
    VnCoreNLP-NER + ETNLP [♠]                 91.3
    XLM-R_base (our result)                   92.0
    XLM-R_large (our result)                  92.8
    PhoBERT_base                              93.6
    PhoBERT_large                             94.7

NLI (accuracy):
    BiLSTM-max (Conneau et al., 2018)         66.4
    mBiLSTM (Artetxe and Schwenk, 2019)       72.0
    multilingual BERT (Devlin et al., 2019)   69.5
    XLM MLM+TLM (Conneau and Lample, 2019)    76.6
    XLM-R_base (Conneau et al., 2020)         75.4
    XLM-R_large (Conneau et al., 2020)        79.7
    PhoBERT_base                              78.5
    PhoBERT_large                             80.0

[♠]: models trained with the set of 15K BERT-based ETNLP word embeddings (Vu et al., 2019).…”
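The NER scores above are F1 values over predicted entities. As a minimal sketch (not the paper's actual evaluation script), entity-level F1 can be computed by treating each entity as a (start, end, type) span and counting exact matches between gold and predicted sets; the spans below are hypothetical:

```python
def entity_f1(gold, pred):
    """Entity-level F1: an entity counts as correct only if its
    span boundaries and type both match a gold entity exactly."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)                          # exact span+type matches
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: 3 gold entities, 2 predicted correctly,
# 1 predicted with the wrong type (LOC mislabeled as ORG).
gold = [(0, 2, "PER"), (5, 6, "LOC"), (9, 11, "ORG")]
pred = [(0, 2, "PER"), (5, 6, "ORG"), (9, 11, "ORG")]
print(round(entity_f1(gold, pred), 4))  # precision = recall = 2/3
```

In practice, benchmark evaluations typically use a standard scorer (e.g. the CoNLL script or seqeval) rather than a hand-rolled metric, but the counting logic is the same.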