Pre-trained language models (PTMs) have achieved remarkable improvements on many NLP tasks, and many variants of PTMs have been proposed, for example GPT, GPT-2, and GPT-3 (Radford et al., 2018, 2019; Brown et al., 2020), BERT (Devlin et al., 2019), XLNet (Yang et al., 2019), ALBERT (Lan et al., 2019), ERNIE, BART (Lewis et al., 2020), and RoBERTa (Liu et al., 2019b). To incorporate external knowledge into PTMs, the model structure is modified and knowledge-aware pre-training tasks are designed (Zhang et al., 2019; Liu et al., 2020b; Sun et al., 2021; Liu et al., 2020a; Su et al., 2021). For example, ERNIE 3.0 (Sun et al., 2021) prepends knowledge triples, e.g., (Andersen, Write, Nightingale), to the original input sentence and designs a task to predict the relation "Write" in the triple.
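To make this task concrete, the following is a minimal Python sketch of the input construction only, assuming a simple string serialization with [MASK]/[SEP] markers; the `Triple` class and `build_knowledge_input` helper are hypothetical names for illustration, not ERNIE 3.0's actual implementation.

```python
# Illustrative sketch (not ERNIE 3.0's code): serialize a knowledge triple
# ahead of the input sentence and mask the relation so the model must
# recover it from context, mirroring the knowledge-aware task above.
from dataclasses import dataclass

MASK = "[MASK]"
SEP = "[SEP]"

@dataclass
class Triple:
    head: str       # e.g., "Andersen"
    relation: str   # e.g., "Write" -- the token the model must predict
    tail: str       # e.g., "Nightingale"

def build_knowledge_input(triple: Triple, sentence: str) -> tuple[str, str]:
    """Return (masked_input, target_relation).

    The triple is placed in front of the sentence, with the relation
    replaced by [MASK]; the prediction target is the original relation.
    """
    masked_input = f"{triple.head} {MASK} {triple.tail} {SEP} {sentence}"
    return masked_input, triple.relation

if __name__ == "__main__":
    t = Triple("Andersen", "Write", "Nightingale")
    masked, target = build_knowledge_input(
        t, "The Nightingale is a fairy tale by Hans Christian Andersen."
    )
    print(masked)  # Andersen [MASK] Nightingale [SEP] The Nightingale is ...
    print(target)  # Write
```

Under this reading, the masked relation can be scored with the same masked-token prediction head used for ordinary masked language modeling, so the triple and the sentence jointly provide the context for recovering "Write".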