2021
DOI: 10.1007/978-3-030-84186-7_31
A Robustly Optimized BERT Pre-training Approach with Post-training

Cited by 1,669 publications (2,534 citation statements). References 17 publications.
“…Finally, although RoBERTa (Liu et al, 2019) has exhibited improvements over BERT on many different tasks, we found that, in this case, using pretrained RoBERTa instead of BERT does not yield much improvement. The predictions of the two models are highly correlated, with 0.95 correlation over all datasets' predictions.…”
Section: Results (mentioning)
confidence: 65%
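The 0.95 figure in the excerpt above is an agreement measure over the two models' predictions pooled across datasets. The sketch below shows how such a correlation could be computed; the arrays, values, and variable names are illustrative placeholders, not the citing paper's data.

```python
# Hypothetical sketch: measuring agreement between two fine-tuned models'
# predictions, as in the cited BERT vs. RoBERTa comparison.
import numpy as np
from scipy.stats import pearsonr

# Placeholder prediction scores (e.g. positive-class probabilities) collected
# over all evaluation examples from each model.
bert_scores = np.array([0.91, 0.12, 0.77, 0.05, 0.64])
roberta_scores = np.array([0.88, 0.15, 0.81, 0.07, 0.60])

# Pearson correlation over the pooled predictions; a value near 1.0
# (the citing paper reports 0.95) indicates the two models behave alike.
r, p_value = pearsonr(bert_scores, roberta_scores)
print(f"Pearson r = {r:.2f} (p = {p_value:.3g})")
```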
“…The embedding of [CLS] is mainly learned from NSP. However, a recent study shows that NSP does not contribute much to the sentence representation learning [33]. SBERT-WK can make use of the existing semantics in BERT as much as possible, but it cannot increase the semantics in BERT.…”
Section: Related Work (mentioning)
confidence: 97%
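For context on the excerpt above, the sketch below shows how the [CLS] embedding it refers to is typically taken from BERT's output using the Hugging Face transformers library; the checkpoint and input sentence are placeholders, not the cited SBERT-WK setup.

```python
# Minimal sketch of extracting the [CLS] embedding discussed in the quote.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Sentence embeddings from BERT.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# The first token of the last hidden state is [CLS]; its vector is the
# sentence-level representation that NSP pre-training shapes.
cls_embedding = outputs.last_hidden_state[:, 0, :]
print(cls_embedding.shape)  # torch.Size([1, 768])
```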
“…Representative autoregressive language models are word2vec (Mikolov et al., 2013), Glove (Pennington et al., 2014), ELMO (Peters et al., 2018), GPT (Radford et al., 2018), GPT-2 (Radford et al., 2019) and XLNet (Yang et al., 2019), and they are more suitable for text generation task. Representative autoencoding language models are Bert (Devlin et al., 2018), Bert-wwm (Cui et al., 2019), RoBERTa (Liu et al., 2019), ALBERT (Lan et al., 2019), ERNIE (Sun et al., 2019a), ERNIE-2 (Sun et al., 2019b) and ELECTRA (Clark et al., 2020), and they are more suitable for entity and relation extraction.…”
Section: Related Work (mentioning)
confidence: 99%
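As a rough illustration of the autoregressive vs. autoencoding distinction drawn in the excerpt above, the sketch below loads one model of each kind through the Hugging Face transformers pipelines; the specific checkpoints (gpt2, bert-base-uncased) and prompts are examples chosen here, not the configurations used in the cited works.

```python
from transformers import pipeline

# Autoregressive model: predicts the next token left-to-right,
# which suits text generation.
generator = pipeline("text-generation", model="gpt2")
print(generator("Relation extraction aims to", max_new_tokens=20)[0]["generated_text"])

# Autoencoding model: reconstructs masked tokens from bidirectional context,
# which suits token-level tasks such as entity and relation extraction.
filler = pipeline("fill-mask", model="bert-base-uncased")
for candidate in filler("BERT is an [MASK] language model."):
    print(candidate["token_str"], round(candidate["score"], 3))
```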