2020
DOI: 10.48550/arxiv.2005.10200
Preprint

BERTweet: A pre-trained language model for English Tweets

Abstract: We present BERTweet, the first public large-scale pre-trained language model for English Tweets. Our BERTweet is trained using the RoBERTa pre-training procedure (Liu et al., 2019), with the same model configuration as BERT base (Devlin et al., 2019). Experiments show that BERTweet outperforms strong baselines RoBERTa base and XLM-R base (Conneau et al., 2020), producing better performance results than the previous state-of-the-art models on three Tweet NLP tasks: Part-of-speech tagging, Named-entity recognition and text classification.
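
The abstract states that BERTweet is publicly released. As a hedged illustration (not code from the paper), the sketch below shows how such a checkpoint could be loaded through the Hugging Face Transformers library; the hub id "vinai/bertweet-base" and the use of the first-token hidden state as a sentence representation are assumptions of this sketch.

```python
# Hedged sketch: loading a public BERTweet checkpoint with Hugging Face Transformers.
# Assumes `transformers` and `torch` are installed and the hub id "vinai/bertweet-base" is available.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base")
model = AutoModel.from_pretrained("vinai/bertweet-base")

# Encode one tweet and take the first-token (<s>) hidden state as a sentence representation.
tweet = "BERTweet is a pre-trained language model for English Tweets :smiley:"
inputs = tokenizer(tweet, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

sentence_repr = outputs.last_hidden_state[:, 0, :]   # shape (1, 768) for a BERT-base-sized model
print(sentence_repr.shape)
```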

Cited by 40 publications (47 citation statements)
References 7 publications

“…These experiments are purely academic, and TwHIN is not currently being applied to detecting offensive content at Twitter. For our experimental purposes, we construct a baseline approach that fine-tunes a large-scale language model for offensive content detection using linear probing and binary categorical loss; we compare the performance of RoBERTa [24] and BERTweet [28] language model, the latter of which has been pretrained on Twitter-domain data. We evaluate on two collections of tweets where some tweets have been labeled "offensive" or violating guidelines.…”
Section: Recommendation and Prediction (mentioning)
confidence: 99%
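
The statement above describes a baseline that fine-tunes a language model for offensive-content detection "using linear probing and binary categorical loss", comparing RoBERTa and BERTweet. The sketch below illustrates that general recipe only (it is not the cited paper's code): the encoder is frozen and a single linear head is trained with binary cross-entropy. The hub id, learning rate, and toy batch are assumptions.

```python
# Hedged sketch (not the cited paper's code): a linear-probing baseline for binary
# offensive-content detection on top of a frozen BERTweet encoder.
# Assumptions: `transformers`/`torch` installed, hub id "vinai/bertweet-base", toy labels.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base")
encoder = AutoModel.from_pretrained("vinai/bertweet-base")

# Linear probing: keep the pre-trained encoder frozen and train only a linear head.
for param in encoder.parameters():
    param.requires_grad = False

head = nn.Linear(encoder.config.hidden_size, 1)      # one logit for the binary decision
criterion = nn.BCEWithLogitsLoss()                   # binary loss over offensive / not offensive
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)

def training_step(tweets, labels):
    """Run one optimization step on a batch of tweets with 0/1 'offensive' labels."""
    batch = tokenizer(tweets, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():                            # encoder is frozen, no gradients needed
        features = encoder(**batch).last_hidden_state[:, 0, :]
    logits = head(features).squeeze(-1)
    loss = criterion(logits, labels.float())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with hypothetical labels (0 = not offensive, 1 = offensive).
print(training_step(["have a nice day", "you are the worst"], torch.tensor([0, 1])))
```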
“…Several works followed BERT, proposing variations using more targeted data. One example is BERTweet [25], in which the authors propose an extension to deal with tweets (short messages from Twitter).…”
Section: Unsupervised Text Analysis (mentioning)
confidence: 99%
“…For person ReID, the backbones are well-known Deep Convolutional Neural Network (DCNN) architectures: ResNet50 [36], OSNet [37], and DenseNet121 [38], all of them previously trained over the ImageNet dataset [5]. For authorship verification, we consider BERT [24], BERTweet [25], and T5 [26] architectures.…”
Section: B. Implementation Details (mentioning)
confidence: 99%
“…Besides, some other language-specific BERTs models developed over time for monolingual outperformed multilingual model mBERT: AraBERT (Arabic) [18], AlBERTo (Italian) [115], FinBERT (Finnish) [19], CamemBERT(French) [83], Flaubert (French [76]), BERT-CRF (Portuguese) [137], BERTje (Dutch) [141], RuBERT (Russian) [74] and BERTtweet (A pre-trained language model for English Tweets) [97]. However, to best of our knowledge, not every model has yet been tested for HS domain except AraBERT [12] [38] and AlBERTo [116] which shown better performance for HS detection.…”
Section: Overview Of Deep-learning Records (mentioning)
confidence: 99%