2022
DOI: 10.3390/app12115720

BERT Models for Arabic Text Classification: A Systematic Review

Abstract: Bidirectional Encoder Representations from Transformers (BERT) has gained increasing attention from researchers and practitioners as it has proven to be an invaluable technique in natural language processing. This is mainly due to its unique features, including its ability to predict words conditioned on both the left and the right context, and its ability to be pretrained on the plain-text corpora that are abundantly available on the web. As BERT gained more interest, more BERT models were introduced to sup…

Cited by 50 publications (24 citation statements)
References 67 publications (104 reference statements)
“…The least-performing transformer-based method is XLM-R. A plausible explanation is that XLM-R is pretrained on multilingual data and is usually outperformed by monolingual models pretrained with large language-specific datasets and rich vocabularies (Virtanen et al., 2019; Alammary, 2022). MARBERT achieved comparable performance to AraBERT in the micro-averaged F1-score but suffered a performance gap in the macro-averaged F-scores, although MARBERT is pretrained on more data.…”
Section: Results (mentioning); confidence: 99%
“…MARBERT achieved comparable performance to AraBERT in the micro-averaged F1-score but suffered a performance gap in the macro-averaged F-scores, although MARBERT is pretrained on more data. A systematic review of BERT models for various Arabic text classification problems (Alammary, 2022) shows that AraBERT outperformed MARBERT in several tasks and vice versa. It also shows that a large pretraining corpus does not necessarily guarantee better performance.…”
Section: Results (mentioning); confidence: 99%
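The gap between micro- and macro-averaged F1 reported in these excerpts is easier to see with a small numeric illustration. The sketch below uses scikit-learn on hypothetical, imbalanced toy labels (not data from the reviewed studies): the micro average is dominated by the majority class, while the macro average drops sharply when a rare class is missed.

```python
# Minimal sketch: why micro- and macro-averaged F1 can diverge on imbalanced labels.
# The labels below are hypothetical placeholders, not results from the cited papers.
from sklearn.metrics import f1_score

# Imbalanced toy labels: class 0 dominates, class 2 is rare.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]  # the rare class is never predicted

# Micro F1 pools all decisions, so the majority class dominates the score.
print("micro F1:", f1_score(y_true, y_pred, average="micro", zero_division=0))
# Macro F1 averages per-class scores, so the missed rare class pulls it down.
print("macro F1:", f1_score(y_true, y_pred, average="macro", zero_division=0))
```

On this toy example the micro F1 stays at 0.8 while the macro F1 falls to roughly 0.62, which mirrors how a model can look comparable on micro-averaged scores yet lag on macro-averaged ones.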
“…Given the effectiveness of transformer-based models, various transformer models have been used in Arabic sentiment analysis. The most widely used models are Multilingual BERT, AraBERT, and MARBERT [9]. The author in [10] addressed sentiment analysis in Modern Standard Arabic (MSA) and other Arabic dialects such as Levantine, Egyptian, and Gulf.…”
Section: Related Work (mentioning); confidence: 99%
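For context on how such models are typically applied, the sketch below shows a common Hugging Face transformers setup for using an Arabic BERT checkpoint as a sequence classifier. The checkpoint name aubmindlab/bert-base-arabertv2 and the three-label configuration are assumptions made for illustration; they are not specified in the cited works.

```python
# Minimal sketch (assumed checkpoint name, untrained classification head) of loading
# an Arabic BERT model for text classification with Hugging Face transformers.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "aubmindlab/bert-base-arabertv2"  # assumed AraBERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Tokenize a short Arabic sentence and obtain class logits (illustration only;
# the classification head still needs fine-tuning on labeled data).
inputs = tokenizer("هذا المنتج رائع", return_tensors="pt")
logits = model(**inputs).logits
print(logits.shape)  # (1, num_labels)
```

In practice the same pattern applies to MARBERT or Multilingual BERT by swapping the checkpoint name, followed by standard fine-tuning on the task's labeled data.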
“…Like others, the region was rife with rumours and fake news. Developing a classification mechanism for Arabic requires an understanding of the syntactic structure of words so that it can represent and manipulate them accurately enough for precise categorization [14]. Research into Arabic text classifiers remains limited compared with the volume of research on English text classifiers.…”
Section: Introduction (mentioning); confidence: 99%