“…Compared to fully fine-tuned models, adapter models incorporate only a few task-specific parameters for each new task. The BERT-based experiments were conducted on three pretrained language models capable of handling Hebrew text: (a) XLM-RoBERTa (Conneau et al., 2019), a multilingual model based on the RoBERTa architecture (Liu et al., 2019); (b) HeBERT (Chriqui & Yahav, 2021), a monolingual BERT model trained on Hebrew data; and (c) AlephBERT (Seker et al., 2022), another monolingual Hebrew BERT model with a 52K-token vocabulary, trained via masked-token prediction. Adapter variants of these models were built with lightweight solutions that train only a small number of task-specific parameters, using bottleneck adapters (Houlsby et al., 2019) and mix-and-match (MAM) adapters (He et al., 2021).…”
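
To make the adapter idea concrete, the sketch below is a minimal PyTorch illustration of a Houlsby-style bottleneck adapter: a small down-projection/up-projection pair with a residual connection that would be inserted into each transformer layer, so that only these few parameters are trained for a new task while the pretrained encoder (e.g., AlephBERT or HeBERT) stays frozen. The module name, hidden size, and bottleneck size are illustrative assumptions, not the papers' exact configurations.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Illustrative Houlsby-style bottleneck adapter: down-project,
    non-linearity, up-project, and a residual connection. In adapter
    training, only these parameters are updated per task."""
    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)  # down-projection
        self.up = nn.Linear(bottleneck_size, hidden_size)    # up-projection
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The residual connection preserves the frozen pretrained
        # representation; the adapter learns a small task-specific correction.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Usage sketch: shapes only, with an assumed hidden size of 768.
adapter = BottleneckAdapter(hidden_size=768, bottleneck_size=64)
x = torch.randn(2, 16, 768)   # (batch, sequence length, hidden size)
print(adapter(x).shape)       # torch.Size([2, 16, 768])
```

With a 768-dimensional hidden state and a 64-dimensional bottleneck, each adapter adds roughly 100K parameters per layer, which is how these variants keep the per-task footprint small compared to full fine-tuning.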