2020
DOI: 10.48550/arxiv.2009.07238
Preprint

Lessons Learned from Applying off-the-shelf BERT: There is no Silver Bullet

Victor Makarenkov,
Lior Rokach

Abstract: One of the challenges in the NLP field is training large classification models, a task that is both difficult and tedious. It is even harder when GPU hardware is unavailable. The increased availability of pre-trained and off-the-shelf word embeddings, models, and modules aims at easing the process of training large models and achieving competitive performance. We explore the use of off-the-shelf BERT models, share the results of our experiments, and compare their results to those of LSTM networks and more sim…
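For context, a minimal sketch of what applying an off-the-shelf BERT model to a classification task typically looks like with the Hugging Face Transformers library is shown below. The checkpoint name, label count, and example texts are illustrative assumptions, not the exact setup used in the paper.

```python
# Minimal sketch: off-the-shelf BERT for text classification
# (Hugging Face Transformers). Checkpoint, labels, and inputs are assumed.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # assumed off-the-shelf checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=2  # e.g., a binary classification task
)

texts = ["An example document to classify.",
         "Another short example."]
batch = tokenizer(texts, padding=True, truncation=True,
                  max_length=512, return_tensors="pt")

with torch.no_grad():
    logits = model(**batch).logits  # untuned head; fine-tuning would follow
predictions = logits.argmax(dim=-1)
print(predictions.tolist())
```

In practice the classification head is randomly initialized, so a fine-tuning step on labeled data would follow before the predictions are meaningful.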

Cited by 2 publications (2 citation statements)
References 10 publications (14 reference statements)
“…Many studies pre-trained BERT models with biomedical literature (Lee et al., 2020; Beltagy et al., 2019) or clinical notes (Alsentzer et al., 2019; Peng et al., 2019) to develop the domain-specific language model, and these studies showed that domain-specific models generally outperform off-the-shelf models in varied clinical NLP tasks, such as clinical NER (Yang et al., 2020b; Greenspan et al., 2020), relation extraction, sentence similarity (Peng et al., 2019), negation detection (Lin et al., 2020), and concept normalization. However, for clinical text classification, which generally requires a series of clinical notes as input (e.g., automatic ICD coding, clinical outcome prediction), BERT does not always perform well, probably because of its restriction in computational resources and the fixed-length setting (Li and Yu, 2020; Makarenkov and Rokach, 2020). In keeping more closely with the spirit of Transformers, our work is also built on top of Transformers with an emphasized focus on effective representation of document sequences, such as all of a patient's clinical notes in an inpatient visit.…”
Section: Transformer Models in Clinical Domain
confidence: 99%
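The fixed-length restriction mentioned in the statement above refers to BERT's 512-token input limit, which long clinical documents easily exceed. A minimal sketch of the two common workarounds, hard truncation versus chunking, is given below; the checkpoint and the synthetic note are assumptions for illustration, not the approach of any cited paper.

```python
# Minimal sketch of BERT's fixed-length restriction: a long document
# exceeds the 512-token limit and is either truncated or split into chunks.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed checkpoint
long_note = "Patient presents with ... " * 400  # stand-in for a long clinical note

token_ids = tokenizer(long_note, add_special_tokens=False)["input_ids"]
print(f"Full note length: {len(token_ids)} tokens (BERT limit is 512)")

# Option 1: hard truncation -- everything past 512 tokens is discarded.
truncated = tokenizer(long_note, truncation=True, max_length=512)

# Option 2: split into fixed-size chunks so no text is lost,
# at the cost of encoding each chunk separately.
chunk_size = 510  # leave room for [CLS] and [SEP]
chunks = [token_ids[i:i + chunk_size]
          for i in range(0, len(token_ids), chunk_size)]
print(f"Number of chunks to encode separately: {len(chunks)}")
```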
“…Similarly, the XED multilingual dataset for emotion detection, catering for a total of 32 languages, has been evaluated using language-specific BERT models (Öhman et al., 2020). Lastly, Makarenkov and Rokach (2020) explore several off-the-shelf BERT models, showing that the complexity and computational cost of BERT do not guarantee improved predictive performance on classification tasks. This is especially relevant in cases where small domain-specific datasets are used, which are also imbalanced due to the minority class being under-represented.…”
Section: Related Work
confidence: 99%