2020
DOI: 10.1007/978-3-030-43887-6_58
How to Pre-train Your Model? Comparison of Different Pre-training Models for Biomedical Question Answering

Abstract: Using deep learning models on small-scale datasets can result in overfitting. To overcome this problem, the process of pre-training a model and then fine-tuning it on the small-scale dataset has been used extensively in domains such as image processing. Similarly, for question answering, pre-training and fine-tuning can be done in several ways. Commonly, reading comprehension models are used for pre-training, but we show that other types of pre-training can work better. We compare two pre-training models based on re…

Cited by 7 publications (5 citation statements)
References 17 publications
“…However, a WordPiece vocabulary would need to be constructed from biomedical corpora. The reasons for using the original BERT base vocabulary are as follows: first, it keeps BioBERT compatible with BERT, so a BERT model pre-trained on general-domain corpora can be reused and the current BERT remains simpler to use; and second, any new terms in the biomedical domain can still be represented and fine-tuned using the original BERT WordPiece vocabulary (Kamath et al., 2020).…”
Section: Methods (mentioning, confidence: 99%)
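The compatibility argument above rests on how WordPiece handles out-of-vocabulary words: unseen biomedical terms are split into known subword pieces rather than mapped to [UNK]. A minimal sketch illustrating this, assuming the Hugging Face transformers package and the public bert-base-uncased checkpoint (neither is specified by the cited works):

```python
# Sketch only: shows how the original BERT WordPiece vocabulary still covers
# biomedical terms by splitting them into known subword pieces.
# Assumes the Hugging Face `transformers` package and the public
# "bert-base-uncased" checkpoint (not taken from the cited papers).
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

for term in ["myocardial infarction", "erythropoietin"]:
    pieces = tokenizer.tokenize(term)
    # Rare biomedical words come back as several WordPiece subunits, so they
    # can still be represented and fine-tuned without building a new vocabulary.
    print(term, "->", pieces)
```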
“…PubMedBERT [66], BioMegatron [215], Yoon et al. [284], Jeong et al. [89], Chakraborty et al. [30], Kamath et al. [100], Du et al. [52], Yoon et al. [283], Zhou et al. [300], Akdemir et al. [5], He et al. [78], Amherst et al. [200], Kommaraju et al. [112], for COVID-19 [55,120,170,201], Soni et al. [222], Mairittha et al. [147]. Dialogue Systems: Zeng et al.…”
Section: Question Answering (mentioning, confidence: 99%)
“…BioMedBERT is based on the BERT model pre-trained on BREATHE, a large-scale biomedical literature dataset. Kamath et al. [100] compared how effectively models pre-trained on general-domain machine reading comprehension and question answering transfer when fine-tuned on the biomedical question answering task. They found that the question answering model fits the task better.…”
Section: Question Answering (mentioning, confidence: 99%)
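The comparison described above follows the usual transfer recipe: start from an extractive QA model trained on a general-domain dataset and continue training on biomedical question/answer spans. A minimal sketch of one such fine-tuning step, assuming the Hugging Face transformers and torch packages, the public distilbert-base-cased-distilled-squad checkpoint, and a hand-made example with placeholder answer-span indices (none of these come from the paper):

```python
# Sketch of the pre-train-then-fine-tune recipe: a QA model already trained on
# general-domain SQuAD data is further trained on a biomedical question/answer
# pair. Checkpoint, example, and span indices are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

checkpoint = "distilbert-base-cased-distilled-squad"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForQuestionAnswering.from_pretrained(checkpoint)

question = "Which hormone stimulates red blood cell production?"
context = "Erythropoietin is a hormone that stimulates red blood cell production."
inputs = tokenizer(question, context, return_tensors="pt")

# Placeholder token positions for the answer span "Erythropoietin"; a real
# setup maps the dataset's character offsets to token indices instead.
start_positions = torch.tensor([12])
end_positions = torch.tensor([16])

model.train()
outputs = model(**inputs, start_positions=start_positions, end_positions=end_positions)
outputs.loss.backward()  # one illustrative gradient step on the biomedical example
```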
“…As language modelling can be seen as an independent task, some researchers view pre-training with a language modelling objective as part of the transfer learning paradigm [9,6]. Pre-training on natural language understanding tasks, in particular sentence modelling tasks, helps not only to improve the quality of the task under consideration [2,21,12], but also to derive semantically meaningful sentence embeddings that can be compared using cosine similarity [19].…”
Section: Related Work (mentioning, confidence: 99%)
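The sentence-embedding point in the statement above can be made concrete with mean pooling over a pre-trained encoder's hidden states, in the spirit of the Sentence-BERT approach it cites. A minimal sketch, assuming the Hugging Face transformers and torch packages and the public bert-base-uncased checkpoint (the cited works may use different encoders and pooling):

```python
# Sketch: derive sentence embeddings by mean-pooling BERT hidden states and
# compare them with cosine similarity. Checkpoint and pooling choice are
# illustrative assumptions, not taken from the cited papers.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed(sentence: str) -> torch.Tensor:
    """Mean-pool the final hidden states into a single sentence vector."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (1, seq_len, hidden_size)
    mask = inputs["attention_mask"].unsqueeze(-1)    # ignore padding positions
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

a = embed("What gene is mutated in cystic fibrosis?")
b = embed("Cystic fibrosis is caused by mutations in the CFTR gene.")
print(torch.cosine_similarity(a, b).item())  # higher score = more similar meaning
```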