2022
DOI: 10.1093/database/baac066

Pre-trained models, data augmentation, and ensemble learning for biomedical information extraction and document classification

Abstract: Large volumes of publications are being produced in biomedical sciences nowadays with ever-increasing speed. To deal with the large amount of unstructured text data, effective natural language processing (NLP) methods need to be developed for various tasks such as document classification and information extraction. BioCreative Challenge was established to evaluate the effectiveness of information extraction methods in biomedical domain and facilitate their development as a community-wide effort. In this paper,…

Citations: cited by 4 publications (3 citation statements)
References: 27 publications (12 reference statements)
“…Santos et al 87 used a KG to interpret clinical proteomics data for drug target identification and drug repurposing. Erdengasileng et al 88 proposed an approach to identify potential drug-drug interactions with high accuracy. Zhang et al 89 developed MatchMixeR, a cross-platform normalization method for gene expression data integration to identify new drug targets and potential drug combinations.…”
Section: NLP-enabled Knowledge Graphs (mentioning)
confidence: 99%
“…used a KG to interpret clinical proteomics data for drug target identification and drug repurposing. Erdengasileng et al 88 proposed an approach to identify potential drug–drug interactions with high accuracy.…”
Section: Part 3: NLP in Quantitative Pharmacology Modeling (mentioning)
confidence: 99%
“…For this purpose, BERT, BioBERT, SciBERT, PubMedBERT, and BioMedRoBERTa are fine-tuned using two different classifiers, a linear layer and a Bidirectional Long Short Term Memory (Bi-LSTM) layer, to detect biomedical event triggers. These BERT variants have been chosen for comparison because they share the same BERT architecture but have previously been pretrained using different data in the biomedical and/or general domain [7]-[9]. The models are learned using seven manually annotated data sets merged together.…”
Section: Introduction (mentioning)
confidence: 99%