Objectives
To adapt and evaluate a deep learning language model for answering why-questions based on patient-specific clinical text.
Materials and Methods
Bidirectional encoder representations from transformers (BERT) models were trained with varying data sources to perform SQuAD 2.0-style why-question answering (why-QA) on clinical notes. The evaluation focused on: 1) comparing the merits of different training data, and 2) error analysis.
Results
The best model achieved an accuracy of 0.707 (or 0.760 by partial match) …
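As a hedged illustration of the SQuAD 2.0-style extractive setup this abstract describes, the sketch below runs a why-question against a short clinical snippet with the Hugging Face question-answering pipeline. The checkpoint name and the note text are placeholders for illustration, not the authors' model or data.

```python
from transformers import pipeline

# Any BERT-style checkpoint fine-tuned on SQuAD 2.0 works here; this name is a placeholder.
qa = pipeline("question-answering", model="deepset/bert-base-cased-squad2")

# Illustrative note text, not data from the study.
note = ("Patient admitted with chest pain. Aspirin was held because of an "
        "upcoming surgical procedure.")

result = qa(
    question="Why was aspirin held?",
    context=note,
    handle_impossible_answer=True,  # SQuAD 2.0 style: the model may return no answer
)
print(result["answer"], round(result["score"], 3))
```

In this style of why-QA, the model returns the span of the note that justifies the clinical event (here, the reason aspirin was held) together with a confidence score.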
“…Transformer-based models have been wildly successful in setting state-of-the-art benchmarks on a broad range of natural language processing (NLP) tasks, including question answering, document classification, machine translation, text summarization, and others [1][2][3]. These successes have been replicated in the clinical and biomedical domain via pre-training language models using large-scale clinical or biomedical corpora, then fine-tuning on a variety of clinical or biomedical downstream tasks, including computational phenotyping [4], automatic ICD coding [5], knowledge graph completion [6] and clinical question answering [7].…”
Objective
Clinical knowledge-enriched transformer models (eg, ClinicalBERT) have achieved state-of-the-art results on clinical natural language processing (NLP) tasks. One core limitation of these transformer models is their substantial memory consumption due to the full self-attention mechanism, which leads to performance degradation on long clinical texts. To overcome this, we propose to leverage long-sequence transformer models (eg, Longformer and BigBird), which extend the maximum input sequence length from 512 to 4096 tokens, to enhance the ability to model long-term dependencies in long clinical texts.
Materials and methods
Inspired by the success of long-sequence transformer models and the fact that clinical notes are mostly long, we introduce 2 domain-enriched language models, Clinical-Longformer and Clinical-BigBird, which are pretrained on a large-scale clinical corpus. We evaluate both language models on 10 baseline tasks, including named entity recognition, question answering, natural language inference, and document classification.
Results
The results demonstrate that Clinical-Longformer and Clinical-BigBird consistently and significantly outperform ClinicalBERT and other short-sequence transformers in all 10 downstream tasks and achieve new state-of-the-art results.
Discussion
Our pretrained language models provide the bedrock for clinical NLP using long texts. We have made our source code available at https://github.com/luoyuanlab/Clinical-Longformer, and the pretrained models available for public download at: https://huggingface.co/yikuan8/Clinical-Longformer.
Conclusion
This study demonstrates that clinical knowledge-enriched long-sequence transformers are able to learn long-term dependencies in long clinical text. Our methods can also inspire the development of other domain-enriched long-sequence transformers.
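Since the Discussion section points to a publicly released checkpoint, a minimal sketch of loading Clinical-Longformer and encoding a note with the extended 4096-token window might look like the following (assuming the Hugging Face transformers library; the note text is an illustrative stand-in, and task-specific fine-tuning is not shown).

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("yikuan8/Clinical-Longformer")
model = AutoModel.from_pretrained("yikuan8/Clinical-Longformer")

# Stand-in for a long clinical note; real notes would come from the EHR.
long_note = "HISTORY OF PRESENT ILLNESS: chest pain and shortness of breath. " * 400

inputs = tokenizer(
    long_note,
    truncation=True,
    max_length=4096,  # long-sequence models accept up to 4096 tokens vs the 512 of BERT-style models
    return_tensors="pt",
)
with torch.no_grad():
    outputs = model(**inputs)
print(inputs["input_ids"].shape, outputs.last_hidden_state.shape)
```

The same checkpoint can then be fine-tuned for the downstream tasks the abstract lists (named entity recognition, question answering, natural language inference, document classification) with the usual task heads.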
“…The size of the token dictionary used in the test was approximately 30,000 subtokens in both the SentencePiece and ByteLevelBPE methods. The test was carried out with learning rates of 1e-5 and 5e-5 for both languages according to the reference from the BERT paper [17], [22], [26].…”
“…In this study, we proposed RoBERTa, an artificial neural network based on transformers with 6 layers (base model). RoBERTa [22], [23] used the WordPiece tokenization technique in the pre-training stage, and the training method used masked language modelling (MLM) and NSP to support our QAS system. Our contributions are: 1) representing the extraction of language models at the character level for answer selection without any engineered features or linguistic tools; and 2) applying an efficient self-attention model to generate answers according to context by calculating the input and output representations regardless of word order.…”
This research aimed to evaluate the performance of the A Lite BERT (ALBERT), Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA), and Robustly Optimized BERT Pretraining Approach (RoBERTa) models to support the development of an Indonesian-language question and answer system. The evaluation used Indonesian, Malay, and Esperanto. Esperanto was used as a comparison for Indonesian because it is an international language that belongs to no person or country, which makes it neutral; compared with other foreign languages, its structure and construction are also relatively simple. The dataset was obtained by crawling Wikipedia for Indonesian and the Open Super-large Crawled ALMAnaCH coRpus (OSCAR) for Esperanto. The token dictionary used in the tests contained approximately 30,000 subtokens for both the SentencePiece and byte-level byte pair encoding (ByteLevelBPE) methods. The tests were carried out with learning rates of 1e-5 and 5e-5 for both languages, following the reference values from the bidirectional encoder representations from transformers (BERT) paper. In the final results, the ALBERT and RoBERTa models in Esperanto produced loss values that were not much different. This showed that the RoBERTa model was better suited to implementing an Indonesian question and answer system.
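For concreteness, a minimal sketch of building a roughly 30,000-subtoken ByteLevelBPE vocabulary of the kind described above, using the Hugging Face tokenizers library, could look like this; the corpus file path is an illustrative placeholder rather than the study's actual crawl.

```python
import os
from tokenizers import ByteLevelBPETokenizer

tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["id_wiki_corpus.txt"],  # placeholder path to a crawled plain-text corpus
    vocab_size=30_000,             # roughly 30,000 subtokens, as described above
    min_frequency=2,
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
)
os.makedirs("tokenizer_out", exist_ok=True)
tokenizer.save_model("tokenizer_out")  # writes vocab.json and merges.txt
```

The saved vocabulary and merge rules can then be loaded by a RoBERTa-style tokenizer for MLM pretraining at the learning rates the study cites (1e-5 and 5e-5).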
“…In this study, we focus on the clinical reading comprehension task, which aims to extract a text span (a sentence or multiple sentences) as the answer from a patient clinical note given a question (Yue et al., 2020). Though many neural models (Seo et al., 2017; Rawat et al., 2020; Wen et al., 2020) have achieved impressive results on this task, their performance on new clinical contexts, whose data distributions may differ from those the models were trained on, is still far from satisfactory (Yue et al., 2020). One can improve performance by adding more QA pairs on the new contexts into training; however, manually creating large-scale QA pairs in the clinical domain involves tremendous expert effort and raises data privacy concerns.…”
Clinical question answering (QA) aims to automatically answer questions from medical professionals based on clinical texts. Studies show that neural QA models trained on one corpus may not generalize well to new clinical texts from a different institute or a different patient group, where large-scale QA pairs are not readily available for retraining. To address this challenge, we propose a simple yet effective framework, CliniQG4QA, which leverages question generation (QG) to synthesize QA pairs on new clinical contexts and boosts QA models without requiring manual annotations. In order to generate the diverse types of questions that are essential for training QA models, we further introduce a seq2seq-based question phrase prediction (QPP) module that can be used together with most existing QG models to diversify their generation. Our comprehensive experimental results show that the QA corpus generated by our framework helps improve QA models on the new contexts (up to an 8% absolute gain in Exact Match), and that the QPP module plays a crucial role in achieving the gain.
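A rough sketch of the data flow the abstract describes (question phrase prediction, then question generation, then synthetic SQuAD-style pairs for QA fine-tuning) is given below. The three helper functions are hypothetical toy stand-ins for the paper's QPP and QG components, not the released CliniQG4QA code.

```python
def extract_answer_candidates(context):
    # Toy stand-in: treat each sentence as a candidate answer span.
    return [{"text": s, "start": context.find(s)} for s in context.split(". ") if s]

def predict_question_phrases(context, answer):
    # Toy stand-in for the seq2seq QPP module: propose diverse question types.
    return ["why", "what", "how much"]

def generate_question(context, answer, phrase):
    # Toy stand-in for the QG model: template a question from the predicted phrase.
    return f"{phrase.capitalize()} {answer['text'][:40]}?"

def synthesize_qa_corpus(contexts):
    """Build synthetic SQuAD-style QA pairs for unlabeled clinical notes."""
    corpus = []
    for context in contexts:
        for answer in extract_answer_candidates(context):
            for phrase in predict_question_phrases(context, answer):
                corpus.append({
                    "context": context,
                    "question": generate_question(context, answer, phrase),
                    "answers": {"text": [answer["text"]],
                                "answer_start": [answer["start"]]},
                })
    return corpus

if __name__ == "__main__":
    notes = ["Aspirin was held. The patient has surgery scheduled tomorrow."]
    for pair in synthesize_qa_corpus(notes):
        print(pair["question"])
```

The resulting synthetic corpus can then be fed to any standard extractive QA fine-tuning loop to adapt the model to the new institution, which is the adaptation step on which the abstract reports gains.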