2020
DOI: 10.2196/23357

Using Character-Level and Entity-Level Representations to Enhance Bidirectional Encoder Representation From Transformers-Based Clinical Semantic Textual Similarity Model: ClinicalSTS Modeling Study

Abstract: Background With the growing adoption of electronic health records (EHRs), the quality of health care has improved. However, EHRs have also introduced problems, such as the increasing use of copy-and-paste and templates, resulting in records with low-quality content. To minimize data redundancy across documents, Harvard Medical School and Mayo Clinic organized a national natural language processing (NLP) clinical challenge (n2c2) on clinical semantic textual similarity (ClinicalSTS…


Cited by 9 publications (4 citation statements) · References 35 publications
“…It has often been reported that BERT exhibits high performance, even with clinical text [36][37][38][39]. This is also true for this study, in which a model combining BERT and Bi-LSTM using clinical text recorded in daily practice allowed for fall prediction with an accuracy equal to or higher than that of conventional risk assessment tools.…”
Section: Fall Prediction Model Performance (supporting)
confidence: 66%
“…We compare our results to existing methods and conduct ablation studies across three benchmark datasets: N2C2STS, EBM-SASS, and BIOSSES. On N2C2STS, the reported scores are: (Xiong et al, 2020a) 0.868; (Ormerod et al, 2021) 0.870; (Chen et al, 2021) (single) 0.87; (Mulyar et al, 2021) 0.867; (Wang et al, 2022b) 0.875; EARA (BlueBERT) 0.887. Notably, our approach consistently outperforms the baseline models, with average improvements ranging from 1.94% to 4.22%.…”
Section: Results (mentioning)
confidence: 99%
“…The first used data augmentation strategies (Wang et al, 2020c; Li et al, 2021a) or multi-task learning (Mulyar et al, 2021; Mahajan et al, 2020) to enhance the model's representation. The second introduced external knowledge into the neural network models, which can capture implicit information (Xiong et al, 2020a; Chang et al, 2021). These methods only integrate traditional features and lack interpretability.…”
Section: Related Work (mentioning)
confidence: 99%