Proceedings of the 2nd Clinical Natural Language Processing Workshop 2019
DOI: 10.18653/v1/w19-1909

Publicly Available Clinical BERT Embeddings

Abstract: Contextual word embedding models such as ELMo (Peters et al., 2018) and BERT (Devlin et al., 2018) have dramatically improved performance for many natural language processing (NLP) tasks in recent months. However, these models have been minimally explored on specialty corpora, such as clinical text; moreover, in the clinical domain, no publicly-available pre-trained BERT models yet exist. In this work, we address this need by exploring and releasing BERT models for clinical text: one for generic clinical text …
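The released checkpoints are typically consumed through the Hugging Face transformers library. The sketch below shows how one might extract contextual embeddings from a clinical sentence, assuming the model is published on the hub under the identifier emilyalsentzer/Bio_ClinicalBERT; that identifier, and the example sentence, are assumptions of this illustration rather than details stated in the abstract.

```python
# Minimal sketch: obtain contextual embeddings from a released clinical BERT
# checkpoint. Assumes the Hugging Face `transformers` library and that the
# model is available on the hub as "emilyalsentzer/Bio_ClinicalBERT".
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "emilyalsentzer/Bio_ClinicalBERT"  # assumed hub identifier

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

note = "Patient denies chest pain but reports shortness of breath on exertion."
inputs = tokenizer(note, return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)

# One contextual vector per WordPiece token: (batch, seq_len, hidden_size).
token_embeddings = outputs.last_hidden_state
print(token_embeddings.shape)  # e.g. torch.Size([1, seq_len, 768])
```

Because the vectors are contextual, the same surface token (e.g. "discharge") receives different embeddings depending on the surrounding clinical narrative, which is what distinguishes these models from static word embeddings.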

Cited by 1,054 publications (863 citation statements)
References 26 publications
“…Contextual word embeddings are often pretrained on a large dataset through self-supervised tasks, then released for fine-tuned use in downstream tasks. This pretraining can be domain specific, such as in the various clinical-text specific BERT models which have been released [3,29,60]. This pretraining task can be another source in which bias present in training text can be hard-coded into a word embedding model.…”
Section: Background and Related Work, 2.1 Contextual Embeddings (mentioning; confidence: 99%)
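For readers unfamiliar with the pretrain-then-fine-tune workflow described in the excerpt above, a minimal sketch follows. It assumes the Hugging Face transformers library, the hub identifier emilyalsentzer/Bio_ClinicalBERT, a binary label set, and a two-example toy dataset; none of these details come from the cited works.

```python
# Sketch of fine-tuning a pretrained clinical BERT model on a downstream
# classification task. The checkpoint name, label set, and toy dataset are
# illustrative assumptions, not part of the cited work.
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "emilyalsentzer/Bio_ClinicalBERT"  # assumed hub identifier
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=2)

# Toy labelled examples (text, label) standing in for a real downstream corpus.
train_examples = [
    ("no acute distress, vitals stable", 0),
    ("patient reports worsening dyspnea and chest tightness", 1),
]

def collate(batch):
    texts, labels = zip(*batch)
    enc = tokenizer(list(texts), padding=True, truncation=True, return_tensors="pt")
    enc["labels"] = torch.tensor(labels)
    return enc

loader = DataLoader(train_examples, batch_size=2, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# A single pass over the toy data; a real fine-tuning run would iterate
# for several epochs over a task-specific labelled dataset.
model.train()
for batch in loader:
    optimizer.zero_grad()
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
```

Any bias encoded during domain-specific pretraining is carried into the fine-tuned classifier through the inherited weights, which is the concern raised in the excerpt.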
“…Several BERT models pretrained on MIMIC notes are publicly available [3,29,60]. However, to address several limitations, we choose to train our own clinical BERT model in this work.…”
Section: Pretrained Clinical Embeddings (mentioning; confidence: 99%)
“…Alsentzer et al. used approximately 2 million clinical notes from the MIMIC-III v1.4 database [8] to pre-train a Clinical BERT model [9]. They made the model publicly available, sparing others the roughly 17 days of computational runtime that pre-training originally required on a single GeForce GTX TITAN X 12 GB GPU.…”
Section: Clinical BERT (mentioning; confidence: 99%)
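As context for the pre-training cost described in the excerpt above, the sketch below shows one way to continue masked-language-model pre-training on a corpus of clinical notes with the transformers Trainer API. The starting checkpoint, file path, and hyperparameters are illustrative assumptions, not the exact recipe of Alsentzer et al.

```python
# Sketch of continued masked-language-model pretraining on clinical notes,
# in the spirit of the Clinical BERT training described above. The starting
# checkpoint, notes file, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE_CHECKPOINT = "bert-base-uncased"  # assumed initialization point
NOTES_FILE = "mimic_notes.txt"         # hypothetical file, one note per line

tokenizer = AutoTokenizer.from_pretrained(BASE_CHECKPOINT)
model = AutoModelForMaskedLM.from_pretrained(BASE_CHECKPOINT)

# Load and tokenize the raw clinical notes.
dataset = load_dataset("text", data_files={"train": NOTES_FILE})["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Randomly mask 15% of tokens for the masked-language-model objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="clinical-bert-mlm",
        num_train_epochs=1,
        per_device_train_batch_size=8,
    ),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()
```

Running such a job over millions of notes is what accounts for the multi-day GPU cost cited above, and it is exactly this cost that makes the public release of the pretrained checkpoints valuable.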