2023
DOI: 10.1038/s41746-023-00879-8
The shaky foundations of large language models and foundation models for electronic health records

Abstract: The success of foundation models such as ChatGPT and AlphaFold has spurred significant interest in building similar models for electronic medical records (EMRs) to improve patient care and hospital operations. However, recent hype has obscured critical gaps in our understanding of these models’ capabilities. In this narrative review, we examine 84 foundation models trained on non-imaging EMR data (i.e., clinical text and/or structured data) and create a taxonomy delineating their architectures, training data, …

Cited by 84 publications (42 citation statements)
References 74 publications (78 reference statements)
“…The purported benefits need to be defined and evaluations conducted to verify them. 8 Only after these evaluations are completed should statements be allowed such as: an LLM was used for a defined task in this specific workflow, a metric was measured, and an improvement (or deterioration) in a prespecified outcome was observed. Such evaluations are also necessary to clarify the medicolegal risks that might arise from the use of LLMs to guide medical care, 11 and to identify mitigation strategies for the models' tendency to generate factually incorrect outputs that are probabilistically plausible (called hallucinations).…”
Section: Are the Purported Value Propositions of Using LLMs in Medici…
Citation type: mentioning; confidence: 99%
“…New and revolutionary technologies are often met with excitement about their many potential uses, leading to widespread and often unfocussed experimentation across different healthcare applications. Thus, as expected, evaluations of LLM performance in real-world healthcare settings remain inconsistently conducted and reported 11 12 . For instance, Cadamuro et al assessed ChatGPT-4’s diagnostic ability by evaluating relevance, correctness, helpfulness, and safety, finding responses to be generally superficial, sometimes inaccurate, and lacking in helpfulness and safety 13 .…”
Section: Introduction
Citation type: mentioning; confidence: 99%
“…Despite these promising advances, research has yet to systematically develop a simple yet effective framework for learning the high-quality representations crucial for robust cell clustering. Learning such representations builds on the success of pretraining generalizable models, which aligns with the promise of current foundation models (i.e., large-scale pretrained models that can be applied to various downstream use cases and tasks) [19, 36, 20]. Foundation models have been instrumental in our understanding of the role of deep learning in the biological context.…”
Section: Introduction
Citation type: mentioning; confidence: 99%