2023
DOI: 10.1038/s41586-023-06291-2

Large language models encode clinical knowledge

Abstract: Large language models (LLMs) have demonstrated impressive capabilities, but the bar for clinical applications is high. Attempts to assess the clinical knowledge of models typically rely on automated evaluations based on limited benchmarks. Here, to address these limitations, we present MultiMedQA, a benchmark combining six existing medical question answering datasets spanning professional medicine, research and consumer queries and a new dataset of medical questions searched online, HealthSearchQA. We propose …

Cited by 424 publications (312 citation statements)
References 80 publications
“…More recently, however, advances in large language models (LLMs) have created an opportunity to adapt AI technology into a tool for mediating human interaction. 6,7…”
mentioning
confidence: 99%
“…More recently, however, advances in large language models (LLMs) have created an opportunity to adapt AI technology into a tool for mediating human interaction. 6,7 LLMs are "foundation models," or large, pretrained, multilayer deep neural networks that leverage contextual relationships between words, phrases, and concepts to predict the likelihood of the next sequence of words. 8 With adequate training and application, these AI systems can process complex information, analyze relationships between ideas, and generate coherent responses to an inquiry.…”
mentioning
confidence: 99%
“…5,6 Only weeks after the release of ChatGPT, Google's DeepMind released MedPaLM, an LLM designed to answer medical questions. 7 What makes ChatGPT unique is that it is free and publicly available, boasts a user-friendly interface, and was trained with a data set that is larger than previous LLMs. ChatGPT has made generative AI technology more tangible than ever before, as evidenced by its rapid uptake by the public.…”
Section: What Is ChatGPT?
mentioning
confidence: 99%
“…In parallel, large scale diffusion models 274 have democratized the generation of high‐resolution image data using text‐prompts single sentence alone. The ramifications of this technology are only just being explored in the context of medicine 275 but the next half decade will inevitably find their utility in CPATH.…”
Section: Challenges and Opportunities
mentioning
confidence: 99%