Large language models encode clinical knowledge

Singhal, Karan; Azizi, Shekoofeh; Tu, Tao; Mahdavi, Sara; Lee, Jason; Chung, Hyung Won; Scales, Nathan; Tanwani, Ajay Kumar; Cole-Lewis, Heather; Pfohl, Stephen; Payne, Perry W.; Seneviratne, Martin; Gamble, Paul; Kelly, Christopher; Babiker, Abubakr; Schärli, Nathanael; Chowdhery, Aakanksha; Mansfield, P.; Demner‐Fushman, Dina; Arcas, Blaise Agüera y; Webster, Dale R.; Corrado, Greg S.; Matias, Yossi; Chou, Katherine; Gottweis, Juraj; Tomašev, Nenad; Liu, Yun; Rajkomar, Alvin; Barral, Joëlle K.; Semturs, Christopher; Karthikesalingam, Alan; Natarajan, Vivek

doi:10.1038/s41586-023-06291-2

Cited by 424 publications

(312 citation statements)

References 80 publications

“…More recently, however, advances in large language models (LLMs) have created an opportunity to adapt AI technology into a tool for mediating human interaction. 6,7…”

mentioning

confidence: 99%

“…More recently, however, advances in large language models (LLMs) have created an opportunity to adapt AI technology into a tool for mediating human interaction. 6,7 LLMs are "foundation models," or large, pretrained, multilayer deep neural networks that leverage contextual relationships between words, phrases, and concepts to predict the likelihood of the next sequence of words. 8 With adequate training and application, these AI systems can process complex information, analyze relationships between ideas, and generate coherent responses to an inquiry.…”

mentioning

confidence: 99%

RETRACTED: New Artificial Intelligence ChatGPT Performs Poorly on the 2022 Self-assessment Study Program for Urology

et al. 2023

Introduction: Large language models have demonstrated impressive capabilities, but application to medicine remains unclear. We seek to evaluate the use of ChatGPT on the American Urological Association Self-assessment Study Program as an educational adjunct for urology trainees and practicing physicians. Methods: One hundred fifty questions from the 2022 Self-assessment Study Program exam were screened, and those containing visual assets (n=15) were removed. The remaining items were encoded as open ended or multiple choice. ChatGPT’s output was coded as correct, incorrect, or indeterminate; if indeterminate, responses were regenerated up to 2 times. Concordance, quality, and accuracy were ascertained by 3 independent researchers and reviewed by 2 physician adjudicators. A new session was started for each entry to avoid crossover learning. Results: ChatGPT was correct on 36/135 (26.7%) open-ended and 38/135 (28.2%) multiple-choice questions. Indeterminate responses were generated in 40 (29.6%) and 4 (3.0%), respectively. Of the correct responses, 24/36 (66.7%) and 36/38 (94.7%) were on initial output, 8 (22.2%) and 1 (2.6%) on second output, and 4 (11.1%) and 1 (2.6%) on final output, respectively. Although regeneration decreased indeterminate responses, proportion of correct responses did not increase. For open-ended and multiple-choice questions, ChatGPT provided consistent justifications for incorrect answers and remained concordant between correct and incorrect answers. Conclusions: ChatGPT previously demonstrated promise on medical licensing exams; however, application to the 2022 Self-assessment Study Program was not demonstrated. Performance improved with multiple-choice over open-ended questions. More importantly were the persistent justifications for incorrect responses—left unchecked, utilization of ChatGPT in medicine may facilitate medical misinformation.

“…5,6 Only weeks after the release of ChatGPT, Google's DeepMind released MedPaLM, an LLM designed to answer medical questions. 7 What makes ChatGPT unique is that it is free and publicly available, boasts a user-friendly interface, and was trained with a data set that is larger than previous LLMs. ChatGPT has made generative AI technology more tangible than ever before, as evidenced by its rapid uptake by the public.…”

Section: What Is ChatGPT?

mentioning

confidence: 99%

Harnessing Generative Artificial Intelligence to Improve Efficiency Among Urologists: Welcome ChatGPT

2023

“…In parallel, large scale diffusion models 274 have democratized the generation of high‐resolution image data using text‐prompts single sentence alone. The ramifications of this technology are only just being explored in the context of medicine 275 but the next half decade will inevitably find their utility in CPATH.…”

Section: Challenges and Opportunities

mentioning

confidence: 99%

Machine learning in computational histopathology: Challenges and opportunities

Cooper

Krishnan

2023

Genes Chromosomes & Cancer

Digital histopathological images, high-resolution images of stained tissue samples, are a vital tool for clinicians to diagnose and stage cancers. The visual analysis of patient state based on these images are an important part of oncology workflow. Although pathology workflows have historically been conducted in laboratories under a microscope, the increasing digitization of histopathological images has led to their analysis on computers in the clinic. The last decade has seen the emergence of machine learning, and deep learning in particular, a powerful set of tools for the analysis of histopathological images. Machine learning models trained on large datasets of digitized histopathology slides have resulted in automated models for prediction and stratification of patient risk. In this review, we provide context for the rise of such models in computational histopathology, highlight the clinical tasks they have found success in automating, discuss the various machine learning techniques that have been applied to this domain, and underscore open problems and opportunities.
