2024
DOI: 10.5858/arpa.2023-0296-oa

Assessment of Pathology Domain-Specific Knowledge of ChatGPT and Comparison to Human Performance

Andrew Y. Wang,
Sherman Lin,
Christopher Tran
et al.

Abstract: Context.— Artificial intelligence algorithms hold the potential to fundamentally change many aspects of society. Application of these tools, including the publicly available ChatGPT, has demonstrated impressive domain-specific knowledge in many areas, including medicine. Objectives.— To understand the level of pathology domain-specific knowledge for ChatGPT using different underlying large language models, GPT-3.5 and the upd…

Cited by 10 publications (8 citation statements)
References: 0 publications
“…It was also noted that in our study, excessive verbosity was shown to be frequently linked to evasive and generic answers, leading to incorrect responses. In contrast to a recent study that showed a significant improvement in GPT-4's performance when compared to both its predecessor and human volunteers, our investigation revealed that GPT-4 performed worse than humans [17].…”
Section: Discussion (contrasting)
confidence: 99%
“…It is important to note that the shift from GPT-3.5 to GPT-4 marks a substantial improvement in the functionality and efficiency of large language models in various medical subspecialties [18-22]. Our findings show that GPT-4's accuracy in responding to questions related to hypertension has increased by 20%, achieving a 77% success rate [10]. Despite the need for a subscription to access GPT-4, its enhanced performance strongly supports its recommended use.…”
Section: Selecting Advanced ChatGPT Model (mentioning)
confidence: 99%
“…ChatGPT can provide an interactive educational experience, summarize important concepts [16], and give instant access to information [17]. It can tailor the educational experience according to an individual's needs by providing personalized responses to questions with immediate feedback [18].…”
Section: Education (mentioning)
confidence: 99%
“…In a paper using ChatGPT to answer pathology-related questions, such as 'explain why transfusion-related diseases are avoidable', it was able to provide credible responses in most cases, achieving an overall median rating of 4.08 out of 5 [21]. In a recently published study, the domain-specific knowledge of ChatGPT 3.5 in pathology was assessed to be at the same level as that of a staff pathologist, while ChatGPT 4's exceeded that of a trained pathologist [17]. There may be instances where ChatGPT may not have information about certain topics in its knowledge base.…”
Section: Education (mentioning)
confidence: 99%