2024
DOI: 10.5858/arpa.2023-0296-oa

Assessment of Pathology Domain-Specific Knowledge of ChatGPT and Comparison to Human Performance

Andrew Y. Wang,
Sherman Lin,
Christopher Tran
et al.

Abstract: Context.— Artificial intelligence algorithms hold the potential to fundamentally change many aspects of society. Application of these tools, including the publicly available ChatGPT, has demonstrated impressive domain-specific knowledge in many areas, including medicine. Objectives.— To understand the level of pathology domain-specific knowledge for ChatGPT using different underlying large language models, GPT-3.5 and the upd…

Cited by 10 publications (8 citation statements)
References: 0 publications
“…It was also noted that in our study, excessive verbosity was shown to be frequently linked to evasive and generic answers, leading to incorrect responses. In contrast to a recent study that showed a significant improvement in GPT-4's performance when compared to both its predecessor and human volunteers, our investigation revealed that GPT-4 performed worse than humans [17].…”
Section: Discussion (contrasting)
confidence: 99%
“…It is important to note that the shift from GPT-3.5 to GPT-4 marks a substantial improvement in the functionality and efficiency of large language models in various medical subspecialties [18-22]. Our findings show that GPT-4's accuracy in responding to questions related to hypertension has increased by 20%, achieving a 77% success rate [10]. Despite the need for a subscription to access GPT-4, its enhanced performance strongly supports its recommended use.…”
Section: Selecting Advanced ChatGPT Model (mentioning)
confidence: 99%
“…ChatGPT can provide an interactive educational experience, summarize important concepts [16], and give instant access to information [17]. It can tailor the educational experience according to an individual's needs by providing personalized responses to questions with immediate feedback [18].…”
Section: Education (mentioning)
confidence: 99%
“…In a paper using ChatGPT to answer pathology-related questions, such as 'explain why transfusion-related diseases are avoidable', it was able to provide credible responses in most cases, achieving an overall median rating of 4.08 out of 5 [21]. In a recently published study, the domain-specific knowledge of ChatGPT 3.5 in pathology was assessed to be at the same level as that of a staff pathologist, while ChatGPT 4's exceeded that of a trained pathologist [17]. There may be instances where ChatGPT may not have information about certain topics in its knowledge base.…”
Section: Education (mentioning)
confidence: 99%