2023
DOI: 10.2196/48023

Accuracy of ChatGPT on Medical Questions in the National Medical Licensing Examination in Japan: Evaluation Study

Yasutaka Yanagita,
Daiki Yokokawa,
Shun Uchida
et al.

Abstract: Background ChatGPT (OpenAI) has gained considerable attention because of its natural and intuitive responses. ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers, as stated by OpenAI as a limitation. However, considering that ChatGPT is an interactive AI that has been trained to reduce the output of unethical sentences, the reliability of the training data is high and the usefulness of the output content is promising. Fortunately, in March 2023, a new version of ChatGPT…


Cited by 35 publications (29 citation statements)
References 16 publications (18 reference statements)
“…6 Other studies have used more complex questions and clinical scenarios and reported accuracy rates ranging from 26.7% to 81.5% for the chatbot, depending on the specific methods employed or the version that was tested. 7,15,16 In common, the studies that compared the two versions of the chatbot have consistently shown superior performance for version 4. 7,15 In our study, open questions, with different levels of complexity, were posed to ChatGPT.…”
Section: Discussion
confidence: 99%
“…7,15,16 In common, the studies that compared the two versions of the chatbot have consistently shown superior performance for version 4. 7,15 In our study, open questions, with different levels of complexity, were posed to ChatGPT. We have not prompted the chatbot to base its answers on specific guidelines since we wanted to simulate "real world" scenarios, where a user, not necessarily an expert in…” [Figure 1 caption from the citing article: Performance of ChatGPT 3.5 and 4 in conceptual and case-based questions.]
Section: Discussion
confidence: 99%