Importance
ChatGPT is an artificial intelligence (AI) chatbot with significant societal implications. Training curricula using AI are being developed in medicine, but the performance of chatbots in ophthalmology has not been characterized.

Objective
To assess the performance of ChatGPT in answering practice questions for board certification in ophthalmology.

Design, Setting, and Participants
This cross-sectional study used a consecutive sample of text-based multiple-choice questions provided by the OphthoQuestions practice question bank for board certification examination preparation. Of 166 available multiple-choice questions, 125 (75%) were text-based.

Exposures
ChatGPT answered questions from January 9 to 16, 2023, and on February 17, 2023.

Main Outcomes and Measures
Our primary outcome was the number of board certification examination practice questions that ChatGPT answered correctly. Our secondary outcomes were the proportion of questions for which ChatGPT provided additional explanations, the mean length of questions and of ChatGPT's responses, the performance of ChatGPT in answering questions without multiple-choice options, and changes in performance over time.

Results
In January 2023, ChatGPT correctly answered 58 of 125 questions (46%). ChatGPT's performance was best in the category general medicine (11/14; 79%) and poorest in retina and vitreous (0%). The proportion of questions for which ChatGPT provided additional explanations was similar between questions answered correctly and incorrectly (difference, 5.82%; 95% CI, −11.0% to 22.0%; χ²₁ = 0.45; P = .51). The mean length of questions was similar between questions answered correctly and incorrectly (difference, 21.4 characters; SE, 36.8; 95% CI, −51.4 to 94.3; t = 0.58; df = 123; P = .56), as was the mean length of ChatGPT's responses (difference, −80.0 characters; SE, 65.4; 95% CI, −209.5 to 49.5; t = −1.22; df = 123; P = .22).
ChatGPT selected the same multiple-choice response as the most common answer provided by ophthalmology trainees on OphthoQuestions 44% of the time. In February 2023, ChatGPT provided a correct response to 73 of 125 multiple-choice questions (58%) and 42 of 78 stand-alone questions (54%) posed without multiple-choice options.

Conclusions and Relevance
ChatGPT answered approximately half of the questions in the OphthoQuestions free trial for ophthalmic board certification preparation correctly. Medical professionals and trainees should appreciate the advances of AI in medicine while acknowledging that ChatGPT, as used in this investigation, did not answer enough multiple-choice questions correctly to provide substantial assistance in preparing for board certification at this time.
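As a quick consistency check, each t statistic in the Results is simply the reported difference divided by its standard error, and each 95% CI is the difference ± t₀.₉₇₅,₁₂₃ × SE. A minimal Python sketch of that arithmetic follows; the critical value ≈1.979 is an assumption looked up separately, not a value from the study, and small discrepancies with the reported CI bounds reflect rounding of the published figures.

```python
# Sketch: recover t statistics and 95% CIs from the reported
# differences and standard errors (df = 123 in both comparisons).
# t_crit = t_{0.975, 123} ~= 1.979 is an assumed critical value.

def t_stat_and_ci(diff, se, t_crit=1.979):
    """Return (t, ci_low, ci_high) for a difference with standard error se."""
    t = diff / se
    half_width = t_crit * se
    return t, diff - half_width, diff + half_width

# Question length: difference 21.4 characters, SE 36.8
t_q, lo_q, hi_q = t_stat_and_ci(21.4, 36.8)

# Response length: difference -80.0 characters, SE 65.4
t_r, lo_r, hi_r = t_stat_and_ci(-80.0, 65.4)

print(round(t_q, 2), round(lo_q, 1), round(hi_q, 1))  # t = 0.58, CI near -51.4 to 94.2
print(round(t_r, 2), round(lo_r, 1), round(hi_r, 1))  # t = -1.22, CI near -209.4 to 49.4
```

Both t values match the abstract, supporting the internal consistency of the reported differences and standard errors.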
This cross-sectional study assesses the accuracy of answers generated by an updated version of a popular chatbot in response to board certification examination preparation questions.