2023
DOI: 10.1093/asj/sjad128

Performance of ChatGPT on the Plastic Surgery Inservice Training Examination

Abstract: Background: Originally developed as a tool for resident self-evaluation, the Plastic Surgery Inservice Training Examination (PSITE) has become a standardized tool adopted by plastic surgery residency programs. The introduction of large language models (LLMs), such as ChatGPT (OpenAI, San Francisco, CA), has demonstrated the potential to help propel the field of plastic surgery. Objectives: The authors of this study wanted to as…

Cited by 46 publications (19 citation statements: 1 supporting, 18 mentioning, 0 contrasting)
References 8 publications
“…These results are similar to 2 studies performed in plastic surgery, reporting 57% accuracy and 55% accuracy on the 2022 Plastic Surgery In‐Service exams.14,15 Our study differed from those studies in that we stratified questions based on difficulty, showing that ChatGPT may be able to answer easy questions with better proficiency, but more in‐depth, nuanced otolaryngology topics are difficult for the chatbot to correctly answer at this time. Within the field of medicine, it is of utmost importance that the tools we use as educational resources, and those that support clinical decision making, are validated.…”
Section: Discussion (mentioning)
confidence: 59%
“…Several studies focused on the role of generative AI models in tests of medical knowledge [8-11, 13, 26, 27, 31-39]. These examinations ranged from general medical knowledge tests such as the United States Medical Licensing Exam to specialized examinations in fields like cardiology, neurology, and ophthalmology [8, 9, 33, 37, 38].…”
Section: Results (mentioning)
confidence: 99%
“…Despite not being trained on a specific data set, ChatGPT performed at the level of a first-year resident in plastic surgery on the in-service training exam.7,8 In neurosurgery, ChatGPT performed worse than the average user on Self-Assessment Neurosurgery questions but better than residents in some topics.9 Clearly, there is already some rudimentary capacity in providing specialty care.…”
Section: Discussion (mentioning)
confidence: 99%