2023
DOI: 10.1097/iop.0000000000002552

Evaluating ChatGPT on Orbital and Oculofacial Disorders: Accuracy and Readability Insights

Michael Balas,
Ana Janic,
Patrick Daigle
et al.

Abstract: Purpose: To assess the accuracy and readability of responses generated by the artificial intelligence model ChatGPT (version 4.0) to questions related to 10 essential domains of orbital and oculofacial disease. Methods: A set of 100 questions related to the diagnosis, treatment, and interpretation of orbital and oculofacial diseases was posed to ChatGPT 4.0. Responses were evaluated by a panel of 7 experts based on appropriateness and accuracy, with p…

Cited by 3 publications (2 citation statements)
References 31 publications
“…[9] It was also shown that ChatGPT performs accurately when responding to questions about orbital and oculofacial disorders, with an average appropriateness score of 5.3/6.0 ("mostly appropriate" to "completely appropriate"). [10] Our study found that ChatGPT scored best in the infectious disorders section (73.3%) and poorest in the retinal disorders section (50%). Antaki et al showed that the legacy model performed best in general medicine (75%), fundamentals (60%), and cornea (60%), but performed less well in glaucoma (37.5%), pediatrics and strabismus (42.5%), and neuro-ophthalmology (25%), [7] which contradicts the findings of Madadi et al, who showed the potential to diagnose neuro-ophthalmology cases with accuracy comparable to certified neuro-ophthalmologists, with estimated accuracies of 59% and 82% for ChatGPT 3.5 and ChatGPT 4.0, respectively.…”
Section: Discussion (mentioning)
confidence: 46%
“…Sarcoidosis is an idiopathic multisystem disorder in which ocular involvement is seen in approximately 25% of cases, with anterior uveitis being the most common presentation. 1 It can affect any part of the eye, including the orbit or lacrimal system, with dacryoadenitis being the most common extraocular involvement. The reported incidence of cutaneous involvement in sarcoidosis is 12% to 27%, and the incidence of eyelid involvement is even lower.…”
Section: To the Editor (mentioning)
confidence: 99%