Purpose
To evaluate the ability of an artificial intelligence (AI) model, ChatGPT, in predicting the diabetic retinopathy (DR) risk.
Methods
This retrospective observational study utilized an anonymized dataset of 111 patients with diabetes who underwent a comprehensive eye examination along with clinical and biochemical assessments. Clinical and biochemical data along with and without central subfield thickness (CST) values of the macula from OCT were uploaded to ChatGPT-4, and the response from the ChatGPT was compared to the clinical DR diagnosis made by an ophthalmologist.
Results
The study assessed the consistency of responses provided by ChatGPT, yielding an Intraclass Correlation Coefficient (ICC) value of 0.936 (95% CI, 0.913–0.954, p < 0.001) (with CST) and 0.915 (95% CI, 0.706–0.846, p < 0.001) (without CST), both situations indicated excellent reliability. The sensitivity and specificity of ChatGPT in predicting the DR cases were evaluated. The results revealed a sensitivity of 67% with CST and 73% without CST. The specificity was 68% with CST and 54% without CST. However, Cohen’s kappa revealed only a fair agreement between ChatGPT predictions and clinical DR status in both situations, with CST (kappa = 0.263, p = 0.005) and without CST (kappa = 0.351, p < 0.001).
Conclusion
This study suggests that ChatGPT has the potential of a preliminary DR screening tool with further optimization needed for clinical use.