; for the Imaging and Informatics in Retinopathy of Prematurity (i-ROP) Research Consortium IMPORTANCE Retinopathy of prematurity (ROP) is a leading cause of childhood blindness worldwide. The decision to treat is primarily based on the presence of plus disease, defined as dilation and tortuosity of retinal vessels. However, clinical diagnosis of plus disease is highly subjective and variable. OBJECTIVE To implement and validate an algorithm based on deep learning to automatically diagnose plus disease from retinal photographs. DESIGN, SETTING, AND PARTICIPANTS A deep convolutional neural network was trained using a data set of 5511 retinal photographs. Each image was previously assigned a reference standard diagnosis (RSD) based on consensus of image grading by 3 experts and clinical diagnosis by 1 expert (ie, normal, pre-plus disease, or plus disease). The algorithm was evaluated by 5-fold cross-validation and tested on an independent set of 100 images. Images were collected from 8 academic institutions participating in the Imaging and Informatics in ROP (i-ROP) cohort study. The deep learning algorithm was tested against 8 ROP experts, each of whom had more than 10 years of clinical experience and more than 5 peer-reviewed publications about ROP.
BackgroundPrior work has demonstrated the near-perfect accuracy of a deep learning retinal image analysis system for diagnosing plus disease in retinopathy of prematurity (ROP). Here we assess the screening potential of this scoring system by determining its ability to detect all components of ROP diagnosis.MethodsClinical examination and fundus photography were performed at seven participating centres. A deep learning system was trained to detect plus disease, generating a quantitative assessment of retinal vascular abnormality (the i-ROP plus score) on a 1–9 scale. Overall ROP disease category was established using a consensus reference standard diagnosis combining clinical and image-based diagnosis. Experts then ranked ordered a second data set of 100 posterior images according to overall ROP severity.Results4861 examinations from 870 infants were analysed. 155 examinations (3%) had a reference standard diagnosis of type 1 ROP. The i-ROP deep learning (DL) vascular severity score had an area under the receiver operating curve of 0.960 for detecting type 1 ROP. Establishing a threshold i-ROP DL score of 3 conferred 94% sensitivity, 79% specificity, 13% positive predictive value and 99.7% negative predictive value for type 1 ROP. There was strong correlation between expert rank ordering of overall ROP severity and the i-ROP DL vascular severity score (Spearman correlation coefficient=0.93; p<0.0001).ConclusionThe i-ROP DL system accurately identifies diagnostic categories and overall disease severity in an automated fashion, after being trained only on posterior pole vascular morphology. These data provide proof of concept that a deep learning screening platform could improve objectivity of ROP diagnosis and accessibility of screening.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.