Background
Deep learning (DL) based solutions have been proposed for interpretation of several imaging modalities including radiography, CT, and MR. For chest radiographs, DL algorithms have found success in the evaluation of abnormalities such as lung nodules, pulmonary tuberculosis, cystic fibrosis, pneumoconiosis, and location of peripherally inserted central catheters. Chest radiography is the most commonly performed radiological test for a multitude of non-emergent and emergent clinical indications. This study aims to assess the accuracy of a DL algorithm for detection of abnormalities on routine frontal chest radiographs (CXR), and for assessment of stability or change in findings over serial radiographs.
Methods and findings
We processed 874 de-identified frontal CXR from 724 adult patients (> 18 years) with DL (Qure AI). Scores and prediction statistics from DL were generated and recorded for the presence of pulmonary opacities, pleural effusions, hilar prominence, and enlarged cardiac silhouette. To establish a standard of reference (SOR), two thoracic radiologists assessed all CXR for these abnormalities. Four other radiologists (test radiologists), unaware of SOR and DL findings, independently assessed the presence of radiographic abnormalities. A total of 724 radiographs were assessed for detection of findings. A subset of 150 radiographs with follow-up examinations was used to assess change over time. Data were analyzed with receiver operating characteristics analyses and post-hoc power analysis.
Results
About 42% (305/724) of CXR had no findings according to SOR; single and multiple abnormalities were seen in 23% (168/724) and 35% (251/724) of CXR, respectively. There was no statistically significant difference between DL and SOR for all abnormalities (p = 0.2–0.8). The area under the curve (AUC) for DL and test radiologists ranged between 0.837–0.929 and 0.693–0.923, respectively. DL had its lowest AUC (0.758) for assessing changes in pulmonary opacities over follow-up CXR. The presence of chest wall implanted devices negatively affected the accuracy of the DL algorithm for evaluation of pulmonary and hilar abnormalities.
Conclusions
The DL algorithm can aid in interpretation of CXR findings and their stability over follow-up CXR. However, in its present version, it is unlikely to replace radiologists due to its limited specificity for categorizing specific findings.
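The AUC values quoted in this abstract can be read as the probability that a randomly chosen abnormal radiograph receives a higher score than a randomly chosen normal one. The sketch below illustrates that definition only. The labels and scores are hypothetical, the function name `auc_from_scores` is an assumption made for illustration, and this is not the study's analysis code.

```python
def auc_from_scores(labels, scores):
    """AUC as P(score of a positive case > score of a negative case).

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = finding present per the reference standard.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.1, 0.2, 0.7]
auc = auc_from_scores(labels, scores)
print(f"AUC = {auc:.4f}")
```

The pairwise form is quadratic in the number of cases, which is acceptable for a few hundred radiographs. A rank-based implementation would be preferable for larger sets.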
IMPORTANCE Most early lung cancers present as pulmonary nodules on imaging, but these can be easily missed on chest radiographs.
OBJECTIVE To assess if a novel artificial intelligence (AI) algorithm can help detect pulmonary nodules on radiographs at different levels of detection difficulty.
DESIGN, SETTING, AND PARTICIPANTS This diagnostic study included 100 posteroanterior chest radiograph images taken between 2000 and 2010 of adult patients from an ambulatory health care center in Germany and a lung image database in the US. Included images were selected to represent nodules with different levels of detection difficulty (from easy to difficult), and comprised both normal and nonnormal controls.
EXPOSURES All images were processed with a novel AI algorithm, the AI Rad Companion Chest X-ray. Two thoracic radiologists established the ground truth and 9 test radiologists from Germany and the US independently reviewed all images in 2 sessions (unaided and AI-aided mode) with at least a 1-month washout period.
MAIN OUTCOMES AND MEASURES Each test radiologist recorded the presence of 5 findings (pulmonary nodules, atelectasis, consolidation, pneumothorax, and pleural effusion) and their level of confidence for detecting the individual finding on a scale of 1 to 10 (1 representing lowest confidence; 10, highest confidence). The analyzed metrics for nodules included sensitivity, specificity, accuracy, and receiver operating characteristics curve area under the curve (AUC).
RESULTS Images from 100 patients were included, with a mean (SD) age of 55 (20) years and including 64 men and 36 women. Mean detection accuracy across the 9 radiologists improved by 6.4% (95% CI, 2.3% to 10.6%) with AI-aided interpretation compared with unaided interpretation. Partial AUCs within the effective interval range of 0 to 0.2 false positive rate improved by 5.6% (95% CI, −1.4% to 12.0%) with AI-aided interpretation. Junior radiologists saw greater improvement in sensitivity for nodule detection with AI-aided interpretation as compared with their senior counterparts (12%; 95% CI, 4% to 19% vs 9%; 95% CI, 1% to 17%) while senior radiologists experienced similar improvement in specificity (4%; 95% CI, −2% to 9%) as compared with junior radiologists (4%; 95% CI, −3% to 5%).
CONCLUSIONS AND RELEVANCE In this diagnostic study, an AI algorithm was associated with improved detection of pulmonary nodules on chest radiographs compared with unaided interpretation for different levels of detection difficulty and for readers with different experience.
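The reader metrics named in this abstract (sensitivity, specificity, accuracy) all follow from a 2x2 comparison of each reader's calls against the ground truth. A minimal sketch is given below. It assumes binary nodule calls and uses invented toy data, and the names `reader_metrics`, `unaided`, and `aided` are hypothetical. It does not reproduce the study's data or statistical analysis.

```python
def reader_metrics(truth, calls):
    """Sensitivity, specificity, and accuracy of binary calls vs ground truth."""
    tp = sum(1 for t, c in zip(truth, calls) if t == 1 and c == 1)
    tn = sum(1 for t, c in zip(truth, calls) if t == 0 and c == 0)
    fp = sum(1 for t, c in zip(truth, calls) if t == 0 and c == 1)
    fn = sum(1 for t, c in zip(truth, calls) if t == 1 and c == 0)
    return {
        "sensitivity": tp / (tp + fn),  # fraction of true nodules detected
        "specificity": tn / (tn + fp),  # fraction of nodule-free images cleared
        "accuracy": (tp + tn) / len(truth),
    }


# Hypothetical toy data: 1 = nodule present, 0 = nodule absent.
truth   = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
unaided = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]  # one reader, unaided session
aided   = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]  # same reader, AI-aided session

m_unaided = reader_metrics(truth, unaided)
m_aided = reader_metrics(truth, aided)
accuracy_gain = m_aided["accuracy"] - m_unaided["accuracy"]
print(m_unaided, m_aided, round(accuracy_gain, 3))
```

In a multi-reader study such per-reader differences would then be averaged across readers, with confidence intervals estimated by a method that accounts for the paired design. That step is outside the scope of this sketch.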