Background Artificial Intelligence (AI) is a promising tool for cardiothoracic ratio (CTR) measurement that has been technically validated but not clinically evaluated on a large dataset. We observed and validated AI and manual methods for CTR measurement using a large dataset and investigated the clinical utility of the AI method. Methods Five thousand normal chest x-rays and 2,517 images with cardiomegaly and CTR values, were analyzed using manual, AI-assisted, and AI-only methods. AI-only methods obtained CTR values from a VGG-16 U-Net model. An in-house software was used to aid the manual and AI-assisted measurements and to record operating time. Intra and inter-observer experiments were performed on manual and AI-assisted methods and the averages were used in a method variation study. AI outcomes were graded in the AI-assisted method as excellent (accepted by both users independently), good (required adjustment), and poor (failed outcome). Bland–Altman plot with coefficient of variation (CV), and coefficient of determination (R-squared) were used to evaluate agreement and correlation between measurements. Finally, the performance of a cardiomegaly classification test was evaluated using a CTR cutoff at the standard (0.5), optimum, and maximum sensitivity. Results Manual CTR measurements on cardiomegaly data were comparable to previous radiologist reports (CV of 2.13% vs 2.04%). The observer and method variations from the AI-only method were about three times higher than from the manual method (CV of 5.78% vs 2.13%). AI assistance resulted in 40% excellent, 56% good, and 4% poor grading. AI assistance significantly improved agreement on inter-observer measurement compared to manual methods (CV; bias: 1.72%; − 0.61% vs 2.13%; − 1.62%) and was faster to perform (2.2 ± 2.4 secs vs 10.6 ± 1.5 secs). The R-squared and classification-test were not reliable indicators to verify that the AI-only method could replace manual operation. Conclusions AI alone is not yet suitable to replace manual operations due to its high variation, but it is useful to assist the radiologist because it can reduce observer variation and operation time. Agreement of measurement should be used to compare AI and manual methods, rather than R-square or classification performance tests.
Background Artificial Intelligence (AI) technique for cardiothoracic ratio (CTR) measurement is a promising tool that has been technically validated but not clinically evaluated on a large dataset. This study observes and validates AI and manual methods for CTR measurement on a large dataset and investigates the clinical utility of the AI method. Results Five thousand normal chest x-rays and 2,517 images with cardiomegaly and CTR values, were analyzed using manual, AI-assisted, and AI only methods. AI methods obtained CTR values from a VGG-16 U-Net model. An in-house software was used to aid the study and to record measurement time. Intra and inter-observer experiments were performed on manual and AI-assisted methods and the average of each method was employed in a method variation study. AI outcomes were graded in the AI-assisted method as excellent (accepted by both users independently), good (required adjustment), and poor (failed outcome). Bland-Altman plot with coefficient of variation (CV), and coefficient of determination (R-squared) were employed to evaluate agreement and correlation between measurements. Finally, the performance of a cardiomegaly classification test was evaluated using a CTR cutoff at the standard (0.5), optimum, and maximum sensitivity. Manual CTR measurements on cardiomegaly data were comparable to the previous radiologist reports (CV of 2.13% vs 2.04%). The observer and method variations from the AI method were about three times higher than from the manual method (CV of 5.78% vs 2.13%). AI assistance resulted in 40% excellent, 56% good, and 4% poor grading. AI assistance significantly improved agreement on inter-observer measurement compared to manual methods (CV; bias: 1.72%; -0.61% vs 2.13%; -1.62%) and was faster to perform (2.2 ± 2.4 secs vs 10.6 ± 1.5 secs). R-squared and classification-test were not reliable indicators to verify that the AI method could replace manual operation. Conclusion AI alone is not suitable to replace manual operation due to its high variation, but it is useful to assist the radiologist because it can reduce observer variation and operation time. Agreement of measurement should be used to compare AI and manual methods, rather than R-square or classification performance tests.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.