Background: Artificial intelligence (AI) for echocardiography requires training and validation to standards expected of humans. We developed an online platform and established the Unity Collaborative to build a dataset of expertise from 17 hospitals for training, validation, and standardization of such techniques. Methods: The training dataset consisted of 2056 individual frames drawn at random from 1265 parasternal long-axis video-loops of patients undergoing clinical echocardiography in 2015 to 2016. Nine experts labeled these images using our online platform. From this, we trained a convolutional neural network to identify keypoints. Subsequently, 13 experts labeled a validation dataset of the end-systolic and end-diastolic frame from 100 new video-loops, twice each. The 26-opinion consensus was used as the reference standard. The primary outcome was precision SD, the SD of the differences between AI measurement and expert consensus. Results: In the validation dataset, the AI’s precision SD for left ventricular internal dimension was 3.5 mm. For context, precision SD of individual expert measurements against the expert consensus was 4.4 mm. Intraclass correlation coefficient between AI and expert consensus was 0.926 (95% CI, 0.904–0.944), compared with 0.817 (0.778–0.954) between individual experts and expert consensus. For interventricular septum thickness, precision SD was 1.8 mm for AI (intraclass correlation coefficient, 0.809; 0.729–0.967), versus 2.0 mm for individuals (intraclass correlation coefficient, 0.641; 0.568–0.716). For posterior wall thickness, precision SD was 1.4 mm for AI (intraclass correlation coefficient, 0.535 [95% CI, 0.379–0.661]), versus 2.2 mm for individuals (0.366 [0.288–0.462]). We present all images and annotations. This highlights challenging cases, including poor image quality and tapered ventricles. Conclusions: Experts at multiple institutions successfully cooperated to build a collaborative AI. 
This performed as well as individual experts. Future echocardiographic AI research should use a consensus of experts as a reference. Our collaborative welcomes new partners who share our commitment to publish all methods, code, annotations, and results openly.
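The primary outcome above, precision SD, is simply the sample SD of the signed differences between one method's measurements and the expert-consensus values. A minimal sketch of that calculation, using hypothetical left-ventricular-dimension values (the real dataset is not reproduced here):

```python
import statistics

def precision_sd(measurements, consensus):
    """SD of the signed differences between one rater's (or the AI's)
    measurements and the expert-consensus values, per the abstract's
    primary-outcome definition."""
    diffs = [m - c for m, c in zip(measurements, consensus)]
    return statistics.stdev(diffs)  # sample SD (n - 1 denominator)

# Hypothetical LV internal dimension measurements (mm):
ai = [48.0, 51.5, 45.2, 53.1, 49.8]
consensus = [47.1, 52.0, 44.0, 55.0, 50.5]
print(round(precision_sd(ai, consensus), 2))
```

Note that the mean of the signed differences (bias) is deliberately excluded: precision SD captures scatter around the consensus, not systematic offset.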
Background Left ventricular longitudinal strain has been reported to deliver reproducibility, sensitivity, and prognostic value over and above ejection fraction. However, it currently relies on uninspectable proprietary algorithms and suffers from a lack of widespread clinical use. Uptake may be improved by increasing user trust through greater transparency. Purpose We therefore developed a machine-learning-based method, trained and validated with accredited experts from our AI Echocardiography Collaborative. We make the dataset, code, and trained network freely available under an open-source license. Methods AI enables strain to be calculated without relying on speckle tracking, by directly locating key points and borders across frames. Strain can then be calculated as the fractional shortening of the left ventricular perimeter. We first curated a dataset of 7523 images, including 2587 apical four-chamber views, each labelled by a single expert from our collaboration of 17 hospitals, using our online platform (Figure 1). Using both this dataset and a semi-supervised approach, we trained a 3D convolutional neural network to identify the annulus, apex, and endocardial border throughout the cardiac cycle. Separately, we constructed an external validation dataset of 100 apical four-chamber video-loops. The systolic and diastolic frames were identified, and each image was separately labelled by 11 experts. From these labels we then derived the expert consensus strain for each of the 100 video-loops. These experts also ordered all 100 echocardiograms by their visual grading of left ventricular longitudinal function. Finally, a single expert calculated strain using two different proprietary commercial packages (A and B). Results Consensus strain measurements (obtained by averaging the individual assessments of the 11 experts) across the 100 cases ranged from −4% to −27%, with strong correlations with the individual experts and machine methods (Figure 2).
Using each case's consensus across experts as the gold standard, the median error from consensus was 3.1% for individual experts, 3.4% for Proprietary A, 2.6% for Proprietary B, and 2.6% for our AI. Using the visual grading of longitudinal function as the reference, the 11 individual experts and 4 machine methods each showed significant correlation: coefficients ranged from 0.55 to 0.69 for experts, and were 0.68 for Proprietary A, 0.69 for Proprietary B, and 0.69 for our AI. Conclusions Our open-source, vendor-independent AI-based strain measure automatically produces values that agree with the expert consensus as strongly as the individual experts do. It also agrees with the subjective visual ranking of longitudinal function. Our open-source AI strain performs at least as well as closed-source speckle-tracking approaches and may enable increased clinical and research use of longitudinal strain. Funding Acknowledgement Type of funding sources: Public grant(s) – National budget only. Main funding source(s): NIHR Imperial BRC ITMAT. Dr Howard was additionally funded by Wellcome.
Figure 1. Collaborative online platform
Figure 2. Correlations between strain methods
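The methodological core of this abstract is that strain is computed as the fractional shortening of the left ventricular perimeter between diastole and systole, rather than by speckle tracking. A minimal sketch of that step, under the assumption that the network has already produced perimeter lengths for the end-diastolic and end-systolic frames (the values below are hypothetical):

```python
def longitudinal_strain(perimeter_ed_mm, perimeter_es_mm):
    """Fractional shortening of the LV endocardial perimeter between
    end-diastole (ED) and end-systole (ES), as a percentage.
    Shortening gives a negative value, matching the strain convention
    in the abstract's reported range of -4% to -27%."""
    return 100.0 * (perimeter_es_mm - perimeter_ed_mm) / perimeter_ed_mm

# Hypothetical perimeters traced along annulus, border, and apex (mm):
print(longitudinal_strain(160.0, 132.8))  # negative: perimeter shortens
```

How the perimeter itself is traced from the annulus, apex, and endocardial-border keypoints is a detail of the trained network and is not reproduced here.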
Background and purpose Artificial intelligence (AI) has the potential to greatly improve the efficiency and reproducibility of quantification in echocardiography, but to gain widespread use it must both meet expert standards of excellence and have a transparent methodology. We developed an online platform to enable multiple collaborators to annotate medical images for training and validating neural networks. Methods Using our online collaborative platform, 9 expert echocardiographers labelled the 2056 images that comprised the training dataset. They labelled the four points from which the standard parasternal long-axis (PLAX) measurements (interventricular septum, posterior wall, left ventricular internal dimension) would be made. Using these labelled images, we trained a 2D convolutional neural network to replicate these labels. Separately, we curated an external validation dataset of the systolic and diastolic frames of 100 PLAX acquisitions. Each of these images was labelled twice by 13 different experts, and the average of the 26 measurements was taken as the consensus standard. We then compared the individual experts' and the AI's measurements on the external validation dataset against the consensus standard, and calculated the precision standard deviation (SD) of the signed differences from the consensus standard. Results For diastolic septum thickness, the AI had a precision SD of 1.8 mm (ICC 0.81; 95% CI 0.73 to 0.97), compared with 2.0 mm for the individual experts (ICC 0.64; 95% CI 0.57 to 0.72). For diastolic posterior wall thickness, the AI had a precision SD of 1.4 mm (ICC 0.54; 95% CI 0.38 to 0.66), and the individual experts 2.2 mm (ICC 0.37; 95% CI 0.29 to 0.46). The AI's precision SD for left ventricular internal dimension was 3.5 mm (ICC 0.93; 95% CI 0.90 to 0.94), and for individual experts was 4.4 mm (ICC 0.82; 95% CI 0.78 to 0.95). Both the experts and the AI performed better in diastole than in systole (precision SD: AI 2.5 mm vs 4.3 mm, p<0.0001; experts 3.3 mm vs 5.3 mm, p<0.0001).
Conclusions AI trained by a group of echocardiography experts was able to perform PLAX measurements that matched the reference standard more closely than any individual expert's own measurements did. This open, collaborative approach may be a model for the development of AI that is explainable to, and trusted by, clinicians. Funding Acknowledgement Type of funding sources: Public grant(s) – National budget only. Main funding source(s): NIHR Imperial BRC ITMAT. Dr Howard was additionally funded by Wellcome.
Figure. Online collaborative platform
Figure. Results of AI and experts
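These abstracts report intraclass correlation coefficients without naming the variant used. As an illustrative sketch only, assuming a two-way random-effects, absolute-agreement, single-rater ICC(2,1), the coefficient can be computed from scratch for an n-subjects-by-k-raters matrix (all data values below are hypothetical):

```python
def icc2_1(data):
    """Two-way random-effects, absolute-agreement, single-rater ICC(2,1),
    for a list of n subject rows, each with k rater scores.
    Sketch only: the abstracts do not specify which ICC variant they used."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(data[i][j] for i in range(n)) / n for j in range(k)]
    sst = sum((x - grand) ** 2 for row in data for x in row)
    ssr = k * sum((m - grand) ** 2 for m in row_means)  # between subjects
    ssc = n * sum((m - grand) ** 2 for m in col_means)  # between raters
    sse = sst - ssr - ssc                               # residual
    msr = ssr / (n - 1)
    msc = ssc / (k - 1)
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical two-rater data (mm):
print(icc2_1([[4.0, 4.0], [5.0, 5.0], [6.0, 6.0]]))  # perfect agreement: 1.0
print(icc2_1([[4.0, 5.0], [5.0, 6.0], [6.0, 7.0]]))  # constant 1 mm bias
```

The second example shows why the absolute-agreement form matters here: a rater with a constant offset from the consensus is penalized even though the subject ranking is identical.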