Objectives: To create a deep learning algorithm capable of video classification, using a long short-term memory (LSTM) network, to analyze collapsibility of the inferior vena cava (IVC) and predict fluid responsiveness in critically ill patients.

Methods: We used a data set of IVC ultrasound (US) videos to train the LSTM network. The data set was created from IVC US videos of spontaneously breathing critically ill patients undergoing intravenous fluid resuscitation as part of 2 prior prospective studies. We randomly selected 90% of the IVC videos to train the LSTM network and 10% to test its ability to predict fluid responsiveness. Fluid responsiveness was defined as a greater than 10% increase in the cardiac index after a 500-mL fluid bolus, as measured by bioreactance.

Results: We analyzed 211 videos from 175 critically ill patients: 191 to train the LSTM network and 20 to test it. Using standard data augmentation techniques, we increased our sample size from 191 to 3820 videos. Of the 175 patients, 91 (52%) were fluid responders. The LSTM network predicted fluid responsiveness moderately well, with an area under the receiver operating characteristic curve of 0.70 (95% confidence interval [CI], 0.43-1.00), a positive likelihood ratio of infinity, and a negative likelihood ratio of 0.3 (95% CI, 0.12-0.77). In comparison, point-of-care US experts using offline video review and manual diameter measurement via software caliper tools achieved an area under the receiver operating characteristic curve of 0.94 (95% CI, 0.83-0.99).

Conclusions: We demonstrated that an LSTM network can be trained on IVC US videos to classify IVC collapse and predict fluid responsiveness. Our LSTM network performed moderately well given the small training cohort but worse than point-of-care US experts. Further training and testing of the LSTM network with larger data sets are warranted.
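For readers unfamiliar with this class of model, the sketch below shows how a per-frame encoder and an LSTM can be combined for binary video classification. It is a minimal illustration, not the authors' implementation: the framework (PyTorch), the encoder layout, and all layer sizes are assumptions.

```python
# Hypothetical sketch of an LSTM video classifier for fluid responsiveness.
# Architecture details (frame encoder, hidden size, framework) are assumptions,
# not the published implementation.
import torch
import torch.nn as nn

class IVCVideoClassifier(nn.Module):
    def __init__(self, feature_dim=512, hidden_dim=128, num_classes=2):
        super().__init__()
        # Per-frame encoder: a small CNN mapping each grayscale US frame
        # to a fixed-length feature vector (assumed design).
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
            nn.Flatten(),
            nn.Linear(32 * 4 * 4, feature_dim), nn.ReLU(),
        )
        # The LSTM aggregates frame features over time, capturing the
        # respirophasic collapse of the IVC across the clip.
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)  # responder vs. non-responder

    def forward(self, video):
        # video: (batch, time, 1, H, W)
        b, t, c, h, w = video.shape
        feats = self.encoder(video.reshape(b * t, c, h, w)).reshape(b, t, -1)
        _, (h_n, _) = self.lstm(feats)   # h_n: (num_layers, batch, hidden_dim)
        return self.head(h_n[-1])        # class logits from the final hidden state

# Example: a batch of 2 clips, 60 frames each, 112 x 112 pixels.
logits = IVCVideoClassifier()(torch.randn(2, 60, 1, 112, 112))
print(logits.shape)  # torch.Size([2, 2])
```

Summarizing the clip with the LSTM's final hidden state is one common design choice; pooling or attention over all time steps is an alternative.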
Objectives: Little is known about optimal deep learning (DL) approaches for point-of-care ultrasound (POCUS) applications. We compared 6 popular DL architectures for POCUS cardiac image classification to determine whether an optimal DL architecture exists for future DL algorithm development in POCUS.

Methods: We trained 6 convolutional neural networks (CNNs) spanning a range of complexities and ages (AlexNet, VGG-16, VGG-19, ResNet50, DenseNet201, and Inception-v4). Each CNN was trained using images of 5 typical POCUS cardiac views. Images were extracted from 225 publicly available deidentified POCUS cardiac videos. A total of 750,018 individual images were extracted, with 90% used for model training and 10% for cross-validation. Training time and accuracy were tracked. A real-world test of the algorithms was performed on a set of 125 completely new cardiac images. Descriptive statistics, Pearson R values, and κ values were calculated for each CNN.

Results: Accuracy ranged from 85.6% to 96% correct across the 6 CNNs. VGG-16, one of the oldest and simplest CNNs, performed best at 96% correct with 232 minutes to train (R = 0.97; κ = 0.95; P < .00001). The worst-performing CNN was the newer DenseNet201, with 85.6% accuracy and 429 minutes to train (R = 0.92; κ = 0.82; P < .00001).

Conclusions: Six common image classification DL algorithms showed considerable variability in accuracy and training time when trained and tested on identical data, suggesting that not all will perform optimally for POCUS DL applications. Contrary to the well-established accuracies of these CNNs on standard benchmarks, the more modern and deeper algorithms yielded poorer results.
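The comparison above amounts to swapping backbones while holding the data and task fixed. A minimal sketch of that setup follows, assuming torchvision; Inception-v4 is omitted because it is distributed through timm rather than torchvision, and weights=None is used so the snippet runs offline, whereas in practice one would load ImageNet-pretrained weights.

```python
# Illustrative sketch (not the authors' code) of benchmarking several
# torchvision CNNs on a 5-class cardiac-view task.
import time
import torch
import torch.nn as nn
from torchvision import models

NUM_VIEWS = 5  # 5 typical POCUS cardiac views (specific views assumed)

def build(name):
    """Load a backbone and replace its classifier head for NUM_VIEWS classes."""
    # weights=None keeps this runnable offline; use weights="DEFAULT" to
    # start from ImageNet-pretrained parameters.
    if name == "alexnet":
        m = models.alexnet(weights=None)
        m.classifier[6] = nn.Linear(4096, NUM_VIEWS)
    elif name == "vgg16":
        m = models.vgg16(weights=None)
        m.classifier[6] = nn.Linear(4096, NUM_VIEWS)
    elif name == "vgg19":
        m = models.vgg19(weights=None)
        m.classifier[6] = nn.Linear(4096, NUM_VIEWS)
    elif name == "resnet50":
        m = models.resnet50(weights=None)
        m.fc = nn.Linear(m.fc.in_features, NUM_VIEWS)
    elif name == "densenet201":
        m = models.densenet201(weights=None)
        m.classifier = nn.Linear(m.classifier.in_features, NUM_VIEWS)
    return m

for name in ["alexnet", "vgg16", "vgg19", "resnet50", "densenet201"]:
    model = build(name).eval()
    start = time.perf_counter()
    with torch.no_grad():
        logits = model(torch.randn(1, 3, 224, 224))  # one dummy frame
    print(f"{name}: output {tuple(logits.shape)}, "
          f"{time.perf_counter() - start:.2f}s forward pass")
```

Because only the classifier head changes, any accuracy and training-time differences observed across the loop reflect the backbones themselves, which is the comparison the study makes.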
Objectives: Deep learning for medical imaging analysis uses convolutional neural networks (CNNs) pretrained on ImageNet (Stanford Vision Lab, Stanford, CA). Little is known about how such color- and scene-rich standard training images compare quantitatively to medical images. We sought to quantitatively compare ImageNet images to point-of-care ultrasound (POCUS), computed tomographic (CT), magnetic resonance (MR), and chest x-ray (CXR) images.

Methods: Using a quantitative image quality assessment technique (the Blind/Referenceless Image Spatial Quality Evaluator, BRISQUE), we compared images based on pixel complexity, relationships, variation, and distinguishing features. We compared 5500 ImageNet images to 2700 CXR, 2300 CT, 1800 MR, and 18,000 POCUS images. Image quality scores range from 0 (best) to 100 (worst). A 1-way analysis of variance was performed, and the standardized mean-difference effect size (d) was calculated.

Results: ImageNet images showed the best image quality rating at 21.7 (95% confidence interval [CI], 0.41), except for CXR at 13.2 (95% CI, 0.28), followed by CT at 35.1 (95% CI, 0.79), MR at 31.6 (95% CI, 0.75), and POCUS at 56.6 (95% CI, 0.21). The differences between ImageNet and all of the medical images were statistically significant (P ≤ .000001). The greatest difference in image quality was between ImageNet and POCUS (d = 2.38).

Conclusions: Point-of-care ultrasound (US) image quality differs significantly from that of ImageNet and other medical images. This has considerable implications for CNN training with medical images across applications, and may be even more significant for US images. Ultrasound deep learning developers should consider pretraining networks from scratch on US images, as training techniques used for CT, CXR, and MR images may not apply to US.
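As a sketch of this kind of analysis, the snippet below scores image sets with a BRISQUE implementation and compares groups with a 1-way ANOVA and a pooled-SD effect size. It assumes the piq library's BRISQUE, which may differ numerically from the implementation used in the study, and it runs on random placeholder arrays rather than real ImageNet or POCUS data.

```python
# Hedged sketch: BRISQUE scoring (lower = better quality) plus a 1-way
# ANOVA and a standardized mean-difference effect size between image sets.
import numpy as np
import torch
import piq  # assumed BRISQUE implementation; not necessarily the study's
from scipy.stats import f_oneway

def brisque_scores(images):
    """images: iterable of HxW grayscale arrays scaled to [0, 1]."""
    scores = []
    for img in images:
        x = torch.from_numpy(img).float()[None, None]  # (1, 1, H, W)
        scores.append(piq.brisque(x, data_range=1.0).item())
    return np.array(scores)

def cohens_d(a, b):
    """Standardized mean difference using a pooled standard deviation."""
    pooled = np.sqrt(((len(a) - 1) * a.var(ddof=1) +
                      (len(b) - 1) * b.var(ddof=1)) / (len(a) + len(b) - 2))
    return (a.mean() - b.mean()) / pooled

# Random stand-ins for the real ImageNet and POCUS image sets.
rng = np.random.default_rng(0)
imagenet = brisque_scores(rng.random((10, 96, 96)))
pocus = brisque_scores(rng.random((10, 96, 96)))

f_stat, p_value = f_oneway(imagenet, pocus)
print(f"ANOVA F={f_stat:.2f}, p={p_value:.4f}, d={cohens_d(pocus, imagenet):.2f}")
```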
Objectives: We sought to create a deep learning algorithm to determine the degree of inferior vena cava (IVC) collapsibility in critically ill patients to support novice point-of-care ultrasound (POCUS) providers.

Methods: We used a publicly available long short-term memory (LSTM) deep learning architecture, which can track temporal changes and relationships in real-time video, to create an algorithm for ultrasound video analysis. The algorithm was trained on public-domain IVC ultrasound videos to improve its ability to recognize changes in varied ultrasound video. A total of 220 IVC videos were used; 10% of the data were randomly selected for cross-validation during training. Data were augmented through video rotation and manipulation to multiply the effective quantity of training data. After training, the algorithm was tested on 50 new IVC ultrasound videos obtained from public-domain sources that were not part of the data set used for training or cross-validation. Fleiss' κ was calculated to compare the level of agreement among the 3 POCUS experts and between the deep learning algorithm and the POCUS experts.

Results: There was substantial agreement among the 3 POCUS experts, with κ = 0.65 (95% CI, 0.49-0.81). Agreement between the experts and the algorithm was moderate, with κ = 0.45 (95% CI, 0.33-0.56).

Conclusions: Our algorithm showed good agreement with POCUS experts in visually estimating the degree of IVC collapsibility, which has been shown in previously published studies to differentiate fluid-responsive from fluid-unresponsive septic shock patients. Such an algorithm could be adapted to run in real time on any ultrasound machine with a video output, easing the burden on novice POCUS users by limiting their task to obtaining and maintaining a sagittal proximal IVC view and allowing the artificial intelligence to make real-time determinations.
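The agreement statistic reported here is straightforward to reproduce; a minimal sketch with statsmodels follows, using invented placeholder ratings (the collapsibility category cutoffs in the comments are assumptions, not the study's bins).

```python
# Minimal sketch of the agreement analysis: Fleiss' kappa across 3 raters
# assigning each video to a collapsibility category. The ratings below are
# invented placeholders, not the study's data.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Rows = videos, columns = raters; values are collapsibility categories,
# e.g., 0 = <15%, 1 = 15-50%, 2 = >50% (cutoffs are assumptions).
ratings = np.array([
    [2, 2, 2],
    [1, 1, 2],
    [0, 0, 0],
    [1, 2, 1],
    [0, 1, 0],
])

# aggregate_raters converts per-rater labels into the subjects x categories
# count table that fleiss_kappa expects as input.
table, _ = aggregate_raters(ratings)
print(f"Fleiss' kappa = {fleiss_kappa(table):.2f}")
```

The same computation applies whether the raters are three human experts or two experts plus the algorithm, which is how the study compares expert-expert and expert-algorithm agreement.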