The use of machine learning to develop intelligent software tools for the interpretation of radiology images has gained widespread attention in recent years. The development, deployment, and eventual adoption of these models in clinical practice, however, remains fraught with challenges. In this paper, we propose a list of key considerations that machine learning researchers must recognize and address to make their models accurate, robust, and usable in practice. We discuss insufficient training data, decentralized data sets, the high cost of annotations, ambiguous ground truth, imbalance in class representation, asymmetric misclassification costs, relevant performance metrics, generalization of models to unseen data sets, model decay, adversarial attacks, explainability, fairness and bias, and clinical validation. We describe each consideration and identify the techniques used to address it. Although these techniques have been discussed in prior research, by freshly examining them in the context of medical imaging and compiling them into a single checklist, we hope to make them more accessible to researchers, software developers, radiologists, and other stakeholders.
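To illustrate why class imbalance and metric choice (two of the considerations above) interact, here is a small sketch, not drawn from the paper: at a 5% disease prevalence, a degenerate classifier that always predicts "negative" achieves high accuracy while detecting no positive cases at all.

```python
def evaluate(labels, preds):
    """Return (accuracy, sensitivity) for binary labels/predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    pos = sum(labels)
    accuracy = (tp + tn) / len(labels)
    sensitivity = tp / pos if pos else 0.0
    return accuracy, sensitivity

labels = [1] * 5 + [0] * 95   # 5% prevalence, e.g. a rare finding
always_negative = [0] * 100   # degenerate "majority class" model

acc, sens = evaluate(labels, always_negative)
print(acc, sens)  # 0.95 accuracy, 0.0 sensitivity
```

This is one reason sensitivity, specificity, and AUROC are generally preferred over raw accuracy for imbalanced medical imaging data sets.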
In this paper, we compare three privacy-preserving distributed learning techniques: federated learning, split learning, and SplitFed. We use these techniques to develop binary classification models for detecting tuberculosis from chest X-rays and compare them in terms of classification performance, communication and computational costs, and training time. We propose a novel distributed learning architecture called SplitFedv3, which performs better than split learning and SplitFedv2 in our experiments. We also propose alternate mini-batch training, a new training technique for split learning, which performs better than alternate client training, in which clients take turns training the model.
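The core server-side step of federated learning, the first of the techniques compared above, can be sketched as follows. This is a minimal illustration of weighted federated averaging, with hypothetical weight vectors and client sizes, not the paper's actual implementation:

```python
def fedavg(client_weights, client_sizes):
    """Average per-client weight vectors, weighted by local data set size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Two hypothetical clients holding 100 and 300 chest X-rays, respectively:
global_w = fedavg([[1.0, 2.0], [3.0, 4.0]], [100, 300])
print(global_w)  # [2.5, 3.5]
```

Each client trains locally on its own data and sends only model weights to the server, so raw images never leave the institution; split learning instead partitions the network itself between client and server.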
Background
Computed tomographic pulmonary angiography (CTPA) is the diagnostic standard for confirming pulmonary embolism (PE). Since PE is a life-threatening condition, early diagnosis and treatment are critical to avoid PE-associated morbidity and mortality. However, PE remains subject to misdiagnosis.
Methods
We retrospectively identified 251 CTPAs performed at a tertiary care hospital between January 2018 and January 2021. The scans were classified as positive (n = 55) or negative (n = 196) for PE based on annotations made by board-certified radiologists. Fully anonymized CT slices served as input for the detection of PE by a 2D segmentation model comprising a U-Net architecture with an Xception encoder. The diagnostic performance of the model was calculated at both the scan and the slice level.
Results
The model correctly identified 44 out of 55 scans as positive for PE and 146 out of 196 scans as negative for PE, with a sensitivity of 0.80 [95% CI 0.68, 0.89], a specificity of 0.74 [95% CI 0.68, 0.80], and an accuracy of 0.76 [95% CI 0.70, 0.81]. At the slice level, the model correctly identified 4817 out of 5183 positive slices as containing emboli, with a specificity of 0.89 [95% CI 0.88, 0.89], a sensitivity of 0.93 [95% CI 0.92, 0.94], and an accuracy of 0.89 [95% CI 0.887, 0.890]. The model also achieved an AUROC of 0.85 [0.78, 0.90] and 0.94 [0.936, 0.941] at the scan and slice level, respectively, for the detection of PE.
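The scan-level figures reported above follow directly from the underlying confusion-matrix counts (TP = 44, FN = 11, TN = 146, FP = 50), which a reader can verify with a short calculation:

```python
def scan_metrics(tp, fn, tn, fp):
    """Compute (sensitivity, specificity, accuracy), rounded to 2 decimals."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return round(sensitivity, 2), round(specificity, 2), round(accuracy, 2)

# Scan-level counts from the study: 44/55 positives, 146/196 negatives
print(scan_metrics(44, 11, 146, 50))  # (0.8, 0.74, 0.76)
```

The same arithmetic applied to the slice-level counts (4817 of 5183 positive slices detected) reproduces the reported sensitivity of 0.93.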
Conclusion
The development of such an AI model and its use for the identification of pulmonary embolism could support healthcare workers by reducing the rate of missed findings and minimizing the time required to screen scans.