To evaluate the performance of a deep learning-based algorithm for automatic detection and labeling of rib fractures from multicenter chest CT images. Materials and Methods: This retrospective study included 10 943 patients (mean age, 55 years; 6418 men) from six hospitals (January 1, 2017 to December 30, 2019), consisting of patients with and without rib fractures who underwent CT. The patients were separated into one training set (n = 2425), two lesion-level test sets (n = 362 and 105), and one examination-level test set (n = 8051). Free-response receiver operating characteristic (FROC) score (mean sensitivity at seven different false-positive rates), precision, sensitivity, and F1 score were used as metrics to assess rib fracture detection performance. Area under the receiver operating characteristic curve (AUC), sensitivity, and specificity were employed to evaluate classification accuracy. The mean Dice coefficient and accuracy were used to assess the performance of rib labeling. Results: In the detection of rib fractures, the model achieved an FROC score of 84.3% on test set 1.
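For reference, below is a minimal sketch of how an FROC score of this kind can be computed from detection outputs. The function name, argument layout, and the seven false-positives-per-scan rates (1/8 through 8, as commonly used in lesion-detection challenges) are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def froc_score(confidences, is_true_positive, n_true_lesions, n_scans,
               fp_rates=(0.125, 0.25, 0.5, 1, 2, 4, 8)):
    """Mean sensitivity over a set of false-positives-per-scan rates (assumed values)."""
    order = np.argsort(-np.asarray(confidences, dtype=float))  # sort predictions by descending score
    hits = np.asarray(is_true_positive, dtype=bool)[order]
    tp = np.cumsum(hits)                 # cumulative true positives as the threshold is lowered
    fp = np.cumsum(~hits)                # cumulative false positives
    sensitivity = tp / n_true_lesions
    fps_per_scan = fp / n_scans
    sens_at_rates = []
    for rate in fp_rates:
        allowed = fps_per_scan <= rate   # operating points within this false-positive budget
        sens_at_rates.append(sensitivity[allowed].max() if allowed.any() else 0.0)
    return float(np.mean(sens_at_rates))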
With the urgent demand for generalized deep models, many pre-trained big models have been proposed, such as bidirectional encoder representations from transformers (BERT), the vision transformer (ViT), and generative pre-trained transformers (GPT). Inspired by the success of these models in single domains (such as computer vision and natural language processing), multi-modal pre-trained big models have also drawn increasing attention in recent years. In this work, we give a comprehensive survey of these models and hope this paper provides new insights and helps new researchers track the most cutting-edge work. Specifically, we first introduce the background of multi-modal pre-training by reviewing conventional deep learning and pre-training work in natural language processing, computer vision, and speech. Then, we introduce the task definition, key challenges, and advantages of multi-modal pre-training models (MM-PTMs), and discuss MM-PTMs with a focus on data, objectives, network architectures, and knowledge-enhanced pre-training. After that, we introduce the downstream tasks used for the validation of large-scale MM-PTMs, including generative, classification, and regression tasks. We also give a visualization and analysis of the model parameters and results on representative downstream tasks. Finally, we point out possible research directions for this topic that may benefit future work. In addition, we maintain a continuously updated paper list for large-scale pre-trained multi-modal big models: https://github.com/wangxiao5791509/MultiModal_BigModels_Survey.
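As a concrete illustration of one widely used pre-training objective covered by such surveys, the sketch below shows a CLIP-style symmetric image-text contrastive (InfoNCE) loss in PyTorch. The function name and temperature value are illustrative assumptions and do not correspond to any specific MM-PTM discussed in the survey.

```python
import torch
import torch.nn.functional as F

def clip_style_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings."""
    image_emb = F.normalize(image_emb, dim=-1)        # unit-norm image features
    text_emb = F.normalize(text_emb, dim=-1)          # unit-norm text features
    logits = image_emb @ text_emb.t() / temperature   # pairwise cosine similarities
    targets = torch.arange(logits.size(0), device=logits.device)  # matched pairs lie on the diagonal
    loss_i2t = F.cross_entropy(logits, targets)       # image -> matching text
    loss_t2i = F.cross_entropy(logits.t(), targets)   # text  -> matching image
    return (loss_i2t + loss_t2i) / 2
```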
Introduction: Early breast carcinomas can be effectively diagnosed and controlled. However, this demands extra work: radiologists in China often work overtime because of high patient volumes, and even experienced readers can make mistakes after overloaded work. To improve efficiency and reduce the rate of misdiagnosis, automatic breast diagnosis on magnetic resonance imaging (MRI) images is vital yet challenging for breast disease screening and successful treatment planning. Several obstacles hinder the development of automatic approaches, such as class imbalance of samples and hard mimics of lesions. In this paper, we propose a coarse-to-fine algorithm to address these problems of automatic breast diagnosis on multi-series MRI images. The algorithm uses deep learning techniques to provide breast segmentation, tumor segmentation, and tumor classification, thus supporting doctors' decisions in clinical practice.

Methods: In the proposed algorithm, a DenseUNet is first employed to extract breast-related regions by removing irrelevant parts of the thoracic cavity. Then, by taking advantage of the attention mechanism and the focal loss, a novel network named Attention Dense UNet (ADUNet) is designed for tumor segmentation. In particular, the focal loss in ADUNet addresses the class-imbalance problem and prevents easy samples from overwhelming the model. Finally, a customized network is developed for tumor classification. In addition, while most approaches consider only one or two series, the proposed algorithm takes multiple series of MRI images into account.

Results: Extensive experiments were carried out to evaluate the algorithm's performance on 435 multi-series MRI volumes from 87 patients collected from Tongji Hospital. In the dataset, all cases contain benign tumors, malignant tumors, or both, covering carcinoma, fibroadenoma, cyst, and abscess. Tumor ground truths were labeled by two radiologists with 3 years of experience in breast MRI reporting, who drew tumor contours slice by slice. ADUNet is compared quantitatively with other prevalent deep-learning methods on tumor segmentation and achieves the best performance, with a Case Dice Score of 0.748 and a Global Dice Score of 0.801. Moreover, the customized classification network outperforms two CNN-M based models and achieves tumor-level and case-level AUCs of 0.831 and 0.918, respectively.

Discussion: All data in this paper were collected from the same MRI device, so it is reasonable to assume that they come from the same domain and are independent and identically distributed. Whether the proposed algorithm is robust enough in a multi-source setting remains an open question. Each stage of the proposed algorithm is trained separately, which makes each stage more robust and faster to converge. However, this training strategy treats each stage as a separate task and does not take into account the relationships between tasks.
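To illustrate how a focal loss counters class imbalance by down-weighting easy background voxels in tumor segmentation, here is a minimal PyTorch sketch. The function name and the gamma/alpha values are illustrative assumptions and not the exact configuration used in ADUNet.

```python
import torch
import torch.nn.functional as F

def binary_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Binary focal loss for voxel-wise tumor segmentation (assumed gamma/alpha)."""
    # Per-voxel binary cross-entropy computed on raw logits.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = torch.exp(-bce)                                    # probability assigned to the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)  # class-balancing weight
    # (1 - p_t)^gamma down-weights easy, well-classified voxels so the many
    # background voxels do not overwhelm the rare tumor voxels.
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()
```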