Abstract: The outbreak of coronavirus disease in 2020 was a nightmare for citizens, hospitals, healthcare practitioners, and the economy. The overwhelming number of confirmed and suspected cases posed an unprecedented challenge to hospitals' capacity for management and medical resource distribution. To reduce the possibility of cross-infection and attend to a patient according to his severity level, expert diagnosis and sophisticated medical examinations are often required but hard to fulfil during a p…
“…The same model behavior was observed in the attention ResNet by Zhou et al., which registered a 69.1% R² score [41]. Compared to these SC-attention modules, the proposed Squeeze-Channel attention layers form a fusion of various feature scales, resulting in a globally encoded representation.…”
Section: Results (supporting)
confidence: 74%
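The channel-attention mechanism contrasted in the snippet above can be illustrated with a minimal, squeeze-and-excitation-style sketch in numpy. This is not the authors' implementation: the function name and the bottleneck weights `w1`, `w2` are hypothetical stand-ins for learned parameters.

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Sketch of channel attention on a (C, H, W) feature map."""
    # Squeeze: global average pool over spatial dims -> per-channel stats.
    z = feat.mean(axis=(1, 2))                                   # (C,)
    # Excitation: tiny bottleneck MLP (ReLU then sigmoid gate).
    s = 1.0 / (1.0 + np.exp(-(np.maximum(z @ w1, 0.0) @ w2)))    # (C,)
    # Reweight each channel by its gate in (0, 1).
    return feat * s[:, None, None]
```

Because the gate is a sigmoid, the output never exceeds the input in magnitude; the attention only rescales channels.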
“… [37] Encoder-Decoder CNN: 0.636, 0.457, 0.209, 0.206
5. Naeem et al. [30], CNN-LSTM autoencoder on SIFT, GIST features: 0.684, 0.441, 0.195, 0.190
6. Zhou et al. [41], Spatial-channel attention residual network: 0.691, 0.437, 0.191, 0.188
7. Mohammed et al. [42], Spatial-channel attention CNN-LSTM: 0.720, 0.427, 0.183, 0.182
8. Chatzitofis et al.…”
Section: Results (mentioning)
confidence: 99%
“…Zhou et al. proposed attention-based multi-modality feature-fusion learning for severity prediction [41]. Mohammed et al.…”
“…The proposed methods outperformed the conventional CCA methods by learning non-linear features and a supervised correlated space. Zhou et al. [39] designed two similarity losses to enforce the learning of modality-shared information. Specifically, a cosine similarity loss was used to supervise the features learned from the two modalities, and a hetero-center distance loss was designed to penalize the distance between the centers of the clinical features and the CT features belonging to each class.…”
Section: Subspace-based Fusion Methods (mentioning)
confidence: 99%
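The two losses attributed to Zhou et al. [39] above can be sketched in a few lines of numpy. This is a minimal illustration under the snippet's description only: the function names, the 1 − cosine form of the similarity loss, and the squared-distance form of the hetero-center term are assumptions, not the paper's exact formulation.

```python
import numpy as np

def cosine_similarity_loss(f_ct, f_clin):
    """Penalize low cosine similarity between paired CT and clinical
    feature vectors, encouraging modality-shared information."""
    num = np.sum(f_ct * f_clin, axis=1)
    den = np.linalg.norm(f_ct, axis=1) * np.linalg.norm(f_clin, axis=1) + 1e-8
    return float(np.mean(1.0 - num / den))

def hetero_center_loss(f_ct, f_clin, labels):
    """Penalize the distance between the per-class centers of the
    clinical features and the CT features."""
    classes = np.unique(labels)
    loss = 0.0
    for c in classes:
        center_ct = f_ct[labels == c].mean(axis=0)
        center_clin = f_clin[labels == c].mean(axis=0)
        loss += np.sum((center_ct - center_clin) ** 2)
    return float(loss / len(classes))
```

Both terms vanish when the two modalities' features (and hence their class centers) coincide, which is the alignment the losses are meant to enforce.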
“…Radiology imaging supports medical decisions by providing visible image contrasts inside the human body with radiant energy, including MRI, CT, positron emission tomography (PET), fMRI, X-ray, etc. To embed the intensity-standardized 2D or 3D radiology images into feature representations with learning-based encoders [16,24,34-36,87], conventional radiomics methods [24,34,35], or both [34,35], some reviewed works first applied skull-stripping [38], affine registration [38], foreground extraction [39], or lesion segmentation [20,34,35,38] to define the ROIs. The images were then resized or cropped to a smaller size for feature extraction.…”
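The preprocessing pipeline in the snippet (define an ROI from a segmentation, then resize or crop to a fixed input size before feature extraction) can be sketched as below; `crop_roi` and `resize_nearest` are hypothetical helpers for illustration, not code from the reviewed works.

```python
import numpy as np

def crop_roi(volume, mask):
    """Crop a scan to the bounding box of a segmented ROI mask."""
    idx = np.argwhere(mask)
    lo, hi = idx.min(axis=0), idx.max(axis=0) + 1
    slices = tuple(slice(a, b) for a, b in zip(lo, hi))
    return volume[slices]

def resize_nearest(img, out_shape):
    """Nearest-neighbour resize to a fixed encoder input size."""
    scales = [s / o for s, o in zip(img.shape, out_shape)]
    coords = np.indices(out_shape)
    src = tuple(
        (coords[d] * scales[d]).astype(int).clip(0, img.shape[d] - 1)
        for d in range(img.ndim)
    )
    return img[src]
```

Both helpers work for 2D slices and 3D volumes alike, since they index dimension by dimension.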
The rapid development of diagnostic technologies in healthcare is leading to higher requirements for physicians to handle and integrate the heterogeneous yet complementary data produced during routine practice. For instance, personalized diagnosis and treatment planning for a single cancer patient relies on various image data (e.g., radiological, pathological, and camera images) and non-image data (e.g., clinical and genomic data). However, such decision-making procedures can be subjective, qualitative, and subject to large inter-subject variability. With the recent advances in multi-modal deep learning technologies, an increasingly large number of efforts have been devoted to a key question: how do we extract and aggregate multi-modal information to ultimately provide more objective, quantitative computer-aided clinical decision making? This paper reviews the recent studies on this question. Briefly, this review includes (1) an overview of current multi-modal learning workflows, (2) a summary of multi-modal fusion methods, (3) a discussion of performance, (4) applications in disease diagnosis and prognosis, and (5) challenges and future directions.