Abstract: The outbreak of coronavirus disease in 2020 was a nightmare for citizens, hospitals, healthcare practitioners, and the economy. The overwhelming number of confirmed and suspected cases posed an unprecedented challenge to hospitals' capacity for management and medical resource distribution. To reduce the possibility of cross-infection and attend to a patient according to his severity level, expert diagnosis and sophisticated medical examinations are often required but hard to fulfil during a p…
“…The same model behavior was observed in the attention ResNet by Zhou et al., which registered a 69.1% R² score [41]. Compared to these SC-attention modules, the proposed Squeeze-Channel attention layers form a fusion of various feature scales, resulting in a globally encoded representation.…”
Section: Results (supporting)
confidence: 74%
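The channel-attention mechanism contrasted in the snippet above can be illustrated with a minimal, squeeze-and-excitation-style sketch in numpy. This is not the authors' implementation: the function name and the bottleneck weights `w1`, `w2` are hypothetical stand-ins for learned parameters.

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Sketch of channel attention on a (C, H, W) feature map."""
    # Squeeze: global average pool over spatial dims -> per-channel stats.
    z = feat.mean(axis=(1, 2))                                   # (C,)
    # Excitation: tiny bottleneck MLP (ReLU then sigmoid gate).
    s = 1.0 / (1.0 + np.exp(-(np.maximum(z @ w1, 0.0) @ w2)))    # (C,)
    # Reweight each channel by its gate in (0, 1).
    return feat * s[:, None, None]
```

Because the gate is a sigmoid, the output never exceeds the input in magnitude; the attention only rescales channels.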
“… [37] Encoder-Decoder CNN: 0.636, 0.457, 0.209, 0.206
5. Naeem et al. [30], CNN-LSTM autoencoder on SIFT, GIST features: 0.684, 0.441, 0.195, 0.190
6. Zhou et al. [41], Spatial-channel attention residual network: 0.691, 0.437, 0.191, 0.188
7. Mohammed et al. [42], Spatial-channel attention CNN-LSTM: 0.720, 0.427, 0.183, 0.182
8. Chatzitofis et al.…”
Section: Results (mentioning)
confidence: 99%
“…Zhou et al. proposed attention-based multi-modality feature-fusion learning for severity prediction [41]. Mohammed et al.…”
“…The proposed methods outperformed the conventional CCA methods by learning non-linear features and a supervised correlated space. Zhou et al. [39] designed two similarity losses to enforce the learning of modality-shared information. Specifically, a cosine similarity loss was used to supervise the features learned from the two modalities, and a hetero-center distance loss was designed to penalize the distance between the centers of the clinical features and the CT features belonging to each class.…”
Section: Subspace-based Fusion Methods (mentioning)
confidence: 99%
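The two losses attributed to Zhou et al. [39] above can be sketched in a few lines of numpy. This is a minimal illustration under the snippet's description only: the function names, the 1 − cosine form of the similarity loss, and the squared-distance form of the hetero-center term are assumptions, not the paper's exact formulation.

```python
import numpy as np

def cosine_similarity_loss(f_ct, f_clin):
    """Penalize low cosine similarity between paired CT and clinical
    feature vectors, encouraging modality-shared information."""
    num = np.sum(f_ct * f_clin, axis=1)
    den = np.linalg.norm(f_ct, axis=1) * np.linalg.norm(f_clin, axis=1) + 1e-8
    return float(np.mean(1.0 - num / den))

def hetero_center_loss(f_ct, f_clin, labels):
    """Penalize the distance between the per-class centers of the
    clinical features and the CT features."""
    classes = np.unique(labels)
    loss = 0.0
    for c in classes:
        center_ct = f_ct[labels == c].mean(axis=0)
        center_clin = f_clin[labels == c].mean(axis=0)
        loss += np.sum((center_ct - center_clin) ** 2)
    return float(loss / len(classes))
```

Both terms vanish when the two modalities' features (and hence their class centers) coincide, which is the alignment the losses are meant to enforce.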
“…Radiology imaging supports medical decisions by providing visible image contrasts inside the human body with radiant energy, including MRI, CT, positron emission tomography (PET), fMRI, X-ray, etc. To embed the intensity-standardized 2D or 3D radiology images into feature representations with learning-based encoders [16,24,34-36,87], conventional radiomics methods [24,34,35], or both [34,35], some reviewed works first applied skull-stripping [38], affine registration [38], foreground extraction [39], or lesion segmentation [20,34,35,38] to define the ROIs. The images were then resized or cropped to a smaller size for feature extraction.…”
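The preprocessing pipeline in the snippet (define an ROI from a segmentation, then resize or crop to a fixed input size before feature extraction) can be sketched as below; `crop_roi` and `resize_nearest` are hypothetical helpers for illustration, not code from the reviewed works.

```python
import numpy as np

def crop_roi(volume, mask):
    """Crop a scan to the bounding box of a segmented ROI mask."""
    idx = np.argwhere(mask)
    lo, hi = idx.min(axis=0), idx.max(axis=0) + 1
    slices = tuple(slice(a, b) for a, b in zip(lo, hi))
    return volume[slices]

def resize_nearest(img, out_shape):
    """Nearest-neighbour resize to a fixed encoder input size."""
    scales = [s / o for s, o in zip(img.shape, out_shape)]
    coords = np.indices(out_shape)
    src = tuple(
        (coords[d] * scales[d]).astype(int).clip(0, img.shape[d] - 1)
        for d in range(img.ndim)
    )
    return img[src]
```

Both helpers work for 2D slices and 3D volumes alike, since they index dimension by dimension.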
The rapid development of diagnostic technologies in healthcare is leading to higher requirements for physicians to handle and integrate the heterogeneous yet complementary data produced during routine practice. For instance, personalized diagnosis and treatment planning for a single cancer patient relies on various image data (e.g., radiological, pathological, and camera images) and non-image data (e.g., clinical and genomic data). However, such decision-making procedures can be subjective, qualitative, and subject to large inter-subject variability. With the recent advances in multi-modal deep learning technologies, an increasingly large number of efforts have been devoted to a key question: how do we extract and aggregate multi-modal information to ultimately provide more objective, quantitative computer-aided clinical decision making? This paper reviews the recent studies on this question. Briefly, this review includes (1) an overview of current multi-modal learning workflows, (2) a summary of multi-modal fusion methods, (3) a discussion of performance, (4) applications in disease diagnosis and prognosis, and (5) challenges and future directions.