A novel multimodal fusion framework for early diagnosis and accurate classification of COVID-19 patients using X-ray images and speech signal processing techniques
“…From the obtained results, the authors concluded that there is a considerable improvement in COVID-19 detection. Similar studies can be found in [26,29,44,59,63].…”
Section: Covid-19 Firstly Reported In 2019 In Wuhan (supporting)
confidence: 87%
“…Among the studied methods, cough sounds are the most used modality. They are usually passed through several pre-processing steps, such as noise reduction [29] or cough detection [63]. The second most used modalities are breathing and clinical symptoms.…”
Section: Covid-19 Firstly Reported In 2019 In Wuhan (mentioning)
COVID-19 (Coronavirus Disease 2019) is one of the most challenging healthcare crises of the twenty-first century. The pandemic has had negative impacts on nearly all aspects of life and livelihoods. Despite the recent development of vaccines, such as the Pfizer/BioNTech mRNA, AstraZeneca, and Moderna vaccines, the emergence of new virus variants and their fast infection rates still poses a significant threat to public health. In this context, early detection of the disease is an important factor in reducing its effects and quickly controlling the spread of the pandemic. Nevertheless, many countries still rely on methods that are either expensive and time-consuming (e.g., reverse-transcription polymerase chain reaction) or uncomfortable and difficult for self-testing (e.g., nasal rapid antigen tests). Recently, deep learning methods have been proposed as a potential solution for COVID-19 analysis. However, previous works usually focus on a single symptom, which can omit information critical for disease diagnosis. Therefore, in this study, we propose a multi-modal method to detect COVID-19 using cough sounds and self-reported symptoms. The proposed method consists of five neural networks handling different input features: a CNN-biLSTM for MFCC features, EfficientNetV2 for Mel spectrogram images, an MLP for self-reported symptoms, C-YAMNet for cough detection, and RNNoise for noise canceling. Experimental results demonstrated that our method outperformed other state-of-the-art methods, with an AUC, accuracy, and F1-score of 98.6%, 96.9%, and 96.9%, respectively, on the testing set.
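As a hedged illustration of the late-fusion idea in the abstract above, the sketch below combines a CNN-biLSTM branch over MFCC sequences with an MLP branch over self-reported symptoms, fused by concatenation before a joint classifier head. All layer sizes, the fusion-by-concatenation choice, and the two-class head are illustrative assumptions, not the authors' exact architecture; the EfficientNetV2, C-YAMNet, and RNNoise components are omitted.

```python
import torch
import torch.nn as nn

class CoughFusionNet(nn.Module):
    """Hypothetical late-fusion sketch: a CNN-biLSTM branch for MFCC
    sequences plus an MLP branch for self-reported symptoms, fused by
    concatenation before a joint classification head."""
    def __init__(self, n_mfcc=13, n_symptoms=10, hidden=64):
        super().__init__()
        # 1-D convolution over the time axis of the MFCC sequence
        self.conv = nn.Sequential(
            nn.Conv1d(n_mfcc, 32, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.lstm = nn.LSTM(32, hidden, batch_first=True, bidirectional=True)
        self.symptom_mlp = nn.Sequential(nn.Linear(n_symptoms, 32), nn.ReLU())
        # 2*hidden from the bidirectional LSTM + 32 from the symptom branch
        self.head = nn.Linear(2 * hidden + 32, 2)  # COVID-positive / negative

    def forward(self, mfcc, symptoms):
        # mfcc: (batch, time, n_mfcc) -> Conv1d expects (batch, channels, time)
        x = self.conv(mfcc.transpose(1, 2)).transpose(1, 2)
        _, (h, _) = self.lstm(x)               # h: (2, batch, hidden)
        audio_feat = torch.cat([h[0], h[1]], dim=1)
        fused = torch.cat([audio_feat, self.symptom_mlp(symptoms)], dim=1)
        return self.head(fused)

model = CoughFusionNet()
logits = model(torch.randn(4, 100, 13), torch.randn(4, 10))
print(logits.shape)  # torch.Size([4, 2])
```

Late fusion of this kind keeps each modality's encoder independent, so a missing modality (e.g., no symptom report) can be handled by masking one branch rather than retraining the whole network.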
“…In the case of diabetics, for example, it is possible to determine the individual risk with the help of models (45). Furthermore, modern technologies can be useful in the early diagnosis and accurate classification of COVID-19 patients (46) and in combating COVID-19 (47, 48).…”
Background: During the COVID-19 pandemic, protective measures have been prescribed to prevent or slow down the spread of the SARS-CoV-2 virus and protect the population. Individuals follow these measures to varying degrees. We aimed to identify factors influencing the extent to which protective measures are adhered to.
Methods: A cross-sectional survey (telephone interviews) was undertaken between April and June 2021 to identify factors influencing the degree to which individuals adhere to protective measures. A representative sample of 1,003 people (age >16 years) in two Austrian states (Carinthia, Vorarlberg) was interviewed. The questionnaire was based on the Health Belief Model but also included potential response-modifying factors. Predictors of adherent behavior were identified using multiple regression analysis. All predictors were standardized so that regression coefficients (β) could be compared.
Results: Overall median adherence was 0.75 (IQR: 0.5–1.0). Based on a regression model, the following variables were identified as significant in raising adherence: higher age (β = 0.43, 95%CI: 0.33–0.54), social standards of acceptable behavior (β = 0.33, 95%CI: 0.27–0.40), subjective/individual assessment of an increased personal health risk (β = 0.12, 95%CI: 0.05–0.18), self-efficacy (β = 0.06, 95%CI: 0.02–0.10), female gender (β = 0.05, 95%CI: 0.01–0.08), and low corona fatigue (behavioral fatigue: β = −0.11, 95%CI: −0.18 to −0.03). The model showed that such aspects as personal trust in institutions, perceived difficulties in adopting health-promoting measures, and individual assessments of the risk of infection had no significant influence.
Conclusions: This study reveals that several factors significantly influence adherence to measures aimed at controlling the COVID-19 pandemic. To enhance adherence, the government, media, and other relevant stakeholders should take the findings into consideration when formulating policy.
By developing social standards and promoting self-efficacy, individuals can influence the behavior of others and contribute toward coping with the pandemic.
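The study above standardizes all predictors so their regression coefficients (β) are directly comparable in magnitude. A minimal sketch of that procedure, using synthetic data and illustrative predictor names (not the survey's actual variables):

```python
import numpy as np

# Synthetic data: two predictors with different natural scales.
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(16, 90, n)              # years, wide scale
norms = rng.normal(0.0, 1.0, n)           # e.g., a social-norms score
adherence = (0.4 * (age - age.mean()) / age.std()
             + 0.3 * norms
             + rng.normal(0.0, 0.5, n))

# Standardize (z-score) predictors and outcome so the fitted
# coefficients are comparable standardized betas.
X = np.column_stack([age, norms])
Xz = (X - X.mean(axis=0)) / X.std(axis=0)
yz = (adherence - adherence.mean()) / adherence.std()

betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)  # standardized coefficients
print(betas.round(2))
```

Without standardization, a coefficient on age (in years) and one on a unit-scaled score could not be ranked against each other, which is exactly why the survey reports β values.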
“…From machine learning with chemical property descriptors to deep learning with primitives, models are becoming more accurate and generalized than ever before. Recently, multimodal deep learning has begun to flourish because of the advancement of diversified information acquisition algorithms. Since the representation of a target can be extracted from various expression forms, such as images, semantic sequences, and spatial networks, multimodal models making full use of multiform inputs exhibit superiority in disease diagnosis, attack detection, semantic interpretability analysis, etc. All these types of machine learning models have been used for predicting NCIs as well. However, the capabilities of the NCI prediction models (in particular, their robustness, generalization, and interpretability) are still far from adequate.…”
Section: Introduction (mentioning)
confidence: 99%
Interpretability is an important issue for end-to-end learning models. Motivated by computer vision algorithms, an interpretable multimodal model for noncovalent interaction (NCI) correction, TFRegNCI, is proposed for NCI prediction. TFRegNCI is based on RegNet feature extraction and a transformer-encoder fusion strategy. RegNet is a network design paradigm that mainly focuses on local features; the Vision Transformer is also leveraged for feature extraction, because it can capture global features better than RegNet while lowering the computational cost. Using a transformer encoder as the fusion strategy rather than a multilayer perceptron enhances model performance, owing to its emphasis on important features with fewer parameters. As a result, the proposed TFRegNCI achieved highly accurate predictions (mean absolute error of ∼0.1 kcal/mol) compared with the coupled cluster singles and doubles with perturbative triples (CCSD(T)) benchmark. To further improve model efficiency, TFRegNCI uses two-dimensional (2D) inputs transformed from three-dimensional (3D) electron density cubes, which saves 30% of the computation time while preserving model accuracy. To improve model interpretability, a visualization module, Gradient-weighted Regression Activation Mapping (Grad-RAM), has been embedded. Grad-RAM adapts the classification algorithm Gradient-weighted Class Activation Mapping to perform feature visualization for the regression task; with it, the locations of the features used by the deep learning models can be displayed. The feature map visualizations suggest that the 2D model performs similarly to the 3D model because both extract equally effective features from the electron density. Moreover, the valid feature region on the location map produced by the 3D model is consistent with the NCIPLOT NCI isosurface, confirming that the model does extract significant features related to the NCI interaction. Interpretable analyses are carried out through molecular orbital contributions on effective features. Thereby, the proposed model is likely to be a promising tool for revealing essential information on NCIs at the electronic-structure level of theory.
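A minimal sketch of the Grad-RAM idea described above, assuming a toy CNN regressor in place of the RegNet/ViT backbones: where Grad-CAM weights feature maps by the gradient of a class score, the regression variant weights them by the gradient of the scalar output. The network and all sizes here are illustrative, not the TFRegNCI implementation.

```python
import torch
import torch.nn as nn

class TinyRegressor(nn.Module):
    """Stand-in CNN regressor; TFRegNCI itself uses RegNet/ViT features."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(8, 1)  # scalar output, e.g., an NCI energy

    def forward(self, x):
        fmap = self.features(x)
        return self.fc(self.pool(fmap).flatten(1)), fmap

def grad_ram(model, x):
    """Grad-CAM adapted to regression: channel weights come from the
    gradient of the scalar prediction instead of a class score."""
    y, fmap = model(x)
    fmap.retain_grad()                 # keep gradients of the feature maps
    y.sum().backward()
    w = fmap.grad.mean(dim=(2, 3), keepdim=True)   # per-channel weights
    cam = torch.relu((w * fmap).sum(dim=1))        # (batch, H, W) heatmap
    return cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-8)

model = TinyRegressor()
cam = grad_ram(model, torch.randn(2, 1, 16, 16))
print(cam.shape)  # torch.Size([2, 16, 16])
```

The resulting heatmap highlights spatial regions whose activations most influence the predicted value, which is how the paper relates model attention to the NCIPLOT isosurface.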