A novel multimodal fusion framework for early diagnosis and accurate classification of COVID-19 patients using X-ray images and speech signal processing techniques
“…From the obtained results, the authors concluded that there is a considerable improvement in COVID-19 detection. Similar studies can be found in [26,29,44,59,63].…”
Section: Covid-19 Firstly Reported In 2019 In Wuhan (supporting)
confidence: 87%
“…Among the studied methods, cough sounds are the most used modality. They are usually passed through several pre-processing steps, such as noise reduction [29] or cough detection [63]. The second most used modalities are breathing and clinical symptoms.…”
Section: Covid-19 Firstly Reported In 2019 In Wuhan (mentioning)
COVID-19 (Coronavirus Disease 2019) is one of the most challenging healthcare crises of the twenty-first century. The pandemic has had negative impacts on nearly all aspects of life and livelihoods. Despite the recent development of vaccines, such as the Pfizer/BioNTech mRNA, AstraZeneca, and Moderna vaccines, the emergence of new virus variants and their fast infection rates still poses a significant threat to public health. In this context, early detection of the disease is an important factor in reducing its effects and quickly controlling the spread of the pandemic. Nevertheless, many countries still rely on methods that are either expensive and time-consuming (e.g., reverse-transcription polymerase chain reaction) or uncomfortable and difficult for self-testing (e.g., nasal rapid antigen tests). Recently, deep learning methods have been proposed as a potential solution for COVID-19 analysis. However, previous works usually focus on a single symptom, which can omit information critical for disease diagnosis. Therefore, in this study, we propose a multi-modal method to detect COVID-19 using cough sounds and self-reported symptoms. The proposed method consists of five neural networks handling different input features: a CNN-biLSTM for MFCC features, EfficientNetV2 for Mel spectrogram images, an MLP for self-reported symptoms, C-YAMNet for cough detection, and RNNoise for noise canceling. Experimental results demonstrated that our method outperformed other state-of-the-art methods, with an AUC, accuracy, and F1-score of 98.6%, 96.9%, and 96.9%, respectively, on the testing set.
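As a hedged illustration of the late-fusion idea in the abstract above, the sketch below combines a CNN-biLSTM branch over MFCC sequences with an MLP branch over self-reported symptoms, fused by concatenation before a joint classifier head. All layer sizes, the fusion-by-concatenation choice, and the two-class head are illustrative assumptions, not the authors' exact architecture; the EfficientNetV2, C-YAMNet, and RNNoise components are omitted.

```python
import torch
import torch.nn as nn

class CoughFusionNet(nn.Module):
    """Hypothetical late-fusion sketch: a CNN-biLSTM branch for MFCC
    sequences plus an MLP branch for self-reported symptoms, fused by
    concatenation before a joint classification head."""
    def __init__(self, n_mfcc=13, n_symptoms=10, hidden=64):
        super().__init__()
        # 1-D convolution over the time axis of the MFCC sequence
        self.conv = nn.Sequential(
            nn.Conv1d(n_mfcc, 32, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.lstm = nn.LSTM(32, hidden, batch_first=True, bidirectional=True)
        self.symptom_mlp = nn.Sequential(nn.Linear(n_symptoms, 32), nn.ReLU())
        # 2*hidden from the bidirectional LSTM + 32 from the symptom branch
        self.head = nn.Linear(2 * hidden + 32, 2)  # COVID-positive / negative

    def forward(self, mfcc, symptoms):
        # mfcc: (batch, time, n_mfcc) -> Conv1d expects (batch, channels, time)
        x = self.conv(mfcc.transpose(1, 2)).transpose(1, 2)
        _, (h, _) = self.lstm(x)               # h: (2, batch, hidden)
        audio_feat = torch.cat([h[0], h[1]], dim=1)
        fused = torch.cat([audio_feat, self.symptom_mlp(symptoms)], dim=1)
        return self.head(fused)

model = CoughFusionNet()
logits = model(torch.randn(4, 100, 13), torch.randn(4, 10))
print(logits.shape)  # torch.Size([4, 2])
```

Late fusion of this kind keeps each modality's encoder independent, so a missing modality (e.g., no symptom report) can be handled by masking one branch rather than retraining the whole network.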
“…In the case of diabetics, for example, it is possible to determine the individual risk with the help of models (45). Furthermore, modern technologies can be useful in the early diagnosis and accurate classification of COVID-19 patients (46) and in combating COVID-19 (47, 48).…”
Background: During the COVID-19 pandemic, protective measures have been prescribed to prevent or slow down the spread of the SARS-CoV-2 virus and protect the population. Individuals follow these measures to varying degrees. We aimed to identify factors influencing the extent to which protective measures are adhered to.
Methods: A cross-sectional survey (telephone interviews) was undertaken between April and June 2021 to identify factors influencing the degree to which individuals adhere to protective measures. A representative sample of 1,003 people (age >16 years) in two Austrian states (Carinthia, Vorarlberg) was interviewed. The questionnaire was based on the Health Belief Model but also included potential response-modifying factors. Predictors of adherent behavior were identified using multiple regression analysis. All predictors were standardized so that regression coefficients (β) could be compared.
Results: Overall median adherence was 0.75 (IQR: 0.5–1.0). Based on a regression model, the following variables were identified as significant in raising adherence: higher age (β = 0.43, 95%CI: 0.33–0.54), social standards of acceptable behavior (β = 0.33, 95%CI: 0.27–0.40), subjective/individual assessment of an increased personal health risk (β = 0.12, 95%CI: 0.05–0.18), self-efficacy (β = 0.06, 95%CI: 0.02–0.10), female gender (β = 0.05, 95%CI: 0.01–0.08), and low corona fatigue (behavioral fatigue: β = −0.11, 95%CI: −0.18 to −0.03). The model showed that such aspects as personal trust in institutions, perceived difficulties in adopting health-promoting measures, and individual assessments of the risk of infection had no significant influence.
Conclusions: This study reveals that several factors significantly influence adherence to measures aimed at controlling the COVID-19 pandemic. To enhance adherence, the government, media, and other relevant stakeholders should take the findings into consideration when formulating policy.
By developing social standards and promoting self-efficacy, individuals can influence the behavior of others and contribute toward coping with the pandemic.
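The study above standardizes all predictors so their regression coefficients (β) are directly comparable in magnitude. A minimal sketch of that procedure, using synthetic data and illustrative predictor names (not the survey's actual variables):

```python
import numpy as np

# Synthetic data: two predictors with different natural scales.
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(16, 90, n)              # years, wide scale
norms = rng.normal(0.0, 1.0, n)           # e.g., a social-norms score
adherence = (0.4 * (age - age.mean()) / age.std()
             + 0.3 * norms
             + rng.normal(0.0, 0.5, n))

# Standardize (z-score) predictors and outcome so the fitted
# coefficients are comparable standardized betas.
X = np.column_stack([age, norms])
Xz = (X - X.mean(axis=0)) / X.std(axis=0)
yz = (adherence - adherence.mean()) / adherence.std()

betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)  # standardized coefficients
print(betas.round(2))
```

Without standardization, a coefficient on age (in years) and one on a unit-scaled score could not be ranked against each other, which is exactly why the survey reports β values.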
“…From machine learning with chemical property descriptors to deep learning with primitives, models are becoming more accurate and generalized than ever before. Recently, multimodal deep learning has begun to flourish because of the advancement of diversified information acquisition algorithms. Since the representation of a target can be extracted from various expression forms, such as images, semantic sequences, and spatial networks, multimodal models making full use of multiform inputs exhibit superiority in disease diagnosis, attack detection, semantic interpretability analysis, etc. All these types of machine learning models have been used for predicting NCIs as well. However, the capabilities of the NCI prediction models (in particular, their robustness, generalization, and interpretability) are still far from adequate.…”
Section: Introduction (mentioning)
confidence: 99%
Interpretability is an important issue for end-to-end learning models. Motivated by computer vision algorithms, an interpretable multimodal model for noncovalent interaction (NCI) correction, TFRegNCI, is proposed for NCI prediction. TFRegNCI is based on RegNet feature extraction and a transformer-encoder fusion strategy. RegNet is a network design paradigm that mainly focuses on local features; the Vision Transformer is also leveraged for feature extraction, because it can capture global features better than RegNet while lowering the computational cost. Using a transformer encoder as the fusion strategy rather than a multilayer perceptron enhances model performance, owing to its emphasis on important features with fewer parameters. As a result, the proposed TFRegNCI achieved highly accurate predictions (mean absolute error of ∼0.1 kcal/mol) compared with the coupled cluster singles and doubles with perturbative triples (CCSD(T)) benchmark. To further improve model efficiency, TFRegNCI uses two-dimensional (2D) inputs transformed from three-dimensional (3D) electron density cubes, which saves 30% of the computation time while preserving model accuracy. To improve model interpretability, a visualization module, Gradient-weighted Regression Activation Mapping (Grad-RAM), has been embedded. Grad-RAM adapts the classification algorithm Gradient-weighted Class Activation Mapping to perform feature visualization for the regression task; with it, the locations of the features used by the deep learning models can be displayed. The feature map visualizations suggest that the 2D model performs similarly to the 3D model because both extract equally effective features from the electron density. Moreover, the valid feature region on the location map produced by the 3D model is consistent with the NCIPLOT NCI isosurface, confirming that the model does extract significant features related to the NCI interaction. Interpretable analyses are carried out through molecular orbital contributions on effective features. Thereby, the proposed model is likely to be a promising tool for revealing essential information on NCIs at the electronic-structure level of theory.
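A minimal sketch of the Grad-RAM idea described above, assuming a toy CNN regressor in place of the RegNet/ViT backbones: where Grad-CAM weights feature maps by the gradient of a class score, the regression variant weights them by the gradient of the scalar output. The network and all sizes here are illustrative, not the TFRegNCI implementation.

```python
import torch
import torch.nn as nn

class TinyRegressor(nn.Module):
    """Stand-in CNN regressor; TFRegNCI itself uses RegNet/ViT features."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(8, 1)  # scalar output, e.g., an NCI energy

    def forward(self, x):
        fmap = self.features(x)
        return self.fc(self.pool(fmap).flatten(1)), fmap

def grad_ram(model, x):
    """Grad-CAM adapted to regression: channel weights come from the
    gradient of the scalar prediction instead of a class score."""
    y, fmap = model(x)
    fmap.retain_grad()                 # keep gradients of the feature maps
    y.sum().backward()
    w = fmap.grad.mean(dim=(2, 3), keepdim=True)   # per-channel weights
    cam = torch.relu((w * fmap).sum(dim=1))        # (batch, H, W) heatmap
    return cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-8)

model = TinyRegressor()
cam = grad_ram(model, torch.randn(2, 1, 16, 16))
print(cam.shape)  # torch.Size([2, 16, 16])
```

The resulting heatmap highlights spatial regions whose activations most influence the predicted value, which is how the paper relates model attention to the NCIPLOT isosurface.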