Transfer Learning and Data Augmentation Techniques to the COVID-19 Identification Tasks in ComParE 2021

Casanova, Edresson; Cândido, Arnaldo; Fernandes, Roshan; Finger, Marcelo; Gris, Lucas Rafael Stefanel; Ponti, Moacir Antonelli; Silva, D. P. P. Da

doi:10.21437/interspeech.2021-1798

Cited by 15 publications

(18 citation statements)

References 0 publications


“…Coppock et al (2022) presented a summary of the INTERSPEECH 2021 ComParE. A cough and speech UAR of 75.9% (Casanova et al, 2021) and 72.1% (Schuller et al, 2021) was achieved, respectively. Although we reached a slightly lower UAR for cough (UAR = 75.54%), it is worth noting that we did not use any data augmentation and deep learning methods.…”

Section: Discussion (mentioning)

confidence: 99%

“…Unlike for cough, we did not achieve a good performance for speech tasks compared to the baseline shown by Schuller et al (2021) (UAR = 72.1%). Casanova et al (2021), when exploring the same approach utilized for cough, had achieved a UAR of 70.3%. Klumpp et al (2021) explored Mel spectrograms and various classifiers, such as LSTM, CNN, SVM, and LR, with data augmentation, and a UAR of 64.2% was reached.…”

Section: Discussion (mentioning)

confidence: 99%

“…In the same direction, various research studies employing respiratory sounds were conducted for COVID-19 screening (Brown et al, 2020; Casanova et al, 2021; Schuller et al, 2021; Verde et al, 2021; Pahar et al, 2022; Pleva et al, 2022; Sharma et al, 2022; Villa-Parra et al, 2022). Cough and breathing sounds from COVID-19, asthmatic, and healthy individuals were utilized by Brown et al (2020).…”

Section: Introduction (mentioning)

confidence: 99%


COVID-19 respiratory sound analysis and classification using audio textures

Silva, Valadão, Lampier, et al. 2022

Front. Signal Process.

Since the COVID-19 outbreak, a major scientific effort has been made by researchers and companies worldwide to develop a digital diagnostic tool to screen this disease through some biomedical signals, such as cough, and speech. Joint time–frequency feature extraction techniques and machine learning (ML)-based models have been widely explored in respiratory diseases such as influenza, pertussis, and COVID-19 to find biomarkers from human respiratory system-generated acoustic sounds. In recent years, a variety of techniques for discriminating textures and computationally efficient local texture descriptors have been introduced, such as local binary patterns and local ternary patterns, among others. In this work, we propose an audio texture analysis of sounds emitted by subjects in suspicion of COVID-19 infection using time–frequency spectrograms. This approach of the feature extraction method has not been widely used for biomedical sounds, particularly for COVID-19 or respiratory diseases. We hypothesize that this textural sound analysis based on local binary patterns and local ternary patterns enables us to obtain a better classification model by discriminating both people with COVID-19 and healthy subjects. Cough, speech, and breath sounds from the INTERSPEECH 2021 ComParE and Cambridge KDD databases have been processed and analyzed to evaluate our proposed feature extraction method with ML techniques in order to distinguish between positive or negative for COVID-19 sounds. The results have been evaluated in terms of an unweighted average recall (UAR). The results show that the proposed method has performed well for cough, speech, and breath sound classification, with a UAR up to 100.00%, 60.67%, and 95.00%, respectively, to infer COVID-19 infection, which serves as an effective tool to perform a preliminary screening of COVID-19.
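The results above are reported as unweighted average recall (UAR), which is the mean of the per-class recalls, so the minority class counts as much as the majority class. The following is a minimal sketch of that metric, not code from any of the cited papers; the function name and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels: 2 positive and 6 negative samples.
y_true = ["pos", "pos", "neg", "neg", "neg", "neg", "neg", "neg"]
y_pred = ["pos", "neg", "neg", "neg", "neg", "neg", "neg", "neg"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case one of the two positive samples is missed, so the positive-class recall is 0.5 while the negative-class recall is 1.0. The UAR is therefore lower than the plain accuracy, which is why UAR is a common choice for imbalanced screening data.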


“…in [13,14,15]. In [13], the authors proposed an ensemble of CNN classifiers from different acoustic features.…”

Section: Related Work (mentioning)

confidence: 99%

Cross-dataset COVID-19 Transfer Learning with Cough Detection, Cough Segmentation, and Data Augmentation

Atmaja, Zanjabila, Suyanto. 2022

Preprint

“…Ko et al [4] evaluated a low-cost data augmentation technique by speed perturbation and found it improved the word error rate (WER) over other methods. Casanova et al [5] employed both transfer learning and data augmentation for improving COVID-19 detection from cough sounds. Similarly, data augmentation could improve SER performances in specific ways.…”

Section: Introduction (mentioning)

confidence: 99%

Effects of Data Augmentations on Speech Emotion Recognition

Atmaja. 2022

Preprint

Data augmentation techniques recently gained more adoption in speech processing, including speech emotion recognition. Although more data tends to be more effective, there may be a trade-off in which more data will not provide a better model. This paper reports experiments on investigating the effects of data augmentation in speech emotion recognition. The investigation aims at finding the most useful type of data augmentation and the number of data augmentations for speech emotion recognition. The experiments are conducted on the Japanese Twitter-based emotional speech corpus. The results show that for speaker-independent data, two data augmentations with glottal source extraction and silence removal exhibited the best performance among others, even with more data augmentation techniques. For the text-independent data (including speaker and text-independent), more data augmentations tend to improve speech emotion recognition performances. The results highlight the trade-off between the number of data augmentation and the performance of speech emotion recognition showing the necessity to choose a proper data augmentation technique for a specific application.

