Classification of Cold and Non-Cold Speech Using Vowel-Like Region Segments

Warule, Pankaj; Mishra, Siba Prasad; Deb, Suman

doi:10.1109/spcom55316.2022.9840775

Cited by 13 publications

(4 citation statements)

References 22 publications

“…They achieved the highest UAR of 62.70% for the consonants group of phonemes. In our previous study [21], we achieved 61.93% UAR using the MFCC features extracted from the vowel-like region segments of speech for cold and healthy speech classification.…”

Section: Results (mentioning)

confidence: 99%
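The UAR quoted in the statement above is the unweighted average recall: the mean of the per-class recalls, so the cold and healthy classes count equally even when one is much larger. A minimal sketch, assuming hypothetical labels ("H" for healthy, "C" for cold) that are not taken from the cited study:

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class has equal weight."""
    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy data: 8 healthy and 2 cold utterances.
y_true = ["H"] * 8 + ["C"] * 2
y_pred = ["H"] * 8 + ["H", "C"]  # one cold utterance is missed

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.2f}, accuracy = {accuracy:.2f}")
```

On this toy data the plain accuracy is higher than the UAR, because the missed cold utterance halves the recall of the minority class.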

“…In state-of-the-art methods, all the frames of CAS are processed for feature extraction. In our previous study [21], we used a VLR of speech which reduces the number of frames by 50.76% for the classification of cold and healthy speech. But UAR achieved using MFCC features extracted from VLR is low compared to the state-of-the-art methods.…”

Section: Results (mentioning)

confidence: 99%

“…Vicente et al. [20] developed a Fisher vector for identifying cold speech using MFCC features and a generative Gaussian mixture model. In our previous study [21], we have utilized only vowel-like region segments of speech for cold and healthy speech classification. Vowel-like regions (VLR) were separated from speech signals by identifying vowel-like region onset points and endpoints using the Hilbert envelope of linear prediction residual signal and zero-frequency filtering methods.…”

mentioning

confidence: 99%

Significance of voiced and unvoiced speech segments for the detection of common cold

Warule, Mishra, Deb (2022). SIViP. Self-citation.

This work investigates the significance of the voiced and unvoiced regions for detecting the common cold from the speech signal. In the literature, the entire speech signal is processed to detect the common cold and other diseases. This study uses a short-time energy-based approach to segment the voiced and unvoiced regions of the speech signal. Then, frame-wise mel frequency cepstral coefficient (MFCC) features are extracted from the voiced and unvoiced segments of each speech utterance, and statistics (mean, variance, skewness, and kurtosis) are calculated to get the feature vector for each speech utterance. A support vector machine (SVM) is used to analyze the performance of features extracted from the voiced and unvoiced regions. Results show that the features extracted from voiced segments, unvoiced segments, and complete active speech (CAS) give almost similar results using the MFCC features and SVM classifier. Therefore, rather than processing the CAS, we can process the unvoiced speech segments, which have fewer frames than the CAS and the voiced regions of speech. Processing only the unvoiced segments can reduce the time and computational complexity of a speech signal-based common cold detection system.
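The pipeline in this abstract (short-time energy segmentation, then per-utterance statistics of frame-wise features) can be sketched roughly as follows. This is an illustration only: the frame length, hop, the threshold rule (10% of the maximum frame energy) and the synthetic signal are assumptions, not values from the paper, and the MFCC extraction and SVM stages are omitted.

```python
import math


def short_time_energy(signal, frame_len, hop):
    """Energy (sum of squared samples) of each full frame."""
    return [
        sum(x * x for x in signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def split_by_energy(energies, threshold):
    """Indices of high-energy frames and of low-energy frames."""
    high = [i for i, e in enumerate(energies) if e >= threshold]
    low = [i for i, e in enumerate(energies) if e < threshold]
    return high, low


def four_statistics(values):
    """Mean, variance, skewness, kurtosis (population form) of one feature track."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return mean, 0.0, 0.0, 0.0
    std = math.sqrt(var)
    skew = sum(((v - mean) / std) ** 3 for v in values) / n
    kurt = sum(((v - mean) / std) ** 4 for v in values) / n
    return mean, var, skew, kurt


# Synthetic signal (assumption): a loud tone followed by a very quiet one.
loud = [math.sin(2 * math.pi * i / 16) for i in range(160)]
quiet = [0.01 * s for s in loud]
energies = short_time_energy(loud + quiet, frame_len=80, hop=80)

# Assumed rule: frames above 10% of the peak energy are treated as high-energy.
high_frames, low_frames = split_by_energy(energies, 0.1 * max(energies))
print(high_frames, low_frames)

# In the described method these statistics would be taken over each MFCC
# coefficient across the selected frames; a plain list stands in here.
print(four_statistics([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Concatenating the four statistics of every coefficient would give one fixed-length vector per utterance, which is the kind of input an SVM expects.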

“…The MFCC is considered to be the most important characteristic of all aspects of speech signal processing, including speech pathology and speech emotion detection. The MFCCs are extracted using the principles underlying human sound perception [14][15][16][17]. The procedures involved in obtaining the MFCC are explained in Fig.…”

Section: Mel Frequency Cepstral Coefficient (mentioning)

confidence: 99%

Machine learning approach for detecting Covid-19 from speech signal using Mel frequency magnitude coefficient

Nayak, Darji, Shah (2023). SIViP.

The Covid-19 pandemic is one of the most significant global health concerns to have emerged in this decade. Intelligent healthcare technology and techniques based on speech signals and artificial intelligence make faster and more efficient detection of Covid-19 feasible. The main objective of our study is to design a speech signal-based, noninvasive, low-cost, remote diagnosis of Covid-19. In this study, we have developed a system to detect Covid-19 from the speech signal using Mel frequency magnitude coefficients (MFMC) and machine learning techniques. In order to capture higher-order spectral features, the spectrum is divided into a larger number of subbands with narrower bandwidths as MFMC, which leads to better frequency resolution and less overall noise. As a consequence of the improved frequency resolution and the reduced noise included in the extraction of MFMC, the higher-order MFMCs are able to identify Covid-19 from speech signals with increased accuracy. Machine learning procedures are often less complicated than deep learning ones and can commonly be carried out on regular computers, whereas deep learning systems need extensive computing power and data storage. Twelve, twenty-four, thirty, and forty spectral coefficients are obtained using MFMC in our study, and from these coefficients, performance is assessed using machine learning classifiers such as random forests and K-nearest neighbor (KNN); KNN performed better than the other model, with an AUC score of 0.80.
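The claim that more subbands means narrower bandwidths can be checked numerically. The sketch below is not the MFMC extraction itself; it only computes the edges of mel-spaced filters for the four coefficient counts named in the abstract. The conventional mel formula 2595 log10(1 + f/700), triangular filters spanning two neighbouring edges, and the 0 to 8000 Hz range are all assumptions made for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Conventional mel-scale mapping (assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_band_edges(num_bands, f_min_hz, f_max_hz):
    """Edge frequencies (Hz) of num_bands filters equally spaced in mel."""
    m_lo, m_hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (m_hi - m_lo) / (num_bands + 1)
    return [mel_to_hz(m_lo + i * step) for i in range(num_bands + 2)]


def widest_band_hz(num_bands, f_min_hz=0.0, f_max_hz=8000.0):
    """Width of the widest filter; filter i spans edges i to i + 2."""
    edges = mel_band_edges(num_bands, f_min_hz, f_max_hz)
    return max(edges[i + 2] - edges[i] for i in range(num_bands))


# Coefficient counts taken from the abstract; frequency range is hypothetical.
band_counts = (12, 24, 30, 40)
widths = [widest_band_hz(n) for n in band_counts]
for n, w in zip(band_counts, widths):
    print(f"{n:2d} bands: widest filter spans {w:7.1f} Hz")
```

Under these assumptions the widest filter shrinks as the band count grows, which is the frequency-resolution effect the abstract describes.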
