“…They achieved the highest UAR of 62.70% for the consonants group of phonemes. In our previous study [21], we achieved 61.93% UAR using the MFCC features extracted from the vowel-like region segments of speech for cold and healthy speech classification.…”
Section: Results (mentioning)
confidence: 99%
“…In state-of-the-art methods, all the frames of CAS are processed for feature extraction. In our previous study [21], we used a VLR of speech which reduces the number of frames by 50.76% for the classification of cold and healthy speech. But UAR achieved using MFCC features extracted from VLR is low compared to the state-of-the-art methods.…”
Section: Results (mentioning)
confidence: 99%
“…Vicente et al. [20] developed a Fisher vector for identifying cold speech using MFCC features and a generative Gaussian mixture model. In our previous study [21], we have utilized only vowel-like region segments of speech for cold and healthy speech classification. Vowel-like regions (VLR) were separated from speech signals by identifying vowel-like region onset points and endpoints using the Hilbert envelope of linear prediction residual signal and zero-frequency filtering methods.…”
This work investigates the significance of the voiced and unvoiced regions for detecting the common cold from the speech signal. In the literature, the entire speech signal is processed to detect the common cold and other diseases. This study uses a short-time energy-based approach to segment the voiced and unvoiced regions of the speech signal. Then, frame-wise mel frequency cepstral coefficient (MFCC) features are extracted from the voiced and unvoiced segments of each speech utterance, and statistics (mean, variance, skewness, and kurtosis) are calculated to get the feature vector for each speech utterance. A support vector machine (SVM) is used to analyze the performance of features extracted from the voiced and unvoiced regions. Results show that the features extracted from voiced segments, unvoiced segments, and complete active speech (CAS) give almost similar results using the MFCC features and SVM classifier. Therefore, rather than processing the CAS, we can process the unvoiced speech segments, which have fewer frames than the CAS and the voiced regions of speech. Processing solely the unvoiced segments can reduce the time and computational complexity of a speech signal-based common cold detection system.
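The segmentation and statistics steps described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the frame length, hop size, and the threshold rule (`ratio` times the maximum frame energy) are hypothetical choices, the low-order FFT magnitudes merely stand in for MFCCs, and the SVM stage is omitted.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames.astype(float) ** 2, axis=1)

def high_energy_mask(energy, ratio=0.1):
    """True for frames at or above ratio * max energy.
    The threshold rule is an assumption; the abstract does not give one."""
    return energy >= ratio * energy.max()

def utterance_statistics(features):
    """Mean, variance, skewness, and kurtosis of each feature column,
    concatenated into one vector. Uses population moments and
    non-excess kurtosis (both assumptions)."""
    mu = features.mean(axis=0)
    centered = features - mu
    var = (centered ** 2).mean(axis=0)
    std = np.sqrt(var) + 1e-12  # guard against division by zero
    skew = (centered ** 3).mean(axis=0) / std ** 3
    kurt = (centered ** 4).mean(axis=0) / std ** 4
    return np.concatenate([mu, var, skew, kurt])

# Synthetic one-second "utterance": a loud tone followed by weak noise.
sr = 16000
t = np.arange(sr // 2) / sr
rng = np.random.default_rng(0)
signal = np.concatenate([np.sin(2 * np.pi * 200 * t),
                         0.01 * rng.standard_normal(sr // 2)])

frames = frame_signal(signal, frame_len=400, hop=160)  # 25 ms frames, 10 ms hop
mask = high_energy_mask(short_time_energy(frames))

# Stand-in frame-wise features (13 low-order FFT magnitudes, NOT real MFCCs).
features = np.abs(np.fft.rfft(frames, axis=1))[:, :13]
high_vec = utterance_statistics(features[mask])    # high-energy segments
low_vec = utterance_statistics(features[~mask])    # low-energy segments
```

Each utterance then yields one fixed-length vector per region (here 4 × 13 = 52 values), which is the kind of input a classifier such as an SVM would consume.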
“…The MFCC is considered to be the most important characteristic of all aspects of speech signal processing, including speech pathology and speech emotion detection. The MFCCs are extracted using the principles underlying human sound perception [14][15][16][17]. The procedures involved in obtaining the MFCC are explained in Fig.…”
Section: Mel Frequency Cepstral Coefficient (mentioning)
The Covid-19 pandemic is one of the most significant global health concerns to have emerged in this decade. Intelligent healthcare technology and techniques based on speech signals and artificial intelligence make faster and more efficient detection of Covid-19 feasible. The main objective of our study is to design a speech signal-based, noninvasive, low-cost, remote diagnosis of Covid-19. In this study, we have developed a system to detect Covid-19 from the speech signal using Mel frequency magnitude coefficients (MFMC) and machine learning techniques. In order to capture higher-order spectral features, MFMC divides the spectrum into a larger number of subbands with narrower bandwidths, which leads to better frequency resolution and less overall noise. As a consequence of this improved frequency resolution and the reduced noise in the extracted MFMC, the higher-order MFMCs are able to identify Covid-19 from speech signals with increased accuracy. Machine learning procedures are often less complicated than deep learning procedures and can commonly be carried out on regular computers, whereas deep learning systems need extensive computing power and data storage. Twelve, twenty-four, thirty, and forty spectral coefficients are obtained using MFMC in our study, and from these coefficients, performance is assessed using machine learning classifiers such as random forests and K-nearest neighbor (KNN); KNN performed better than the other model, with an AUC score of 0.80.
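The claim that more subbands means narrower bandwidths can be illustrated with a small sketch. It assumes the common mel-scale formula (2595 log10(1 + f/700)), triangular filters that each span two adjacent band edges, and an 8 kHz upper frequency limit; none of these details are given in the abstract, and this is not the MFMC extraction procedure itself.

```python
import numpy as np

def hz_to_mel(f):
    """Common mel-scale mapping (an assumed variant)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_band_edges(n_bands, f_min=0.0, f_max=8000.0):
    """Edge frequencies in Hz for n_bands filters equally spaced on the mel scale."""
    mels = np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_bands + 2)
    return mel_to_hz(mels)

def mean_bandwidth(n_bands, f_min=0.0, f_max=8000.0):
    """Average width in Hz of the filters; filter i spans edges[i]..edges[i+2]."""
    edges = mel_band_edges(n_bands, f_min, f_max)
    return float(np.mean(edges[2:] - edges[:-2]))

# Coefficient counts mentioned in the abstract.
bandwidths = {n: mean_bandwidth(n) for n in (12, 24, 30, 40)}
```

Under these assumptions the average filter bandwidth shrinks as the number of subbands grows from 12 to 40, which is the frequency-resolution effect the abstract refers to.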