On the Performance Degradation of Speaker Recognition System due to Variation in Speech Characteristics Caused by Physiological Changes

Usman, Mohammed

doi:10.12785/ijcds/060303

Cited by 8 publications

(7 citation statements)

References 20 publications


“…Lockdown process is valuable because it gives excellent time and scope of testing for a maximum number of patients. Reverse transcription polymerase chain reaction (RT-PCR) is one of the best methods for analyzing and detecting COVID 19 within 48 h (Ghosh et al., 2015, 2016a, 2016b; Usman, 2017).…”

Section: Introduction (mentioning)

confidence: 99%

RETRACTED ARTICLE: An adaptive speech signal processing for COVID-19 detection using deep learning approach

Al‐Dhlan

2021

Int J Speech Technol


“…The time series are represented by 13 Mel Frequency Cepstral Coefficients (MFCC) (Davis and Mermelstein, 1980), sampled at 11025 Hz. MFCC constitute a widely used feature for tasks such as speech recognition (Usman, 2017). They mimic the transformation of the audio signal by the inner ear and are a model of how sound stimuli are "perceived" by the early neuronal auditory system.…”

Section: Spoken Arabic Digits Dataset (mentioning)

confidence: 99%

Covariance-based information processing in reservoir computing systems

Lawrie

Moreno-Bote

Gilson

2021

Preprint


In biological neuronal networks, information representation and processing are achieved through plasticity learning rules that have been empirically characterized as sensitive to second and higher-order statistics in spike trains. However, most models in both computational neuroscience and machine learning aim to convert diverse statistical properties in inputs into first-order statistics in outputs, like in modern deep learning tools. In the context of classification, such schemes have merit for inputs like static images, but they are not well suited to capture the temporal structure in time series. In contrast, the recently developed covariance perceptron uses second-order statistics by mapping input covariances to output covariances in a consistent fashion. Here, we explore the applicability of covariance-based perceptron readouts in reservoir computing networks to classify synthetic multivariate time series structured at different statistical orders (first and second). We show that the second-order framework outperforms or matches the classical mean paradigm in terms of accuracy. We expose nontrivial relationships between input, reservoir and output dynamics, which suggest an important role for recurrent connectivity in transforming information representations in biologically inspired architectures. Finally, we solve a real automatic speech recognition task for the classification of spoken digits to further demonstrate the potential of covariance-based decoding.


“…MFCC is used because it can represent spectral details of speech signals along with temporal variations in the spectral details. A detailed description of MFCC computation procedure is given in [11], [12]. The MFCC data generated in this work uses a Hamming window of length 256 samples which corresponds to speech frame duration of 16 ms, with 50% overlap between adjacent windows.…”

Section: Speech Features Data (mentioning)

confidence: 99%

Dataset of Raw and Pre-processed Speech Signals, Mel Frequency Cepstral Coefficients of Speech and Heart Rate Measurements

Usman

Ahmad

Wajid

2019

2019 5th International Conference on Signal Processing, Computing and Control (ISPCC)

Self Cite


Heart rate is an important vital sign used in the diagnosis of many medical conditions. Conventionally, heart rate is measured using a medical device such as a pulse oximeter. Physiological parameters such as heart rate bear a correlation to the speech characteristics of an individual. Hence, it may be possible to measure heart rate from speech signals using machine learning and deep learning, which would also allow non-invasive, non-contact and remote monitoring of patients. However, to design such a scheme and verify its accuracy, it is necessary to collect speech recordings together with heart rates measured simultaneously using a medical device. This article provides a dataset, as well as the procedure used to create it, which could facilitate research into techniques for estimating heart rate accurately from the speech signal. Keywords: heart rate measurement, speech as a biomedical signal, heart rate from speech, speech biomedical dataset, speech-heart rate dataset.
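The framing parameters quoted in the citation statement for this entry (a 256-sample Hamming window, 16 ms frame duration, 50% overlap) can be checked with a little arithmetic. The sketch below is illustrative only and is not the authors' code: the 16 kHz sampling rate and 128-sample hop are values implied by those numbers rather than stated in the excerpt, and the function name `frame_signal` and the synthetic test signal are hypothetical.

```python
import numpy as np

# Values taken from the quoted citation statement.
FRAME_LEN = 256        # samples per frame
FRAME_DUR_S = 0.016    # frame duration: 16 ms
OVERLAP = 0.5          # 50% overlap between adjacent windows

# Implied values (assumptions, not stated in the excerpt).
SAMPLE_RATE = round(FRAME_LEN / FRAME_DUR_S)   # 256 / 0.016 s = 16000 Hz
HOP = int(FRAME_LEN * (1 - OVERLAP))           # 128 samples between frame starts


def frame_signal(x, frame_len=FRAME_LEN, hop=HOP):
    """Split a 1-D signal into overlapping frames and apply a Hamming window."""
    x = np.asarray(x, dtype=float)
    if x.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (x.size - frame_len) // hop
    # Row i holds the sample indices of frame i.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)


# One second of synthetic noise stands in for a speech recording.
signal = np.random.default_rng(0).standard_normal(SAMPLE_RATE)
frames = frame_signal(signal)
```

Each row of `frames` is one windowed frame, which is the form a subsequent MFCC computation would typically consume; the MFCC step itself is left out because the excerpt does not describe it.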
