The Use of Wavelet Packet Transform and Artificial Neural Networks in Analysis and Classification of Dysphonic Voices

Crovato, César David Paredes; Schuck, Adalberto

doi:10.1109/tbme.2006.889780

Cited by 30 publications

(7 citation statements)

References 12 publications


“…The popular parameters in this category are computed using the fractal dimension or the correlation dimension [28], [45], [51]- [53]. The complex measures investigated in several studies consist of the following: the largest Lyapunov exponent, the recurrence period density entropy, Hurst exponent, detrended fluctuation analysis, approximate entropy, sample entropy, modified sample entropy, Gaussian kernel sample entropy, fuzzy entropy, hidden Markov model (HMM) entropy and Shannon HMM entropy [38], [39], [54], [55]. These features capture the dynamic variants/invariants, long-range correlations, regularity or predictability information present in the signal.…”

Section: Introduction (mentioning)

confidence: 99%

Analysis and Detection of Pathological Voice Using Glottal Source Features

Kadiri

Alku

2020

IEEE J. Sel. Top. Signal Process.


Automatic detection of voice pathology enables objective assessment and earlier intervention for the diagnosis. This study provides a systematic analysis of glottal source features and investigates their effectiveness in voice pathology detection. Glottal source features are extracted using glottal flows estimated with the quasi-closed phase (QCP) glottal inverse filtering method, using approximate glottal source signals computed with the zero frequency filtering (ZFF) method, and using acoustic voice signals directly. In addition, we propose to derive mel-frequency cepstral coefficients (MFCCs) from the glottal source waveforms computed by QCP and ZFF to effectively capture the variations in glottal source spectra of pathological voice. Experiments were carried out using two databases, the Hospital Universitario Príncipe de Asturias (HUPA) database and the Saarbrücken Voice Disorders (SVD) database. Analysis of features revealed that the glottal source contains information that discriminates normal and pathological voice. Pathology detection experiments were carried out using support vector machine (SVM). From the detection experiments it was observed that the performance achieved with the studied glottal source features is comparable or better than that of conventional MFCCs and perceptual linear prediction (PLP) features. The best detection performance was achieved when the glottal source features were combined with the conventional MFCCs and PLP features, which indicates the complementary nature of the features.

“…Since most of the vocal response perceived by human ear lies in low frequency range, [6] literatures suggested frequency measure based technique like Mel-frequency scale to get a high resolution in low frequency region, and a low resolution in high frequency region. Wavelet packet based feature extraction is also suggested in literature [2][10] [11] [12]. Reference [13] compared seven breathiness measures with glottal to noise excitation ratio which they established as a discriminator for carcinoma, disturbed and normal speech signals.…”

Section: Literature Study (mentioning)

confidence: 99%

Feature enhancement for classifier optimization and dimensionality reduction

Shilaskar

Ghatol

2014

2014 Annual IEEE India Conference (INDICON)


Voice is important for professionals like speakers, teachers, actors, singers and it is the important tool for communication. Laryngeal pathologies induce perturbations in the speech signal. Speech signal is discriminated as pathological or healthy based on roughness-breathiness-hoarseness (RBH) in the quality of signal. In recent years pattern recognition along with various signal processing techniques has emerged as an effective non invasive tool for diagnosis of pathological condition. Signal processing techniques tend to generate large number of features representing the signal. Automatic feature reduction techniques are vital in identifying the relevant features and eliminating the redundant ones. We extract features from speech signal using the acoustic analysis. Features are enhanced by alleviating gender bias. Periodic variations in the signal are captured using statistical techniques. We investigate intelligent system to generate reduced feature subset with improvement in diagnostic performance.
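
The citation statement for this paper notes that a mel-frequency scale gives high resolution at low frequencies and low resolution at high frequencies. The following is a minimal sketch of that property, not code from any of the cited works. It assumes the commonly used formula mel = 2595 · log10(1 + f/700); the quoted papers do not state which mel variant they use, and the function names and the 0 to 8000 Hz range are illustrative choices.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to mels (assumed formula: 2595*log10(1 + f/700))."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min_hz: float, f_max_hz: float, n_bands: int) -> list[float]:
    """Return n_bands + 1 band edges in Hz, equally spaced on the mel scale."""
    mel_lo = hz_to_mel(f_min_hz)
    mel_hi = hz_to_mel(f_max_hz)
    step = (mel_hi - mel_lo) / n_bands
    return [mel_to_hz(mel_lo + i * step) for i in range(n_bands + 1)]


# Illustrative range and band count (assumptions, not taken from the source).
edges = mel_band_edges(0.0, 8000.0, 10)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]

for lo, hi in zip(edges, edges[1:]):
    print(f"{lo:8.1f} Hz to {hi:8.1f} Hz  (width {hi - lo:7.1f} Hz)")
```

Bands that are equally wide in mels come out narrow in Hz at the low end and progressively wider toward the high end, which is the resolution trade-off the statement describes.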


“…In the last decade, interesting works had been proposed on artificial neural networks for classification problems. For instance, in [23] a classification system to identify voice dysphonia via Wavelet Packet Transform and the Best Basis Algorithm is designed. Outstanding results were reported by reaching from 87.5% to 96.8% of accuracy.…”

Section: Introduction (mentioning)

confidence: 99%

An Optimized Brain-Based Algorithm for Classifying Parkinson’s Disease

et al. 2020


During the last years, highly-recognized computational intelligence techniques have been proposed to treat classification problems. These automatic learning approaches lead to the most recent researches because they exhibit outstanding results. Nevertheless, to achieve this performance, artificial learning methods firstly require fine tuning of their parameters and then they need to work with the best-generated model. This process usually needs an expert user for supervising the algorithm’s performance. In this paper, we propose an optimized Extreme Learning Machine by using the Bat Algorithm, which boosts the training phase of the machine learning method to increase the accuracy, and decreasing or keeping the loss in the learning phase. To evaluate our proposal, we use the Parkinson’s Disease audio dataset taken from UCI Machine Learning Repository. Parkinson’s disease is a neurodegenerative disorder that affects over 10 million people. Although its diagnosis is through motor symptoms, it is possible to evidence the disorder through variations in the speech using machine learning techniques. Results suggest that using the bio-inspired optimization algorithm for adjusting the parameters of the Extreme Learning Machine is a real alternative for improving its performance. During the validation phase, the classification process for Parkinson’s Disease achieves a maximum accuracy of 96.74% and a minimum loss of 3.27%.
