2017
DOI: 10.1016/j.jvoice.2016.09.009

Intra- and Inter-database Study for Arabic, English, and German Databases: Do Conventional Speech Features Detect Voice Pathology?

Abstract: A large population around the world has voice complications. Various approaches for subjective and objective evaluations have been suggested in the literature. The subjective approach strongly depends on the experience and area of expertise of a clinician, and human error cannot be neglected. On the other hand, the objective or automatic approach is noninvasive. Automatic developed systems can provide complementary information that may be helpful for a clinician in the early screening of a voice disorder. At t…

Citations: cited by 30 publications (12 citation statements)
References: 45 publications (52 reference statements)
“…Nevertheless, we consider these results exploratory due to the limitations of the databases. Reviewing the performances achieved in scenarios with MFCC as input data we conclude that MFCC alone are not reliable enough for robust voice pathology detection, which was also concluded by Ali et al in [5]. Regarding the DenseNet, we conclude that in voice pathology detection scenarios with this little training data it is better to use inputs with reduced dimensionality in contrary to raw waveform inputs, or make use of transfer learning or data augmentation.…”
Section: Discussion (supporting)
confidence: 51%
“…After the feature extraction, multiple conventional classifiers have been used to detect the presence of voice pathology. Most authors relied on the following algorithms: Support Vector Machines (SVM), Gaussian Mixture Models (GMM), Random Forests (RF), and Artificial Neural Networks (ANN) [25,5,7,14], etc.…”
Section: Introduction (mentioning)
confidence: 99%
“…The face images are taken from the Database of Faces [39], which was developed between 1992 and 1994 and was formerly known as the ORL Database of Faces. [40][41][42][43]. The database contains the speech signals of normal persons and dysphonic patients.…”
Section: Smart City (mentioning)
confidence: 99%
“…The second module is the classifier used to classify vocal pathologies. we used a hidden Markov model with a Gaussian mixture density (HMM-GM) [9], through The Hidden Markov Model Toolkit (HTK) (HTK 3.4.1) [10] Researchers in this field have frequently used objective assessment of vocal pathology using several databases. We note here the most used databases such as the database (MEEI) [11,3], Saarbruecken Voice Database (SVD) [4,7,12] and Arabic Voice Pathology Database (AVPD) [13,7].…”
Section: Introduction (mentioning)
confidence: 99%