2021
DOI: 10.3390/app11062477

A Speech Command Control-Based Recognition System for Dysarthric Patients Based on Deep Learning Technology

Abstract: Voice control is an important way of controlling mobile devices; however, using it remains a challenge for dysarthric patients. Currently, there are many approaches, such as automatic speech recognition (ASR) systems, being used to help dysarthric patients control mobile devices. However, the large computation power requirement for the ASR system increases implementation costs. To alleviate this problem, this study proposed a convolution neural network (CNN) with a phonetic posteriorgram (PPG) speech feature s…

Cited by 20 publications (13 citation statements)
References 66 publications
“…For our first experiment, we chose the Mandarin commands recognition benchmark [30] collected from Dysarthric patients as private data. This benchmark dataset includes ten high-frequent action commands: close, up, down, previous, next, in, out, left, right, and home; and nine spoken digits: one, two, three, four, five, six, seven, eight, and nine with 16 kHz sampling rate in a total of 600 utterances.…”
Section: Spoken Command Recognition and Results
Citation type: mentioning
confidence: 99%
“…This benchmark dataset includes ten high-frequent action commands: close, up, down, previous, next, in, out, left, right, and home; and nine spoken digits: one, two, three, four, five, six, seven, eight, and nine with 16 kHz sampling rate in a total of 600 utterances. Adopting the experimental setting described in [30], we split the audio data into 70% and 30% for training and testing set under a 7-folds cross-validation scheme. To set up public data for training student model, we use the public Common Voice dataset [31] and collect the same Mandarin command actions and 600 utterances from the Dysarthric dataset.…”
Section: Spoken Command Recognition and Results
Citation type: mentioning
confidence: 99%
“…A CNN for audio digit classification with Mel spectrogram received 97.53%. A phonetic posteriorgram (PPG) speech feature with CNN was applied in speech command control-based recognition [42]. The dataset was created by 3 cerebral palsy (CP) patients who spoke 19 Mandarin commands 10 times each.…”
Section: Speech Recognition Tasks
Citation type: mentioning
confidence: 99%
“…According to previous research, CNNs possess strong adaptability and gradually have become the main research tool in the field of image and speech [43,44]. In the study of speaker recognition, the spectrogram [45] gives a large amount of information including the personality characteristics of the speaker, and dynamically shows the characteristics of the signal spectrum change.…”
Section: Convolutional Neural Network (CNN)
Citation type: mentioning
confidence: 99%