2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI)
DOI: 10.1109/icacci.2018.8554455

Speech Recognition and Correction of a Stuttered Speech

Cited by 15 publications (13 citation statements)
References 3 publications
“…Regarding the feasibility of automatic extraction, for the pacing of syllable pronunciation, because of MONAH's reliance on Google Speech-to-Text, the returned timings were at word level instead of the required syllable level. Commercial systems typically return timings at word level or sentence level, and it would take a specialized speech recognition system to return syllable-level information [50]. As for the volume and pitch, granular timestamped information could be readily extracted through open-source packages like OpenSmile [51].…”
Section: Discussion Of Results From Aspects Non-verbal Annotations (mentioning, confidence: 99%)
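
The statement above turns on two practical points: commercial recognizers such as Google Cloud Speech-to-Text expose timings only down to word level, while frame-level prosodic measures (volume, pitch) can be pulled with openSMILE. The sketch below is not taken from the cited papers; it is an illustrative example in which the file name interview.wav, the 16 kHz LINEAR16 settings, and the choice of the eGeMAPSv02 descriptor set are all assumptions.

# Illustrative sketch only (not from the cited papers): word-level timings
# via Google Cloud Speech-to-Text and timestamped pitch/loudness descriptors
# via the openSMILE Python package. File name and audio settings are assumed.
from google.cloud import speech
import opensmile

def word_level_timings(path: str):
    """Return (word, start_s, end_s) tuples; the API does not go below word level."""
    client = speech.SpeechClient()
    with open(path, "rb") as f:
        audio = speech.RecognitionAudio(content=f.read())
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,          # assumed 16 kHz mono recording
        language_code="en-US",
        enable_word_time_offsets=True,    # word-level, not syllable-level
    )
    response = client.recognize(config=config, audio=audio)
    return [
        (w.word, w.start_time.total_seconds(), w.end_time.total_seconds())
        for result in response.results
        for w in result.alternatives[0].words
    ]

def prosodic_descriptors(path: str):
    """Timestamped low-level descriptors (loudness, F0, ...) via openSMILE."""
    smile = opensmile.Smile(
        feature_set=opensmile.FeatureSet.eGeMAPSv02,
        feature_level=opensmile.FeatureLevel.LowLevelDescriptors,
    )
    return smile.process_file(path)   # pandas DataFrame indexed by (start, end)

words = word_level_timings("interview.wav")     # hypothetical file
llds = prosodic_descriptors("interview.wav")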
“…Some of the bottlenecks of the above papers are speech recognition systems that are confined to regional languages, and some of the papers fail to discuss the correction of stuttered speech. Amplitude thresholding is done using neural networks [9], but the process is complex. Some of these issues are addressed in this paper, which discusses different methods for the removal of prolongations and string repetitions.…”
Section: Literature Review (mentioning, confidence: 99%)
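
For context on the amplitude-thresholding remark, here is a minimal sketch of what a plain, non-neural amplitude-thresholding baseline might look like: it flags spans whose short-time RMS amplitude falls below a fixed threshold. The frame/hop sizes and the 0.01 threshold (on a signal normalized to [-1, 1]) are illustrative assumptions; the cited work [9] replaces this step with a neural network.

# Minimal, illustrative amplitude-thresholding baseline (assumption: not the
# cited paper's method). Flags low-amplitude spans of a normalized mono signal.
import numpy as np

def low_amplitude_spans(signal: np.ndarray, sr: int,
                        frame_ms: float = 25.0, hop_ms: float = 10.0,
                        threshold: float = 0.01):
    """Return (start_s, end_s) spans whose frame RMS falls below `threshold`."""
    frame = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    spans, start = [], None
    for i in range(0, max(len(signal) - frame, 0) + 1, hop):
        rms = np.sqrt(np.mean(signal[i:i + frame] ** 2))
        t = i / sr
        if rms < threshold and start is None:
            start = t                      # entering a low-amplitude region
        elif rms >= threshold and start is not None:
            spans.append((start, t))       # leaving it
            start = None
    if start is not None:
        spans.append((start, len(signal) / sr))
    return spans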
“…The focus of this paper is on detection of five stuttering event types: Blocks, Prolongations, Sound Repetitions, Word/Phrase Repetitions, and Interjections. Existing work has explored this problem using traditional signal processing techniques [15,16,17], language modeling (LM) [12,18,19,20,21], and acoustic modeling (AM) [21,10]. Each approach has been shown to be effective.…”
Section: Introduction (mentioning, confidence: 99%)