2021 3rd International Conference on Computer Communication and the Internet (ICCCI) 2021
DOI: 10.1109/iccci51764.2021.9486811

End-to-End Model Based on RNN-T for Kazakh Speech Recognition

citations
Cited by 11 publications
(5 citation statements)
references
References 15 publications
“…A number of models [ 38 , 39 , 40 , 41 , 42 , 43 , 44 ] have been created for the speech recognition of the Kazakh language. The complexity with regard to Kazakh, its distinctive features, the scarcity of emotional speech datasets, and other factors make it difficult to develop a model for emotional speech detection in this language.…”
Section: Related Work (mentioning)
confidence: 99%
“…In the field of Kazakh speech recognition, Mamyrbayev et al [17] investigated the implementation of an end-to-end model based on RNN-T. They focused on streaming speech recognition, in which the audio stream is directly converted to text in real time.…”
Section: Related Work (mentioning)
confidence: 99%
“…Mamyrbayev et al [15] introduce stream speech recognition using the RNN-T model in their study. The architecture of the model is constructed using neural networks such as LSTM and BLSTM, and it was trained using over 300 h of prepared (reading) and spontaneous speech data.…”
Section: Xlsr-53 (mentioning)
confidence: 99%