2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2014
DOI: 10.1109/icassp.2014.6854673

Joint acoustic modeling of triphones and trigraphemes by multi-task learning deep neural networks for low-resource speech recognition

Citations: cited by 61 publications (35 citation statements)
References: 17 publications
“…It aligns a sequence of tokens automatically. As in traditional systems, phones, graphemes or both combined can be used as acoustic modeling units [13]. Given enough training data, even whole words can be used [14].…”
Section: RNN Based ASR Systems
mentioning
confidence: 99%
“…When the multiple tasks are related but not identical, or (in the ideal case) complementary to each other, MTL models offer better generalization from training to test corpus [9]. A number of works [9,10,11] have proved MTL to be effective on speech processing tasks. Among them [11] proved MTL effective at improving model performance for under-resourced ASR.…”
Section: Multi-task Learning
mentioning
confidence: 99%
“…No explicit modelling of context-dependent targets as in traditional systems is required. Phones, graphemes or both can be used as acoustic modeling units [15]. Training on whole words is also possible, given enough training data [16].…”
Section: RNN Based ASR Systems
mentioning
confidence: 99%