2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221)
DOI: 10.1109/icassp.2001.940871
Investigating lightly supervised acoustic model training

Cited by 39 publications (31 citation statements)
References 6 publications
“…Initial Unlabeled Data These self-training methods have been studied in GMM-based acoustic models [76,77,48,146,105,153]. In recent studies [136,45,60,82], self-training methods are also used in DNN-based acoustic model training.…”
Section: Initial Labeled Data
confidence: 99%
“…Following the conventional self training [76,77,48,146,105,153] approach, we first train an initial DNN-HMM system using the training data, and decode on the test data. For data selection, we pick utterances with the highest average per-frame decoding likelihood and add them to the training data.…”
Section: Comparison To Self Training
confidence: 99%
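The self-training recipe quoted above selects utterances whose decoding hypotheses have the highest average per-frame likelihood and adds them to the training pool. A minimal sketch of that selection step, assuming decoded utterances are available as (utterance ID, total log-likelihood, frame count) tuples; all names and the selection fraction are illustrative, not from the cited papers:

```python
# Hypothetical sketch of confidence-based data selection for self-training:
# rank decoded utterances by average per-frame log-likelihood and keep the
# top fraction as additional (automatically transcribed) training data.

def select_utterances(decoded, top_fraction=0.5):
    """decoded: list of (utt_id, total_log_likelihood, num_frames) tuples.

    Returns the IDs of the most confidently decoded utterances, ranked by
    average per-frame log-likelihood (higher average = more confident).
    """
    scored = [(total_ll / max(num_frames, 1), utt_id)
              for utt_id, total_ll, num_frames in decoded]
    scored.sort(reverse=True)  # highest average likelihood first
    k = max(1, int(len(scored) * top_fraction))
    return [utt_id for _, utt_id in scored[:k]]

decoded = [
    ("utt1", -450.0, 100),   # average -4.5 per frame
    ("utt2", -1200.0, 200),  # average -6.0 per frame
    ("utt3", -300.0, 100),   # average -3.0 per frame
]
print(select_utterances(decoded, top_fraction=0.5))  # prints ['utt3']
```

Normalizing by frame count matters: without it, long utterances with mediocre per-frame scores would be ranked below short ones purely because of length.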
“…A fair amount of past research has been devoted to improving the acoustic models from un-transcribed speech [5,6,7,8,9], and to adapt language models trained from out-of-domain text to the task at hand. Such methods of improving the LVCSR performance, which subsequently improve KWS performance, are not a focus of this paper.…”
Section: Low-Resource Search
confidence: 99%
“…The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA, IARPA, DoD/ARL or the U.S. Government. …an LVCSR system - such as 10 hours of transcribed speech corresponding to about 100K words of transcribed text, and a pronunciation lexicon that covers the words in the training data - but accuracy is sufficiently low that considerable improvement in KWS performance is necessary before the system is usable for searching a speech collection. A fair amount of past research has been devoted to improving the acoustic models from un-transcribed speech [5,6,7,8,9], and to adapt language models trained from out-of-domain text to the task at hand. Such methods of improving the LVCSR performance, which subsequently improve KWS performance, are not a focus of this paper.…”
confidence: 99%
“…Several experiments have shown that it is possible to achieve reasonable performance using data with erroneous transcriptions [45,46,47]. But no significant work has been done to analyze why the training algorithms are robust to mislabeled transcriptions.…”
Section: Thesis Objective and Organization
confidence: 99%