ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
DOI: 10.1109/icassp.2019.8683438

Conditional Teacher-student Learning

Abstract: The teacher-student (T/S) learning has been shown to be effective for a variety of problems such as domain adaptation and model compression. One shortcoming of the T/S learning is that a teacher model, not always perfect, sporadically produces wrong guidance in form of posterior probabilities that misleads the student model towards a suboptimal performance. To overcome this problem, we propose a conditional T/S learning scheme, in which a "smart" student model selectively chooses to learn from either the teach…

Citations: cited by 73 publications (58 citation statements)
References: 38 publications
“…Compared to the one-hot labels, the soft posteriors accurately models the inherent statistical relationships among different token classes in addition to the token identity encoded by a one-hot vector. It proves to be a more powerful target for the student to learn from which is consistent with what was observed in [18,19,20,21,22].…”
Section: Unsupervised Domain Adaptation With T/S Learning (supporting)
confidence: 83%
“…To address this issue, conditional T/S learning (CT/S) [21] was proposed recently in which the student selectively chooses to learn from either the teacher AED or the ground truth conditioned on whether the teacher AED can correctly predict the ground-truth labels. CT/S have shown significant WER improvements over T/S and IT/S for both domain and speaker adaptation on CHiME-3 dataset.…”
Section: Adaptive T/S (AT/S) Learning For Supervised Domain Adaptatio (mentioning)
confidence: 99%
“…As a result of their experiments they observed a significant improvement in accuracy. Meng Z. et al (2019) used a "smart" Teacher-Student model for domain adaption and speaker adaption in automatic speech recognition. Their model selectively chooses to learn from either the teacher model or the gold standard labels conditioned on whether the teacher can correctly predict the gold standard.…”
Section: Related Work (mentioning)
confidence: 99%
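
The selection rule described in the abstract and in the citation statements above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the use of the teacher's highest-probability class as the test for a "correct" prediction, and the cross-entropy loss are all assumptions made for illustration. When the teacher's top class matches the ground-truth label, the student is given the teacher's soft posteriors as its target; otherwise it is given the one-hot label.

```python
import math


def conditional_ts_targets(teacher_posteriors, labels):
    """Pick a training target per sample (hypothetical helper).

    Assumption: the teacher counts as "correct" when its
    highest-probability class equals the ground-truth label.
    """
    targets = []
    for probs, label in zip(teacher_posteriors, labels):
        teacher_pred = max(range(len(probs)), key=probs.__getitem__)
        if teacher_pred == label:
            # Teacher is right: learn from its soft posteriors.
            targets.append(list(probs))
        else:
            # Teacher is wrong: fall back to the one-hot ground truth.
            one_hot = [0.0] * len(probs)
            one_hot[label] = 1.0
            targets.append(one_hot)
    return targets


def cross_entropy(targets, student_posteriors, eps=1e-12):
    """Mean cross-entropy between targets and student outputs (assumed loss)."""
    total = 0.0
    for target, student in zip(targets, student_posteriors):
        total -= sum(t * math.log(s + eps) for t, s in zip(target, student))
    return total / len(targets)


# Toy data: the teacher is right on sample 0 and wrong on sample 1.
teacher = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
labels = [0, 1]
student = [[0.5, 0.3, 0.2], [0.4, 0.4, 0.2]]

targets = conditional_ts_targets(teacher, labels)
loss = cross_entropy(targets, student)
print(targets)
print(round(loss, 4))
```

In this toy case the first sample keeps the teacher's soft target and the second is replaced by the one-hot label, so a teacher mistake never reaches the student as a training target.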