2021
DOI: 10.48550/arxiv.2106.12699
Preprint

Distilling the Knowledge from Conditional Normalizing Flows

Cited by 1 publication (2 citation statements)
References 0 publications
“…These provide an alternative measure of closure that has improved convergence and convexity properties compared to the KL divergence. Other works in the ML literature that explore variants on the idea of Probability Density Distillation include [25][26][27].…”
Section: Introduction
confidence: 99%
“…However, the difference between the last two is quite small, so not adding L_x^(i) and L_z^(i) to the loss function provides a further viable loss candidate. In principle, the student does not have to be an IAF; it could also be a simple, fully-connected neural network [25]. However, in this case we would not have access to the LL as a measure of quality and we would not be able to train it with the additional loss terms of (2.12)-(2.15).…”
confidence: 99%
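
The quoted passage contrasts an IAF student, whose log-likelihood stays tractable, with a plain feed-forward student trained purely by probability density distillation. As a rough illustration of that distillation objective, here is a minimal sketch, not the cited papers' implementation: the teacher is a toy conditional Gaussian standing in for a conditional normalizing flow, and the student is a small conditional Gaussian network trained by minimizing a Monte Carlo estimate of KL(student || teacher) evaluated on its own samples. Names such as StudentNet, teacher_log_prob, and distill_step are illustrative, not from the source.

```python
# Hedged sketch of probability density distillation: a student distribution is
# trained to match a teacher density by minimizing a Monte Carlo estimate of
# KL(student || teacher) on samples drawn from the student.
# The teacher below is a toy conditional Gaussian standing in for a conditional
# normalizing flow; all names are illustrative.
import torch
import torch.nn as nn


class StudentNet(nn.Module):
    """Simple conditional Gaussian student q(x | c) with reparameterized sampling."""

    def __init__(self, cond_dim: int, data_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * data_dim),  # predicts mean and log-scale
        )

    def forward(self, cond: torch.Tensor) -> torch.distributions.Normal:
        mean, log_scale = self.net(cond).chunk(2, dim=-1)
        return torch.distributions.Normal(mean, log_scale.exp())


def teacher_log_prob(x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
    """Stand-in for the teacher flow's exact conditional log-density log p(x | c)."""
    teacher = torch.distributions.Normal(torch.sin(cond), 0.5)
    return teacher.log_prob(x).sum(-1)


def distill_step(student: StudentNet, cond: torch.Tensor,
                 opt: torch.optim.Optimizer) -> float:
    q = student(cond)
    x = q.rsample()                    # reparameterized student sample
    log_q = q.log_prob(x).sum(-1)      # student log-likelihood (tractable here)
    log_p = teacher_log_prob(x, cond)  # teacher log-likelihood
    loss = (log_q - log_p).mean()      # Monte Carlo estimate of KL(q || p)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


if __name__ == "__main__":
    torch.manual_seed(0)
    student = StudentNet(cond_dim=1, data_dim=1)
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    for step in range(2000):
        cond = torch.rand(256, 1) * 6.0 - 3.0
        loss = distill_step(student, cond, opt)
    print(f"final KL estimate: {loss:.3f}")
```

With an IAF student as in the quote, the same objective applies, but one would additionally have a tractable student log-likelihood to monitor and could include extra loss terms such as those the citing work labels (2.12)-(2.15); those terms are not reproduced here.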