Wasserstein-2 Generative Networks
Preprint, 2019
DOI: 10.48550/arxiv.1909.13082

Cited by 5 publications (7 citation statements). References 0 publications.
“…Recall that g satisfies Hypothesis 1. From relation (18) in Theorem 3, we get that for all θ_1, there exists a neighborhood Ω of θ_1 such that for all θ_2 ∈ Ω…”
Section: Differentiation of W_λ^c (mentioning)
confidence: 99%
“…Among these extensions, the method of [19] considers generic convex costs for optimal transport and relies on low-dimensional discrete transport problems on batches during the learning. In [18], the case of the 2-Wasserstein distance is tackled thanks to input convex neural networks [1] and a cycle-consistency regularization. In order to have a differentiable distance, the use of entropic regularization of optimal transport has been proposed in different ways.…”
Section: Introduction (mentioning)
confidence: 99%
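The cycle-consistency regularization mentioned in the excerpt can be illustrated with a short sketch: two scalar potential networks `psi` and `phi` (placeholders for ICNNs) provide forward and inverse gradient maps, and the penalty asks their composition to return to the input. This is a hedged illustration, not the implementation from [18]; the networks, the squared penalty, and the batch reduction are all assumptions.

```python
import torch

def cycle_consistency_loss(psi, phi, x):
    """Penalize grad(phi)(grad(psi)(x)) for drifting away from x.
    psi and phi are scalar-valued potential networks (e.g. ICNNs)."""
    x = x.detach().requires_grad_(True)
    # Forward map: y = grad_x psi(x), a Brenier-style transport map.
    y = torch.autograd.grad(psi(x).sum(), x, create_graph=True)[0]
    # Inverse map: x_rec = grad_y phi(y); for conjugate optimal potentials
    # this composition is the identity on the source support.
    x_rec = torch.autograd.grad(phi(y).sum(), y, create_graph=True)[0]
    return ((x_rec - x) ** 2).sum(dim=1).mean()
```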
“…The leading approach - the Input Convex Neural Network (ICNN) [8] - models a convex potential which can be differentiated with respect to the inputs to produce a gradient map. Huang et al [9] combine Brenier's theorem with the ICNN gradients to design flow-based density estimators, and Makkuva et al [10], Korotin et al [11] use a similar combination to solve high-dimensional barycenter and transport problems. While Huang et al [9] prove a universal approximation theorem for the ICNN, the result relies on stacking a large number of layers.…”
Section: Introduction (mentioning)
confidence: 99%
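Since the excerpt describes the ICNN idea only in words (a convex potential whose input-gradient is the transport map), here is a minimal, hedged PyTorch sketch. The layer sizes, softplus activation, and clamping of the z-path weights are illustrative choices, not the exact architecture of [8]-[11].

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    """Minimal input-convex network: convex in x because the z-path weights
    are kept non-negative and the activation is convex and non-decreasing."""
    def __init__(self, dim, hidden=64, layers=3):
        super().__init__()
        self.x_layers = nn.ModuleList([nn.Linear(dim, hidden) for _ in range(layers)])
        self.z_layers = nn.ModuleList([nn.Linear(hidden, hidden, bias=False) for _ in range(layers - 1)])
        self.out = nn.Linear(hidden, 1, bias=False)

    def forward(self, x):
        z = F.softplus(self.x_layers[0](x))
        for lin_x, lin_z in zip(self.x_layers[1:], self.z_layers):
            # Non-negative weights on the z-path preserve convexity in x.
            z = F.softplus(lin_x(x) + F.linear(z, lin_z.weight.clamp(min=0)))
        return F.linear(z, self.out.weight.clamp(min=0))

def gradient_map(potential, x):
    """Differentiate the scalar potential w.r.t. its input: x -> grad f(x)."""
    x = x.detach().requires_grad_(True)
    return torch.autograd.grad(potential(x).sum(), x, create_graph=True)[0]

# Toy usage: the gradient of the convex potential acts as a transport map.
phi = ICNN(dim=2)
y = gradient_map(phi, torch.randn(8, 2))
```

Convexity in `x` holds because each layer applies a convex, non-decreasing activation to an affine function of `x` plus a non-negatively weighted combination of the previous convex features; `gradient_map` then differentiates the scalar output with respect to the input, as the excerpt describes.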
“…This happens because the chain rule turns the composition of layers into a product of their corresponding Jacobians. This does not cause issues for training the network on objectives involving the scalar output, like regression, but can become problematic for objectives involving the gradient of the network's output [9][10][11]. Intuitively, the product of layers of a neural network has similarities to a polynomial, and can suffer from oscillations related to the Runge phenomenon - see [13].…”
Section: Introduction (mentioning)
confidence: 99%
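The chain-rule point in the excerpt can be checked directly: for a small two-layer network, the gradient of the scalar output with respect to the input equals the product of the layer Jacobians. The toy network below is an assumption for illustration only.

```python
import torch

torch.manual_seed(0)
W1, W2 = torch.randn(5, 3), torch.randn(1, 5)

def f(x):
    # Two-layer network with scalar output: f(x) = W2 tanh(W1 x)
    return W2 @ torch.tanh(W1 @ x)

x = torch.randn(3, requires_grad=True)
g_autograd = torch.autograd.grad(f(x).sum(), x)[0]
# The same gradient written out as the product of layer Jacobians:
# J_f(x) = W2 · diag(tanh'(W1 x)) · W1
h = W1 @ x
g_chain = (W2 * (1 - torch.tanh(h) ** 2)) @ W1
print(torch.allclose(g_autograd, g_chain.squeeze(0)))  # True
```

With more layers, the product picks up one Jacobian factor per layer, which is the behaviour the excerpt identifies as problematic when the training objective itself involves this input gradient.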
“…To get samples from the optimal coupling, traditional methods like Linear Programming [28,34,37] or Sinkhorn [13] usually start with a discretization of the whole continuous space and compute the transport plan in the discrete setting as an approximation of the continuous case. Our algorithm can directly output a sample approximation of the optimal coupling without any discretization, or any training process as in the neural network methods [35,21,26]. This is also very different from other traditional methods like the Monge-Ampère equation [5] or dynamical schemes [4,24,33].…”
Section: Introduction (mentioning)
confidence: 99%
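For contrast with the direct-sampling approach described in the excerpt, here is a minimal sketch of the discretization-plus-Sinkhorn route it refers to. The toy point clouds, the regularization strength `eps`, and the iteration count are arbitrary assumptions, not settings from any cited work.

```python
import numpy as np

def sinkhorn_coupling(a, b, C, eps=0.1, n_iter=1000):
    """Entropy-regularized OT on a fixed discretization: returns an
    approximate coupling P with marginals a, b for cost matrix C."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

# Toy usage: couple two small point clouds through a squared-Euclidean cost.
rng = np.random.default_rng(0)
x, y = rng.normal(size=(50, 2)), rng.normal(size=(60, 2)) + 1.0
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
P = sinkhorn_coupling(np.full(50, 1 / 50), np.full(60, 1 / 60), C)
print(P.shape, np.allclose(P.sum(axis=1), 1 / 50))  # (50, 60) True
```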