Hongkang Yang scite author profile

2020

Comm Pure Appl Math

A framework is proposed that addresses both conditional density estimation and latent variable discovery. The objective function maximizes explanation of variability in the data, achieved through the optimal transport barycenter generalized to a collection of conditional distributions indexed by a covariate—either given or latent—in any suitable space. Theoretical results establish the existence of barycenters, a minimax formulation of optimal transport maps, and a general characterization of variability via the optimal transport cost. This framework leads to a family of nonparametric neural network‐based algorithms, the BaryNet, with a supervised version that estimates conditional distributions and an unsupervised version that assigns latent variables. The efficacy of BaryNets is demonstrated by tests on both artificial and real‐world data sets. A parallel drawn between autoencoders and the barycenter framework leads to the Barycentric autoencoder algorithm (BAE). © 2020 Wiley Periodicals LLC.

Clustering, factor discovery and optimal transport

2019

Preprint

The clustering problem, and more generally, latent factor discovery -or latent space inference-is formulated in terms of the Wasserstein barycenter problem from optimal transport. The objective proposed is the maximization of the variability attributable to class, further characterized as the minimization of the variance of the Wasserstein barycenter. Existing theory, which constrains the transport maps to rigid translations, is extended to affine transformations. The resulting non-parametric clustering algorithms include k-means as a special case and exhibit more robust performance. A continuous version of these algorithms discovers continuous latent variables and generalizes principal curves. The strength of these algorithms is demonstrated by tests on both artificial and real-world data sets.

Generalization error of GAN from the discriminator’s perspective

Weinan

2021

Res Math Sci

Conditional Density Estimation, Latent Variable Discovery and Optimal Transport

2019

Preprint

A framework is proposed that addresses both conditional density estimation and latent variable discovery. The objective function maximizes explanation of variability in the data, achieved through the optimal transport barycenter generalized to a collection of conditional distributions indexed by a covariate -either given or latent-in any suitable space. Theoretical results establish the existence of barycenters, a minimax formulation of optimal transport maps, and a general characterization of variability via the optimal transport cost. This framework leads to a family of non-parametric neural network-based algorithms, the BaryNet, with a supervised version that estimates conditional distributions and an unsupervised version that assigns latent variables. The efficacy of BaryNets is demonstrated by tests on both artificial and real-world data sets. A parallel drawn between autoencoders and the barycenter framework leads to the Barycentric autoencoder algorithm (BAE).

Clustering, factor discovery and optimal transport

2020

The clustering problem, and more generally latent factor discovery or latent space inference, is formulated in terms of the Wasserstein barycenter problem from optimal transport. The objective proposed is the maximization of the variability attributable to class, further characterized as the minimization of the variance of the Wasserstein barycenter. Existing theory, which constrains the transport maps to rigid translations, is extended to affine transformations. The resulting non-parametric clustering algorithms include $k$-means as a special case and exhibit more robust performance. A continuous version of these algorithms discovers continuous latent variables and generalizes principal curves. The strength of these algorithms is demonstrated by tests on both artificial and real-world data sets.