Shengjia Zhao scite author profile

A key advance in learning generative models is the use of amortized inference distributions that are jointly trained with the models. We find that existing training objectives for variational autoencoders can lead to inaccurate amortized inference distributions and, in some cases, improving the objective provably degrades the inference quality. In addition, it has been observed that variational autoencoders tend to ignore the latent variables when combined with a decoding distribution that is too flexible. We again identify the cause in existing training criteria and propose a new class of objectives (Info-VAE) that mitigate these problems. We show that our model can significantly improve the quality of the variational posterior and can make effective use of the latent features regardless of the flexibility of the decoding distribution. Through extensive qualitative and quantitative analyses, we demonstrate that our models outperform competing approaches on multiple performance metrics.

An “essential herbal medicine”—licorice: A review of phytochemicals and its effects in combination preparations

Jiang

Zhao

Yang

et al. 2020

Journal of Ethnopharmacology

163

Towards Deeper Understanding of Variational Autoencoding Models

Zhao¹,

Song²,

Ermon³

2017

Preprint

The Information Autoencoding Family: A Lagrangian Perspective on Latent Variable Generative Models

Zhao¹,

Song²,

Ermon³

2018

Preprint

A large number of objectives have been proposed to train latent variable generative models. We show that many of them are Lagrangian dual functions of the same primal optimization problem. The primal problem optimizes the mutual information between latent and visible variables, subject to the constraints of accurately modeling the data distribution and performing correct amortized inference. By choosing to maximize or minimize mutual information, and choosing different Lagrange multipliers, we obtain different objectives including InfoGAN, ALI/BiGAN, ALICE, CycleGAN, beta-VAE, adversarial autoencoders, AVB, AS-VAE and InfoVAE. Based on this observation, we provide an exhaustive characterization of the statistical and computational trade-offs made by all the training objectives in this class of Lagrangian duals. Next, we propose a dual optimization method where we optimize model parameters as well as the Lagrange multipliers. This method achieves Pareto optimal solutions in terms of optimizing information and satisfying the constraints.

Permutation Invariant Graph Generation via Score-Based Generative Modeling

Niu¹,

Song

et al. 2020

Preprint

Learning generative models for graph-structured data is challenging because graphs are discrete, combinatorial, and the underlying data distribution is invariant to the ordering of nodes. However, most of the existing generative models for graphs are not invariant to the chosen ordering, which might lead to an undesirable bias in the learned distribution. To address this difficulty, we propose a permutation invariant approach to modeling graphs, using the recent framework of score-based generative modeling. In particular, we design a permutation equivariant, multi-channel graph neural network to model the gradient of the data distribution at the input graph (a.k.a., the score function). This permutation equivariant model of gradients implicitly defines a permutation invariant distribution for graphs. We train this graph neural network with score matching and sample from it with annealed Langevin dynamics. In our experiments, we first demonstrate the capacity of this new architecture in learning discrete graph algorithms. For graph generation, we find that our learning approach achieves better or comparable results to existing models on benchmark datasets.

Learning Hierarchical Features from Generative Models

Zhao¹,

Song²,

Ermon³

2017

Preprint

Deep neural networks have been shown to be very successful at learning feature hierarchies in supervised learning tasks. Generative models, on the other hand, have benefited less from hierarchical models with multiple layers of latent variables. In this paper, we prove that hierarchical latent variable models do not take advantage of the hierarchical structure when trained with existing variational methods, and provide some limitations on the kind of features existing models can learn. Finally, we propose an alternative architecture that does not suffer from these limitations. Our model is able to learn highly interpretable and disentangled hierarchical features on several natural image datasets with no task-specific regularization or prior knowledge.

Closing the Gap Between Short and Long XORs for Model Counting

Zhao

Chaturapruek

Sabharwal

et al. 2016

AAAI

Many recent algorithms for approximate model counting are based on a reduction to combinatorial searches over random subsets of the space defined by parity or XOR constraints. Long parity constraints (involving many variables) provide strong theoretical guarantees but are computationally difficult. Short parity constraints are easier to solve but have weaker statistical properties. It is currently not known how long these parity constraints need to be. We close the gap by providing matching necessary and sufficient conditions on the required asymptotic length of the parity constraints. Further, we provide a new family of lower bounds and the first non-trivial upper bounds on the model count that are valid for arbitrarily short XORs. We empirically demonstrate the effectiveness of these bounds on model counting benchmarks and in a Satisfiability Modulo Theory (SMT) application motivated by the analysis of contingency tables in statistics.

Mobility network reveals the impact of spatial vaccination heterogeneity on COVID-19

Yuan

Jahani

Zhao³

et al. 2021

Preprint

Mass vaccination is one of the most effective epidemic control measures. Because one’s vaccination decision is shaped by social processes (e.g., socioeconomic sorting and social contagion), the pattern of vaccine uptake tends to show strong social and geographical heterogeneity, such as the urban-rural divide and clustering. Yet little is known about the extent to which, and how, vaccination heterogeneity affects the course of outbreaks. Here, leveraging the unprecedented availability of data and computational models produced during the COVID-19 pandemic, we investigate two network effects: the “hub effect” (hubs in the mobility network usually have higher vaccination rates) and the “homophily effect” (neighboring places tend to have similar vaccination rates). Applying Bayesian deep learning and fine-grained simulations for the U.S., we show that stronger homophily leads to more infections while a stronger hub effect results in fewer cases. Our simulation estimates that these effects have a combined net negative impact on the outcome, increasing the total cases by approximately 10% in the U.S. Inspired by these results, we propose a vaccination campaign strategy that targets a small number of regions to further improve the vaccination rate, which, according to our simulations, can reduce the number of cases by 20% by vaccinating only an additional 1% of the population. Our results suggest that we must examine the interplay between vaccination patterns and mobility networks beyond the overall vaccination rate, and that the government may need to shift policy focus from overall vaccination rates to geographical vaccination heterogeneity.

InfoVAE: Balancing Learning and Inference in Variational Autoencoders
