Imitation learning describes the problem of recovering an expert policy from demonstrations. While inverse reinforcement learning approaches are known to be very sample-efficient in terms of expert demonstrations, they usually require problem-dependent reward functions or a (task-)specific reward-function regularization. In this paper, we show a natural connection between inverse reinforcement learning approaches and Optimal Transport that enables more general reward functions with desirable properties (e.g., smoothness). Based on this observation, we propose a novel approach called Wasserstein Adversarial Imitation Learning. Our approach uses the Kantorovich potentials as a reward function and further leverages regularized optimal transport to enable large-scale applications. In several robotic experiments, our approach outperforms the baselines in terms of average cumulative rewards and shows a significant improvement in sample efficiency, requiring just one expert demonstration. Preprint. Under review.
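To make the terms "Kantorovich potentials" and "regularized optimal transport" concrete, the following is a minimal sketch of entropy-regularized optimal transport between two small sample sets, solved with Sinkhorn iterations. It is a discrete toy and not the algorithm of the paper: the function name `sinkhorn_potentials`, the squared-distance cost, the sample values, and the regularization strength `eps` are all assumptions chosen for illustration.

```python
import math

def sinkhorn_potentials(cost, a, b, eps=0.5, n_iters=200):
    """Entropy-regularized OT between weight vectors a and b (illustrative sketch).

    Returns dual potentials (f, g) and the transport plan, where
    plan[i][j] = exp((f[i] + g[j] - cost[i][j]) / eps).
    """
    n, m = len(a), len(b)
    # Gibbs kernel of the cost matrix.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals a and b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    # Dual (Kantorovich-type) potentials of the regularized problem.
    f = [eps * math.log(ui) for ui in u]
    g = [eps * math.log(vj) for vj in v]
    return f, g, plan

# Hypothetical toy data: 1-D "expert" and "agent" samples with uniform weights.
expert = [0.0, 0.5, 1.0]
agent = [0.1, 0.9]
cost = [[(x - y) ** 2 for y in agent] for x in expert]
a = [1.0 / len(expert)] * len(expert)
b = [1.0 / len(agent)] * len(agent)
f, g, plan = sinkhorn_potentials(cost, a, b)
```

The potentials are only determined up to an additive constant (adding c to every `f[i]` and subtracting it from every `g[j]` leaves the plan unchanged), so any use of them as a reward signal would have to fix a sign and normalization convention, which the abstract does not specify.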
We propose an approach for learning the causal structure in stochastic dynamical systems with a one-step functional dependency in the presence of latent variables. Our information-theoretic approach recovers the causal relations among the observed variables as long as the latent variables evolve without exogenous noise. We further propose an efficient learning method based on linear regression for the special case in which the dynamics are linear. We validate the performance of our approach via numerical simulations.
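As a rough illustration of how linear regression can expose one-step dependencies in linear dynamics, the sketch below fits x_{t+1} = A x_t + noise by ordinary least squares on a simulated two-variable system and reads candidate edges off the large off-diagonal coefficients. It assumes a fully observed system and therefore ignores latent variables, so it is not the method of the paper; the helper names, the matrix `A_true`, the noise level, and the 0.1 threshold are hypothetical choices.

```python
import random

def simulate(A, T, noise_std, seed=0):
    """Simulate x_{t+1} = A x_t + Gaussian noise for a 2-variable system."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    xs = [x]
    for _ in range(T):
        x = [A[i][0] * x[0] + A[i][1] * x[1] + rng.gauss(0.0, noise_std)
             for i in range(2)]
        xs.append(x)
    return xs

def fit_linear_dynamics(xs):
    """Least-squares estimate A_hat = S_yx @ inv(S_xx) for the 2x2 case."""
    S_xx = [[0.0, 0.0], [0.0, 0.0]]  # sum of x_t x_t^T
    S_yx = [[0.0, 0.0], [0.0, 0.0]]  # sum of x_{t+1} x_t^T
    for x, y in zip(xs[:-1], xs[1:]):
        for i in range(2):
            for j in range(2):
                S_xx[i][j] += x[i] * x[j]
                S_yx[i][j] += y[i] * x[j]
    det = S_xx[0][0] * S_xx[1][1] - S_xx[0][1] * S_xx[1][0]
    inv = [[S_xx[1][1] / det, -S_xx[0][1] / det],
           [-S_xx[1][0] / det, S_xx[0][0] / det]]
    return [[sum(S_yx[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Hypothetical ground truth: variable 0 drives variable 1, but not the reverse.
A_true = [[0.5, 0.0], [0.4, 0.3]]
xs = simulate(A_true, T=5000, noise_std=0.1)
A_hat = fit_linear_dynamics(xs)
# Candidate edge j -> i whenever the off-diagonal coefficient is clearly nonzero.
edges = {(j, i) for i in range(2) for j in range(2)
         if i != j and abs(A_hat[i][j]) > 0.1}
```

A fixed threshold is the simplest possible decision rule; a statistical test on each coefficient would be a more principled alternative.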
We propose an approach for learning latent directed polytrees, provided that an appropriately defined discrepancy measure exists between the observed nodes. Specifically, we use our approach for learning directed information polytrees where samples are available from only a subset of processes. Directed information trees are a new type of probabilistic graphical model that represents the causal dynamics among a set of random processes in a stochastic system. We prove that the approach is consistent for learning minimal latent directed trees. We analyze the sample complexity of the learning task when the empirical estimator of mutual information is used as the discrepancy measure.
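For readers unfamiliar with empirical mutual information, the sketch below shows a plug-in estimator for two discrete sample sequences, which replaces the true probabilities with observed frequencies. The abstract does not specify the exact estimator used, so this form, the function name `empirical_mutual_information`, and the toy sequences are assumptions for illustration.

```python
import math
from collections import Counter

def empirical_mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # (c/n) * log( (c/n) / ((px/n) * (py/n)) ) simplifies to the form below.
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi

# Hypothetical toy sequences: one pair is identical, the other is independent.
mi_dependent = empirical_mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
mi_independent = empirical_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

In a structure-learning setting, such an estimate would be computed for each pair of observed nodes and used as the pairwise discrepancy; with few samples the plug-in estimate is biased, which is one reason sample complexity matters.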