Yunhao Tang scite author profile

Yunhao Tang

5 Publications

76 Citation Statements Received

102 Citation Statements Given

Affiliations

Second Affiliated Hospital of Chongqing Medical University, Columbia University

Publications

ES-MAML: Simple Hessian-Free Meta Learning

Song, Gao, Yang et al., 2019

Preprint

We introduce ES-MAML, a new framework for solving the model-agnostic meta-learning (MAML) problem based on Evolution Strategies (ES). Existing algorithms for MAML are based on policy gradients and incur significant difficulties when attempting to estimate second derivatives using backpropagation on stochastic policies. We show how ES can be applied to MAML to obtain an algorithm that avoids the problem of estimating second derivatives and is also conceptually simple and easy to implement. Moreover, ES-MAML can handle new types of nonsmooth adaptation operators, and other techniques for improving the performance and estimation of ES methods become applicable. We show empirically that ES-MAML is competitive with existing methods and often yields better adaptation with fewer queries. * Equal contribution. † Work performed during Google internship. ‡ Work performed during the Google AI Residency Program.
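The idea of avoiding second derivatives can be illustrated with a minimal sketch: the post-adaptation reward is treated as a blackbox, and an antithetic ES estimator is applied to it directly. Everything concrete here is an assumption made for illustration rather than taken from the paper: the toy quadratic tasks, the helper names (`es_gradient`, `adapt`, `es_maml`), and all hyperparameters.

```python
import random

def es_gradient(f, theta, sigma, n_samples, rng):
    """Antithetic ES estimate of the gradient of f at theta (no backpropagation)."""
    grad = [0.0] * len(theta)
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in theta]
        plus = f([t + sigma * e for t, e in zip(theta, eps)])
        minus = f([t - sigma * e for t, e in zip(theta, eps)])
        scale = (plus - minus) / (2.0 * sigma * n_samples)
        grad = [g + scale * e for g, e in zip(grad, eps)]
    return grad

def make_task(center):
    """Hypothetical toy task: reward is the negative squared distance to `center`."""
    return lambda theta: -sum((t - c) ** 2 for t, c in zip(theta, center))

def adapt(theta, task, rng, alpha=0.1, sigma=0.1, n_samples=10):
    """Adaptation operator (assumed): one ES ascent step on a single task's reward."""
    g = es_gradient(task, theta, sigma, n_samples, rng)
    return [t + alpha * gi for t, gi in zip(theta, g)]

def post_adaptation_reward(theta, tasks, rng):
    """Mean reward over tasks after adapting theta to each task separately."""
    return sum(task(adapt(theta, task, rng)) for task in tasks) / len(tasks)

def es_maml(theta, tasks, rng, steps=100, beta=0.05, sigma=0.1, n_samples=20):
    """Outer loop: ES ascent on the post-adaptation reward, treated as a blackbox."""
    objective = lambda th: post_adaptation_reward(th, tasks, rng)
    for _ in range(steps):
        g = es_gradient(objective, theta, sigma, n_samples, rng)
        theta = [t + beta * gi for t, gi in zip(theta, g)]
    return theta

rng = random.Random(0)
tasks = [make_task(c) for c in ([4.0, 3.0], [2.0, 3.0], [3.0, 4.0], [3.0, 2.0])]
theta0 = [0.0, 0.0]
theta_meta = es_maml(theta0, tasks, rng)

# Compare post-adaptation reward before and after meta-training,
# using the same evaluation seed for both.
before = post_adaptation_reward(theta0, tasks, random.Random(1))
after = post_adaptation_reward(theta_meta, tasks, random.Random(1))
print(before, after)
```

Because the outer estimator only queries `post_adaptation_reward`, the adaptation step could be swapped for a nonsmooth operator without changing the outer loop.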

Discretizing Continuous Action Space for On-Policy Optimization

Tang and Agrawal, 2020

AAAI

In this work, we show that discretizing the action space for continuous control is a simple yet powerful technique for on-policy optimization. The explosion in the number of discrete actions can be efficiently addressed by a policy with a factorized distribution across action dimensions. We show that the discrete policy achieves significant performance gains with state-of-the-art on-policy optimization algorithms (PPO, TRPO, ACKTR), especially on high-dimensional tasks with complex dynamics. Additionally, we show that an ordinal parameterization of the discrete distribution can introduce an inductive bias that encodes the natural ordering between discrete actions. This ordinal architecture further significantly improves the performance of PPO/TRPO.
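A minimal sketch of the factorized discrete policy follows. It assumes 11 evenly spaced bins per dimension on [-1, 1] and a 17-dimensional action space, both chosen only for illustration. The `ordinal_logits` transform is one possible way to couple neighbouring bins and is an assumption of this sketch; the abstract does not specify the exact form. With these numbers the factorized policy needs 17 × 11 = 187 logits, while a joint categorical distribution would need 11^17 entries.

```python
import math
import random

def bin_centers(low, high, k):
    """K evenly spaced atomic actions covering [low, high] for one action dimension."""
    return [low + (high - low) * i / (k - 1) for i in range(k)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))

def ordinal_logits(raw):
    """Assumed ordinal transform: bin i is favoured when raw[j] is high for
    j <= i and low for j > i, so probability mass tends to be unimodal
    over the ordered bins."""
    return [
        sum(log_sigmoid(r) for r in raw[: i + 1])
        + sum(log_sigmoid(-r) for r in raw[i + 1:])
        for i in range(len(raw))
    ]

def sample_action(logits_per_dim, centers, rng):
    """Sample every dimension independently; the joint log-prob is the sum."""
    action, log_prob = [], 0.0
    for logits in logits_per_dim:
        probs = softmax(logits)
        idx = rng.choices(range(len(probs)), weights=probs)[0]
        action.append(centers[idx])
        log_prob += math.log(probs[idx])
    return action, log_prob

rng = random.Random(0)
dims, k = 17, 11  # hypothetical action dimension and bins per dimension
centers = bin_centers(-1.0, 1.0, k)
# Stand-in for network outputs: one vector of k logits per action dimension.
logits = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(dims)]
action, log_prob = sample_action(logits, centers, rng)
print(dims * k, k ** dims)  # factorized parameter count vs joint action count
```

To use the ordinal variant, pass each dimension's raw outputs through `ordinal_logits` before calling `sample_action`.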

Learning to Score Behaviors for Guided Policy Optimization

Pacchiano, Parker-Holder, Tang et al., 2019

Preprint

Exploration by Distributional Reinforcement Learning

Tang and Agrawal, 2018

Provably Robust Blackbox Optimization for Reinforcement Learning

Choromański, Pacchiano, Parker-Holder et al., 2019

Preprint

Interest in derivative-free optimization (DFO) and "evolutionary strategies" (ES) has recently surged in the Reinforcement Learning (RL) community, with growing evidence that they can match state-of-the-art methods for policy optimization problems in robotics. However, it is well known that DFO methods suffer from prohibitively high sampling complexity. They can also be very sensitive to noisy rewards and stochastic dynamics. In this paper, we propose a new class of algorithms, called Robust Blackbox Optimization (RBO). Remarkably, even if up to 23% of all the measurements are arbitrarily corrupted, RBO can provably recover gradients to high accuracy. RBO relies on learning gradient flows using robust regression methods to enable off-policy updates. On several MuJoCo robot control tasks, when all other RL approaches collapse in the presence of adversarial noise, RBO is able to train policies effectively. We also show that RBO can be applied to legged locomotion tasks including path tracking for quadruped robots. * Equal contribution. Preprint. Under review.
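The effect of robust regression on corrupted measurements can be sketched on a toy problem. The gradient is estimated by regressing reward differences on random perturbations, once with ordinary least squares and once with least absolute deviations, approximated here by iteratively reweighted least squares (IRLS). IRLS is a stand-in chosen for this sketch, not necessarily the solver used in the paper. The quadratic blackbox, the 20% corruption rate, and all names are likewise illustrative assumptions.

```python
import random

def solve(a, b):
    """Solve the small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def weighted_least_squares(xs, ys, ws):
    """Minimise sum_i w_i * (y_i - x_i . g)^2 via the normal equations."""
    d = len(xs[0])
    a = [[sum(w * x[i] * x[j] for x, w in zip(xs, ws)) for j in range(d)]
         for i in range(d)]
    b = [sum(w * x[i] * y for x, y, w in zip(xs, ys, ws)) for i in range(d)]
    return solve(a, b)

def irls_l1(xs, ys, iters=50, eps=1e-8):
    """Approximate least-absolute-deviations regression by reweighting residuals."""
    g = weighted_least_squares(xs, ys, [1.0] * len(ys))
    for _ in range(iters):
        resid = [abs(y - sum(xi * gi for xi, gi in zip(x, g)))
                 for x, y in zip(xs, ys)]
        g = weighted_least_squares(xs, ys, [1.0 / max(r, eps) for r in resid])
    return g

rng = random.Random(0)
center = [1.0, -2.0, 0.5]
f = lambda th: -sum((t - c) ** 2 for t, c in zip(th, center))  # toy blackbox
theta = [0.0, 0.0, 0.0]
true_grad = [-2.0 * (t - c) for t, c in zip(theta, center)]

sigma, n = 0.01, 60
deltas = [[sigma * rng.gauss(0.0, 1.0) for _ in theta] for _ in range(n)]
# Forward differences: y_i is approximately grad . delta_i for small sigma.
ys = [f([t + d for t, d in zip(theta, delta)]) - f(theta) for delta in deltas]
for i in rng.sample(range(n), 12):  # corrupt 20% of the measurements
    ys[i] = 1.0

least_squares = weighted_least_squares(deltas, ys, [1.0] * n)
robust = irls_l1(deltas, ys)
err = lambda g: max(abs(a - b) for a, b in zip(g, true_grad))
print(err(least_squares), err(robust))
```

In this setup the corrupted values are far larger than the clean differences, so the least-squares fit is pulled away from the true gradient while the reweighted fit largely ignores the outliers.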
