Generative adversarial imitation learning (GAIL) has attracted increasing attention in the field of robot learning. It enables a robot to learn a policy for a task demonstrated by an expert while simultaneously estimating the reward function underlying the expert's behavior. However, this framework is limited to learning a single task with a single reward function. This study proposes an extended framework, situated GAIL (S-GAIL), in which a task variable is introduced into both the discriminator and the generator of the GAIL framework. The task variable discriminates between different contexts, allowing the framework to learn a distinct reward function and policy for each of multiple tasks. To achieve early convergence and robust reward estimation, we introduce a term that adjusts the entropy regularization coefficient in the generator's objective function. Our experiments in two setups (navigation in a discrete grid world and arm reaching in a continuous space) demonstrate that the proposed framework acquires multiple reward functions and policies more effectively than existing frameworks. The task variable enables our framework to differentiate between contexts while sharing common knowledge among tasks.
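The core architectural change is small enough to sketch in code. The PyTorch fragment below is a minimal illustration, under our own assumptions about layer sizes, naming, and the annealing schedule, of how a task variable can be fed to both the discriminator and the generator; it is not the authors' implementation.

```python
# Minimal S-GAIL-style sketch: both networks receive a task variable c
# in addition to the state (and action). Sizes and names are illustrative
# assumptions, not taken from the paper.
import torch
import torch.nn as nn

class TaskConditionedDiscriminator(nn.Module):
    """D(s, a, c): separates expert pairs from generated ones per context c."""
    def __init__(self, state_dim, action_dim, task_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + task_dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action, task):
        # Concatenating the task variable lets the network learn a
        # separate reward surface for each context.
        x = torch.cat([state, action, task], dim=-1)
        return torch.sigmoid(self.net(x))

class TaskConditionedPolicy(nn.Module):
    """pi(a | s, c): one generator shared across tasks, switched by c."""
    def __init__(self, state_dim, action_dim, task_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + task_dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, state, task):
        return self.net(torch.cat([state, task], dim=-1))

def entropy_coefficient(step, beta0=1.0, decay=0.999):
    # Hypothetical schedule: the abstract says the entropy regularization
    # coefficient in the generator's objective is adjusted during training;
    # an exponential decay is one simple stand-in for such a term.
    return beta0 * decay ** step
```

Because the two networks share all weights across tasks and differ only in the task input, common structure among tasks is reused while the contexts stay distinguishable, which matches the role the abstract assigns to the task variable.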
This article concerns the car-following behavior of drivers of heavy-duty vehicles, analyzed using a driver model. We first focus on how drivers obtain the information they use to control their vehicles longitudinally. The model describes two kinds of driver control: feed-forward and feed-back. Using multiple regression analysis, a feed-forward model was constructed, taking the time delay and the cut-off frequency of the information into account. A feed-back model was also constructed for the information that feed-forward control cannot describe. Combining the feed-forward and feed-back models yields a longitudinal (fore-and-aft) control model, and we found that this driver model describes actual driver behavior well. Finally, the major factors involved in conveying control information to the driver are clarified using factor analysis.
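The combined structure can be illustrated with a toy simulation. The Python sketch below pairs a delayed, low-pass-filtered feed-forward term (the first-order lag standing in for the cut-off frequency) with a feed-back correction on the spacing error; the gains, delay, and time constant are illustrative assumptions, not values estimated in the study.

```python
# Toy longitudinal car-following controller combining the two control
# paths described above. All parameter values are illustrative.
import numpy as np

def follow(lead_speed, dt=0.1, delay_s=0.5, tau=1.0,
           k_ff=0.8, k_gap=0.3, desired_gap=30.0):
    n = len(lead_speed)
    delay = int(delay_s / dt)
    v = np.zeros(n)                    # follower speed [m/s]
    gap = np.full(n, desired_gap)      # inter-vehicle gap [m]
    ff = 0.0                           # low-pass-filtered feed-forward signal
    for t in range(1, n):
        # Feed-forward: delayed, low-pass-filtered relative speed.
        rel = lead_speed[max(t - delay, 0)] - v[t - 1]
        ff += (dt / tau) * (rel - ff)  # first-order lag, cut-off ~ 1/tau
        # Feed-back: correct the spacing error the feed-forward path misses.
        fb = k_gap * (gap[t - 1] - desired_gap)
        accel = k_ff * ff + fb
        v[t] = v[t - 1] + accel * dt
        gap[t] = gap[t - 1] + (lead_speed[t] - v[t]) * dt
    return v, gap

# Example: with a constant lead speed of 20 m/s and these toy gains,
# the follower settles at the lead speed and the desired gap.
v, gap = follow(np.full(600, 20.0))
```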
Ryo IWAKI, Nonmember, Hiroki YOKOYAMA and Minoru ASADA, Members
SUMMARY: The step size is a parameter of fundamental importance in learning algorithms, particularly for natural policy gradient (NPG) methods. We derive an upper bound on the step size for incremental NPG estimation and propose an adaptive step size that implements the derived bound. The proposed adaptive step size guarantees that an updated parameter does not overshoot the target, which is achieved by weighting the learning samples according to their relative importance. We also provide tight upper and lower bounds on the step size, though they are not suitable for incremental learning. We confirm the usefulness of the proposed step size on classical benchmarks. To the best of our knowledge, this is the first adaptive step size method for NPG estimation.
Key words: reinforcement learning, natural policy gradient, adaptive step size
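The overshoot-prevention idea can be sketched in a few lines. The bound below, which caps the update length measured in the Fisher metric, is an illustrative stand-in for the paper's derived upper bound; in particular, it omits the sample weighting the abstract describes, and the names and the regularization constant are our own assumptions.

```python
# Minimal sketch of a bounded-step natural policy gradient update.
# `max_step` plays the role of an upper bound that keeps the update
# from overshooting; it is not the paper's derived formula.
import numpy as np

def npg_update(theta, grad, fisher, max_step=0.1, reg=1e-3):
    # Natural gradient direction: F^{-1} g (regularized for stability).
    nat_grad = np.linalg.solve(fisher + reg * np.eye(len(theta)), grad)
    # Step length in the Fisher metric: sqrt(g^T F^{-1} g).
    metric_norm = np.sqrt(max(grad @ nat_grad, 1e-12))
    # Shrink the step whenever it would exceed the bound, so the
    # updated parameter cannot overshoot the target.
    alpha = min(1.0, max_step / metric_norm)
    return theta + alpha * nat_grad
```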