Eshagh Kargar scite author profile

Driving in a dynamic, multi-agent, and complex urban environment is a difficult task requiring a complex decision-making policy. Learning such a policy requires a state representation that can encode the entire environment. Mid-level representations that encode a vehicle's environment as images have become a popular choice. Still, they are quite high-dimensional, which limits their use in data-hungry approaches such as reinforcement learning. In this article, we propose to learn a low-dimensional and rich latent representation of the environment by leveraging the knowledge of relevant semantic factors. To do this, we train an encoder-decoder deep neural network to predict multiple application-relevant factors such as the trajectories of other agents and the ego car. Furthermore, we propose a hazard signal based on other vehicles' future trajectories and the planned route, which is used in conjunction with the learned latent representation as input to a downstream policy. We demonstrate that using the multi-head encoder-decoder neural network results in a more informative representation than a standard single-head model. In particular, the proposed representation learning and the hazard signal help reinforcement learning to learn faster, with increased performance and less data than baseline methods.


Learning Based High-Level Decision Making for Abortable Overtaking in Autonomous Vehicles

Malayjerdi, Alcan, Kargar et al. 2022

Preprint


MACRPO: Multi-Agent Cooperative Recurrent Policy Optimization

Kargar, Kyrki 2021

Preprint


This work considers the problem of learning cooperative policies in multi-agent settings with partially observable and non-stationary environments without a communication channel. We focus on improving information sharing between agents and propose a new multi-agent actor-critic method called Multi-Agent Cooperative Recurrent Proximal Policy Optimization (MACRPO). We propose two novel ways of integrating information across agents and time in MACRPO. First, we use a recurrent layer in the critic's network architecture and propose a new framework that uses a meta-trajectory to train the recurrent layer. This allows the network to learn the cooperation and dynamics of interactions between agents, and also to handle partial observability. Second, we propose a new advantage function that incorporates other agents' rewards and value functions. We evaluate our algorithm on three challenging multi-agent environments with continuous and discrete action spaces: Deepdrive-Zero, Multi-Walker, and Particle environment. We compare the results with several ablations, with state-of-the-art multi-agent algorithms such as QMIX and MADDPG, and with single-agent methods that share parameters between agents, such as IMPALA and APEX. The results show superior performance against other algorithms. The code is available online at https://github.com/kargarisaac/macrpo.


Increasing the Efficiency of Policy Learning for Autonomous Vehicles by Multi-Task Representation Learning

Kargar, Kyrki 2021

Preprint





Using an encoder-decoder convolutional neural network to predict the solid holdup patterns in a pseudo-2d fluidized bed
