Navigating Occluded Intersections with Autonomous Vehicles using Deep Reinforcement Learning

Isele, David; Rahimi, Reza; Cosgun, Akansel; Subramanian, K.A.; Fujimura, Kikuo

doi:10.48550/arxiv.1705.01196

Cited by 13 publications

(18 citation statements)

References 0 publications

“…We evaluated CM3 on the problem of learning cooperative policies for negotiating lane changes among multiple autonomous vehicles in the Simulation of Urban Mobility (SUMO) traffic simulator [11]. While previous work have applied reinforcement learning to autonomous driving tasks in simulation [9,12,20], they modeled the problem as a single-agent MDP, in which other vehicles behave according to hand-designed policies without the capacity for strategic response to the learning agent. However, driving in real-world traffic must involve deliberate cooperation among interacting vehicles who have different individual intentions (e.g.…”

Section: Methods (mentioning)

confidence: 99%

CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning

Yang, Nakhaei, Isele et al. 2018

Preprint

Self Cite


We propose CM3, a new deep reinforcement learning method for cooperative multi-agent problems where agents must coordinate for joint success in achieving different individual goals. We restructure multi-agent learning into a two-stage curriculum, consisting of a single-agent stage for learning to accomplish individual tasks, followed by a multi-agent stage for learning to cooperate in the presence of other agents. These two stages are bridged by modular augmentation of neural network policy and value functions. We further adapt the actor-critic framework to this curriculum by formulating local and global views of the policy gradient and learning via a double critic, consisting of a decentralized value function and a centralized action-value function. We evaluated CM3 on a new high-dimensional multi-agent environment with sparse rewards: negotiating lane changes among multiple autonomous vehicles in the Simulation of Urban Mobility (SUMO) traffic simulator. Detailed ablation experiments show the positive contribution of each component in CM3, and the overall synthesis converges significantly faster to higher performance policies than existing cooperative multi-agent methods.

“…Pac-man and Enduro in Fig. 15) and real-world motion planning tasks [31] [51]. DQN utilizes CNN to approximate Q values (Fig.…”

Section: Nature Deep Q-learning Network (mentioning)

confidence: 99%

A review of motion planning algorithms for intelligent robots

2021


Principles of typical motion planning algorithms are investigated and analyzed in this paper. These algorithms include traditional planning algorithms, classical machine learning algorithms, optimal value reinforcement learning, and policy gradient reinforcement learning. Traditional planning algorithms investigated include graph search algorithms, sampling-based algorithms, interpolating curve algorithms, and reaction-based algorithms. Classical machine learning algorithms include multiclass support vector machine, long short-term memory, Monte-Carlo tree search and convolutional neural network. Optimal value reinforcement learning algorithms include Q learning, deep Q-learning network, double deep Q-learning network, dueling deep Q-learning network. Policy gradient algorithms include policy gradient method, actor-critic algorithm, asynchronous advantage actor-critic, advantage actor-critic, deterministic policy gradient, deep deterministic policy gradient, trust region policy optimization and proximal policy optimization. New general criteria are also introduced to evaluate the performance and application of motion planning algorithms by analytical comparisons. The convergence speed and stability of optimal value and policy gradient algorithms are specially analyzed. Future directions are presented analytically according to principles and analytical comparisons of motion planning algorithms. This paper provides researchers with a clear and comprehensive understanding about advantages, disadvantages, relationships, and future of motion planning algorithms in robots, and paves ways for better motion planning algorithms in academia, engineering, and manufacturing.


“…Road users simulation is essential to the development of maneuver decision making modules for automated vehicles. In [12] a system able to enter in an intersection is trained while other vehicles followed a deterministic model called Intelligent Driver Model (IDM, [27]). In [16] a lane change maneuver module is learned using DRL in a scenario where other vehicles follow a simple lane keeping behavior with collision avoidance, while in [14] they are also able to overtake relying on hard-coded rules.…”

Section: Related Work (mentioning)

confidence: 99%

“…Typical solutions ([28], [4]) for handling those particular maneuvers consist on rule-based methods which use some notion of the time-to-collision ([29]), so that they will be executed only if there is enough time in the worst case scenario. These solutions lead to excessively cautious behaviors due to the lack of interpretation of the situation, and suggested the use of machine learning approaches, such as Partially Observable Markov Decision Processes ([17]) or Deep Learning techniques ([12]), in order to infer intentions of other drivers. However, training machine learning algorithms of this kind typically requires simulated environments, and so the behavioral simulation of other drivers plays an important role.…”

Section: Introduction (mentioning)

confidence: 99%

Microscopic Traffic Simulation by Cooperative Multi-agent Deep Reinforcement Learning

Bacchiani, Molinari, Patander 2019

Preprint


Expert human drivers perform actions relying on traffic laws and their previous experience. While traffic laws are easily embedded into an artificial brain, modeling human complex behaviors which come from past experience is a more challenging task. One of these behaviors is the capability of communicating intentions and negotiating the right of way through driving actions, as when a driver is entering a crowded roundabout and observes other cars movements to guess the best time to merge in. In addition, each driver has its own unique driving style, which is conditioned by both its personal characteristics, such as age and quality of sight, and external factors, such as being late or in a bad mood. For these reasons, the interaction between different drivers is not trivial to simulate in a realistic manner. In this paper, this problem is addressed by developing a microscopic simulator using a Deep Reinforcement Learning Algorithm based on a combination of visual frames, representing the perception around the vehicle, and a vector of numerical parameters. In particular, the algorithm called Asynchronous Advantage Actor-Critic has been extended to a multi-agent scenario in which every agent needs to learn to interact with other similar agents. Moreover, the model includes a novel architecture such that the driving style of each vehicle is adjustable by tuning some of its input parameters, permitting to simulate drivers with different levels of aggressiveness and desired cruising speeds.
