2022
DOI: 10.1109/tcyb.2021.3111082
Deep Reinforcement Learning for Solving the Heterogeneous Capacitated Vehicle Routing Problem

Cited by 55 publications (27 citation statements)
References 50 publications
“…Comparison studies investigating the properties of deep learning models for specific tasks, such as graph embedding and solution decoding for VRPs, are needed. [Survey table flattened in extraction; each reference is followed by the problem it addresses, and the reference for the first entry is truncated in the excerpt:] (ref. truncated): Online DVRP; Vera and Abad [59]: mCVRP; Zhang et al. [51]: mVRP with soft TW; Sykora et al. [62]: Multi-agent Mapping Problem; Lin et al. [95]: online ride-sharing; Falkner and Schmidt-Thieme [60]: mCVRPTW; Qin et al. [115]: VRP; Silva et al. [76]: mVRPTW; Van Knippenberg et al. [38]: mCVRP; Bogyrbayeva et al. [8]: mCVRP with charging; Bogyrbayeva et al. [116]: TSP with Drone; Chen et al. [99]: SDDPVD; Lin et al. [57]: mEVRPTW; Gutierrez-Rodríguez et al. [102]: mVRPTW; Bono et al. [117]: mDSCVRPTW, mSCVRPTW; Li et al. [68]: MMHCVRP, MSHCVRP. c) Incorporating Uncertainty and Online Routing: The power of learning models lies in their ability to generalize over the data distribution. This advantage can be exploited to incorporate dynamic elements into routing problems, which is difficult to do with traditional methods [78].…”
Section: The Future Research Directionsmentioning
confidence: 99%
“…The pseudocode of our ITS is presented in Algorithm 1. As for the training algorithm, we use rollout-based REINFORCE, which performs well in training Transformer-style DRL models [41], [42], [49]. The gradient of the loss function is calculated below:…”
Section: Interactive Training Strategymentioning
confidence: 99%
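The excerpt cuts off before the equation itself; for context, the rollout-baseline REINFORCE gradient used in this line of Transformer-style VRP models (Kool et al.-style) commonly takes the form below. This is a sketch of the standard formulation, not necessarily the exact equation in the citing paper.

```latex
\nabla_\theta \mathcal{L}(\theta \mid s)
  = \mathbb{E}_{\pi \sim p_\theta(\cdot \mid s)}
    \left[ \left( R(\pi) - R(\pi^{\mathrm{BL}}) \right)
           \nabla_\theta \log p_\theta(\pi \mid s) \right]
```

Here s is the problem instance, π is a solution sampled from the policy p_θ, R(·) is its cost (e.g., total route length), and π^BL is the solution obtained by a greedy rollout of the baseline policy; subtracting the baseline cost reduces the variance of the gradient estimate.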
“…Following the data generation in [26], [49], [50], we randomly sample the locations of the tasks and the depot from the uniform distribution U[0, 1]. Two scenarios with four and six UAVs are considered (named U4 and U6, respectively), and each scenario is divided into small-, medium-, and large-scale settings according to problem size.…”
Section: A Experiments Settingsmentioning
confidence: 99%
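As a rough illustration of that data-generation scheme, the sketch below samples depot and task coordinates uniformly from the unit square. The function name, the returned fields, and the concrete task count are assumptions for illustration; only the U[0, 1] sampling and the U4/U6 naming come from the excerpt.

```python
import numpy as np

def generate_instance(num_tasks, num_uavs, seed=None):
    """Sample a synthetic instance with depot and task locations drawn
    uniformly from [0, 1]^2, as in the data-generation scheme above.
    (Hypothetical helper -- not the citing paper's actual code.)"""
    rng = np.random.default_rng(seed)
    depot = rng.uniform(0.0, 1.0, size=2)               # depot location in the unit square
    tasks = rng.uniform(0.0, 1.0, size=(num_tasks, 2))  # one (x, y) pair per task
    return {"depot": depot, "tasks": tasks, "num_uavs": num_uavs}

# Example: a U4 scenario (four UAVs); the task count here is illustrative.
instance = generate_instance(num_tasks=40, num_uavs=4, seed=0)
```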
“…Different from conventional methods, this line of work aims to automatically search for heuristic policies by using neural networks to learn the underlying patterns in problem instances, which could help discover better policies than hand-crafted ones (Bengio, Lodi, and Prouvost 2021). To reduce the gap to highly optimized conventional heuristic solvers such as Concorde (Applegate et al. 2006) and LKH (Helsgaun 2000), a large number of deep models have been proposed to solve VRP variants, i.e., the traveling salesman problem (TSP) and the capacitated vehicle routing problem (CVRP) (Khalil et al. 2017; Kool, van Hoof, and Welling 2019; Chen and Tian 2019; Hottung and Tierney 2020; Ma et al. 2021; Wu et al. 2021; Kwon et al. 2020; Li et al. 2021; Xin et al. 2021b).…”
Section: Introductionmentioning
confidence: 99%