Wen Song scite author profile

Recently, there has been an emerging trend of applying deep reinforcement learning to the vehicle routing problem (VRP), where a learnt policy governs the selection of the next node to visit. However, existing methods do not handle well the pairing and precedence relationships in the pickup and delivery problem (PDP), a representative variant of VRP. To address this issue, we leverage a novel neural network integrated with a heterogeneous attention mechanism to empower the policy in deep reinforcement learning to automatically select the nodes. In particular, the heterogeneous attention mechanism prescribes attentions for each role of the nodes while taking into account the precedence constraint, i.e., the pickup node must precede its paired delivery node. Further integrated with a masking scheme, the learnt policy is expected to find higher-quality solutions to PDP. Extensive experimental results show that our method outperforms both the state-of-the-art heuristic and the state-of-the-art deep learning model, and generalizes well to different distributions and problem sizes.
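The precedence constraint and masking scheme mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `feasible_mask` and the node indexing (depot at 0, pickups at 1..n, delivery for pickup i at i + n) are assumptions made only for this example.

```python
from typing import List, Set


def feasible_mask(n_pairs: int, visited: Set[int]) -> List[bool]:
    """Return a list where mask[i] is True if node i may be visited next.

    Assumed indexing (hypothetical, for illustration only):
    node 0 is the depot, nodes 1..n_pairs are pickups, and
    node i + n_pairs is the delivery paired with pickup i.
    """
    total = 2 * n_pairs + 1
    mask = [False] * total
    for node in range(1, total):
        if node in visited:
            continue  # a node is never visited twice
        if node <= n_pairs:
            mask[node] = True  # any unvisited pickup is feasible
        else:
            # a delivery is feasible only after its paired pickup
            mask[node] = (node - n_pairs) in visited
    # returning to the depot is allowed only once every request is served
    mask[0] = len(visited - {0}) == 2 * n_pairs
    return mask


print(feasible_mask(2, {1}))
```

In a learnt policy, a mask of this kind would typically be applied to the selection scores before sampling the next node, so infeasible nodes receive zero probability.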


Learning to Solve Multiple-TSP With Time Window and Rejections via Deep Reinforcement Learning

Zhang

Cao

et al. 2023

IEEE Trans. Intell. Transport. Syst.


We propose a manager-worker framework based on deep reinforcement learning to tackle a hard yet nontrivial variant of Travelling Salesman Problem (TSP), i.e., multiple-vehicle TSP with time window and rejections (mTSPTWR), where customers who cannot be served before the deadline are subject to rejections. Particularly, in the proposed framework, a manager agent learns to divide mTSPTWR into sub-routing tasks by assigning customers to each vehicle via a Graph Isomorphism Network (GIN) based policy network. A worker agent learns to solve sub-routing tasks by minimizing the cost in terms of both tour length and rejection rate for each vehicle, the maximum of which is then fed back to the manager agent to learn better assignments. Experimental results demonstrate that the proposed framework outperforms strong baselines in terms of higher solution quality and shorter computation time. More importantly, the trained agents also achieve competitive performance for solving unseen larger instances.
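The feedback signal described in this abstract (a per-vehicle cost combining tour length and rejection rate, with the maximum passed back to the manager) might be sketched as follows. The abstract does not state how the two terms are combined, so the weighted sum, the `penalty` weight, and the function names here are assumptions for illustration only.

```python
from typing import List, Tuple


def vehicle_cost(tour_length: float, rejected: int, assigned: int,
                 penalty: float = 10.0) -> float:
    """Cost of one vehicle's sub-route.

    Assumption: cost = tour length + penalty * rejection rate.
    The actual combination used in the paper is not given in the abstract.
    """
    rate = rejected / assigned if assigned else 0.0
    return tour_length + penalty * rate


def manager_feedback(per_vehicle: List[Tuple[float, int, int]],
                     penalty: float = 10.0) -> float:
    """Return the maximum per-vehicle cost, the value fed back to the manager.

    Each tuple is (tour_length, rejected_customers, assigned_customers).
    """
    return max(vehicle_cost(length, rej, asg, penalty)
               for length, rej, asg in per_vehicle)


# Two vehicles: the second rejects 1 of its 4 assigned customers.
print(manager_feedback([(4.0, 0, 5), (3.0, 1, 4)]))
```

Using the maximum rather than the sum pushes the manager toward assignments that balance the load, since the worst vehicle alone determines the signal.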


Homotopy Method for a General Multiobjective Programming Problem

Song

Yao

2008

J Optim Theory Appl


Dual-Aspect Self-Attention Based on Transformer for Remaining Useful Life Prediction

Zhang

Song

2022

IEEE Trans. Instrum. Meas.


Deep Reinforcement Learning for Solving the Heterogeneous Capacitated Vehicle Routing Problem

Gao

et al. 2022

IEEE Trans. Cybern.



Wen Song

Learning Improvement Heuristics for Solving Routing Problems

The Moreau envelope function and proximal mapping in the sense of the Bregman distance

Robust distributed optimization for energy dispatch of multi-stakeholder multiple microgrids under uncertainty

Heterogeneous Attentions for Solving Pickup and Delivery Problem via Deep Reinforcement Learning
