Recent studies in using deep learning to solve the Travelling Salesman Problem (TSP) focus on construction heuristics, the solution of which may still be far from optimality. To improve solution quality, additional procedures such as sampling or beam search are required. However, they are still based on the same construction policy, which is less effective in refining a solution. In this paper, we propose to directly learn the improvement heuristics for solving TSP based on deep reinforcement learning. We first present a reinforcement learning formulation for the improvement heuristic, where the policy guides selection of the next solution. Then, we propose a deep architecture as the policy network based on self-attention. Extensive experiments show that, improvement policies learned by our approach yield better results than state-of-the-art methods, even from random initial solutions. Moreover, the learned policies are more effective than the traditional hand-crafted ones, and robust to different initial solutions with either high or poor quality.
Recent deep models for solving routing problems always assume a single distribution of nodes for training, which severely impairs their cross-distribution generalization ability. In this paper, we exploit group distributionally robust optimization (group DRO) to tackle this issue, where we jointly optimize the weights for different groups of distributions and the parameters for the deep model in an interleaved manner during training. We also design a module based on convolutional neural network, which allows the deep model to learn more informative latent pattern among the nodes. We evaluate the proposed approach on two types of well-known deep models including GCN and POMO. The experimental results on the randomly synthesized instances and the ones from two benchmark dataset (i.e., TSPLib and CVRPLib) demonstrate that our approach could significantly improve the cross-distribution generalization performance over the original models.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.