Combining Reinforcement Learning with Lin-Kernighan-Helsgaun Algorithm for the Traveling Salesman Problem

Zheng, Jiongzhi; He, Kun; Zhou, Jianrong; Jin, Yan; Li, Chu-Min

doi:10.1609/aaai.v35i14.17476

Cited by 42 publications

(17 citation statements)

References 25 publications

“…ML has been applied to solve a number of COPs, including traveling salesman problems (Xin et al, 2021;Zheng et al, 2021), vehicle routing (Kool et al, 2018), boolean satisfiability (Selsam et al, 2018;Amizadeh et al, 2018) and general graph optimization problems (Khalil et al, 2017;Li et al, 2018).…”

Section: Appendix A, Additional Related Work (mentioning)

confidence: 99%

Searching Large Neighborhoods for Integer Linear Programs with Contrastive Learning

Huang, Ferber, Tian et al. 2023

Preprint

Integer Linear Programs (ILPs) are powerful tools for modeling and solving a large number of combinatorial optimization problems. Recently, it has been shown that Large Neighborhood Search (LNS), as a heuristic algorithm, can find high quality solutions to ILPs faster than Branch and Bound. However, how to find the right heuristics to maximize the performance of LNS remains an open problem. In this paper, we propose a novel approach, CL-LNS, that delivers state-of-the-art anytime performance on several ILP benchmarks measured by metrics including the primal gap, the primal integral, survival rates and the best performing rate. Specifically, CL-LNS collects positive and negative solution samples from an expert heuristic that is slow to compute and learns a more efficient one with contrastive learning. We use graph attention networks and a richer set of features to further improve its performance.

“…Costa et al (2020) [19] proposed a 2-opt heuristic algorithm combined with deep RL for TSP, which essentially enhances the learning process of 2-opt. Combining the advantages of Q-learning (QL), Sarsa, and the Monte Carlo algorithm, Zheng et al (2020) [20] proposed a variable strategy reinforced (VSR) approach, optimized the k-opt process of LKH based on this, and designed VSR-LKH for TSP. Optimizing the parameters of the biased random-key genetic algorithm (BRKGA) by QL, Chaves et al (2021) [21] proposed a BRKGA-QL algorithm for TSP.…”

Section: Introduction (mentioning)

confidence: 99%

Dynamic sub-route-based self-adaptive beam search Q-learning algorithm for traveling salesman problem

2023

In this paper, a dynamic sub-route-based self-adaptive beam search Q-learning (DSRABSQL) algorithm is proposed that provides a reinforcement learning (RL) framework combined with local search to solve the traveling salesman problem (TSP). DSRABSQL builds upon the Q-learning (QL) algorithm. Considering its problems of slow convergence and low accuracy, four strategies within the QL framework are designed first: the weighting function-based reward matrix, the power function-based initial Q-table, a self-adaptive ε-beam search strategy, and a new Q-value update formula. Then, a self-adaptive beam search Q-learning (ABSQL) algorithm is designed. To solve the problem that the sub-route is not fully optimized in the ABSQL algorithm, a dynamic sub-route optimization strategy is introduced outside the QL framework, and then the DSRABSQL algorithm is designed. Experiments are conducted to compare QL, ABSQL, DSRABSQL, our previously proposed variable neighborhood discrete whale optimization algorithm, and two advanced reinforcement learning algorithms. The experimental results show that DSRABSQL significantly outperforms the other algorithms. In addition, two groups of algorithms are designed based on the QL and DSRABSQL algorithms to test the effectiveness of the five strategies. From the experimental results, it can be found that the dynamic sub-route optimization strategy and self-adaptive ε-beam search strategy contribute the most for small-, medium-, and large-scale instances. At the same time, collaboration exists between the four strategies within the QL framework, which increases with the expansion of the instance scale.

“…The meta-heuristics include discrete bat algorithm [13], adaptive ACO based on unique strategies [14], improved artificial bee colony algorithm [15], and discrete spider monkey optimization [16]. In addition, the advantages of deep learning based on combining it with TSP to solve the TSP problem [17,18].…”

Section: Introduction (mentioning)

confidence: 99%

A Meta-Heuristic Routing Optimization Framework for Reducing Traveling Time in Smart Cities

Hai-ying, Xie, Zhu 2023

Preprint

The growing number of cities and the rising urban population have had a large impact on smart cities, intelligent tourism, and intelligent logistics. We propose a meta-heuristic routing optimization framework (MROF) based on the flower pollination algorithm (FPA) to obtain the planned path with the shortest traveling time. First, we design an iteration-based Gaussian selection method to balance the search capability between global exploration and local exploitation. Then, we design two selection strategies based on Lévy flight and the Cauchy distribution to select neighboring structures, in which both self-pollination and cross-pollination require suitable updating strategies to improve search ability. Finally, a tabu strategy is integrated into the proposed method to increase its search ability. The experimental results show that the proposed method outperforms the state-of-the-art algorithms significantly.
