2020
DOI: 10.48550/arxiv.2006.01610
Preprint

Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization

Abstract: Combinatorial optimization has found applications in numerous fields, from aerospace to transportation planning and economics. The goal is to find an optimal solution among a finite set of possibilities. The well-known challenge one faces with combinatorial optimization is the state-space explosion problem: the number of possibilities grows exponentially with the problem size, which makes solving intractable for large problems. In recent years, deep reinforcement learning (DRL) has shown its promise for desi…
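The state-space explosion the abstract mentions can be made concrete on the TSP: the number of distinct undirected tours over n cities is (n-1)!/2, so even tiny instances explode. A minimal Python sketch (illustrative only, not from the paper) that counts tours by brute-force enumeration:

```python
from itertools import permutations

def count_tours(n: int) -> int:
    """Count distinct undirected TSP tours over n cities, fixing the
    start at city 0 and identifying a tour with its reversal.
    The result matches the closed form (n-1)!/2."""
    tours = set()
    for perm in permutations(range(1, n)):
        # canonical form: pick the lexicographically smaller direction
        canon = min(perm, perm[::-1])
        tours.add((0,) + canon)
    return len(tours)

# grows as (n-1)!/2: 3, 60, 2520 for n = 4, 6, 8
for n in (4, 6, 8):
    print(n, count_tours(n))
```

Brute-force enumeration is already impractical around n = 15, which is exactly why the paper turns to learning-guided search.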

Cited by 8 publications (9 citation statements)
References 34 publications
“…While most research has focused on instances up to 100 nodes, some works have attempted scaling to larger instances, which remains challenging (Ma et al., 2019; Fu et al., 2020). Related to our approach, Cappart et al. (2020) propose to combine reinforcement learning, constraint programming and dynamic programming, and experiment with the TSP with time windows. For surveys of machine learning for routing problems and combinatorial optimization in general, we refer to Mazyavkina et al. (2020); Vesselinova et al. (2020).…”
Section: Machine Learning for Vehicle Routing Problems
confidence: 99%
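The hybrid Cappart et al. describe builds on a dynamic-programming formulation of the TSP with time windows. As a point of reference only (this is an illustrative exact DP sketch, not the paper's RL/CP hybrid; the function name `tsptw_dp` and the minimize-completion-time objective are choices made here), a Held-Karp-style DP over (visited-set, last-city) states with time-window pruning looks like this:

```python
def tsptw_dp(dist, windows):
    """Exact dynamic program for the TSP with time windows over states
    (visited-set bitmask, last city). dist[i][j] is the travel time from
    i to j; windows[i] = (earliest, latest) service times; waiting for a
    window to open is allowed. Returns the minimum completion time of a
    tour starting and ending at city 0, or None if no tour is feasible."""
    n = len(dist)
    # best[(mask, j)] = earliest feasible arrival at j having visited mask;
    # with waiting allowed, earliest arrival dominates, so one value per state.
    best = {(1, 0): windows[0][0]}
    for mask in range(1, 1 << n):
        for j in range(n):
            if (mask, j) not in best:
                continue
            t = best[(mask, j)]
            for k in range(n):
                if mask & (1 << k):
                    continue  # city k already visited
                arrive = max(t + dist[j][k], windows[k][0])  # wait if early
                if arrive > windows[k][1]:
                    continue  # time-window constraint prunes this transition
                state = (mask | (1 << k), k)
                if arrive < best.get(state, float("inf")):
                    best[state] = arrive
    full = (1 << n) - 1
    ends = [best[(full, j)] + dist[j][0] for j in range(n) if (full, j) in best]
    return min(ends) if ends else None
```

The table of DP states is exactly the structure a learned policy can guide: instead of expanding all O(2^n · n) states, the hybrid approach uses the RL agent to rank which transitions to explore first.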
“…The references [28,29] provide exact solutions to the problem using constraint programming. A more recent study combining constraint programming and reinforcement learning is presented in [30].…”
Section: TSP Problem and Its Variants
confidence: 99%
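The exact constraint-programming approaches referenced here ([28, 29]) rely on backtracking search with propagation. A toy depth-first sketch in that spirit (illustrative only; `tsptw_backtrack` is a name chosen here, and it assumes a completion-time objective, pruning on window violations plus a simple incumbent bound):

```python
def tsptw_backtrack(dist, windows):
    """Depth-first backtracking for the TSP with time windows, in the
    spirit of constraint-programming search: extend a partial tour city
    by city and prune any extension that violates a time window or
    cannot beat the best complete tour found so far.
    Returns (best completion time, tour) or (None, None) if infeasible."""
    n = len(dist)
    best = [float("inf"), None]  # [incumbent cost, incumbent tour]

    def extend(tour, time):
        if len(tour) == n:
            total = time + dist[tour[-1]][0]  # close the tour at city 0
            if total < best[0]:
                best[0], best[1] = total, tour[:]
            return
        for k in range(n):
            if k in tour:
                continue
            arrive = max(time + dist[tour[-1]][k], windows[k][0])
            # pruning: violated window, or dominated by the incumbent bound
            if arrive > windows[k][1] or arrive >= best[0]:
                continue
            extend(tour + [k], arrive)

    extend([0], windows[0][0])
    return (best[0], best[1]) if best[1] is not None else (None, None)
```

A real CP solver adds far stronger propagation (e.g. filtering successor domains), but the skeleton above shows the branch-and-prune structure that [30] augments with a learned branching heuristic.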
“…Pointer network [66] | Supervised, Approximation
GCN + Search [31] | Supervised, Approximation
Q-Learning + GNN [19] | Model-free, Value-based
Hierarchical RL + GAT [44] | Model-free, Policy-based
REINFORCE + LSTM with attention [47] | Model-free, Policy-based
REINFORCE + attention [20] | Model-free, Policy-based
RL + GAT [36] | Model-free, Policy-based
DDPG [23] | Model-free, Policy-based
REINFORCE + Pointer network [10] | Model-free, Policy-based
RL + NN [45] | Model-free, Actor-Critic
RL + GAT [14] | Model-free, Actor-Critic
AlphaZero: MCTS + GCN [51] | Model-based, Given model

Knapsack Problem:
REINFORCE + Pointer network [10] | Model-free, Policy-based

Bin Packing Problem (BPP):
REINFORCE + LSTM [29] | Model-free, Policy-based
AlphaZero: MCTS + NN [38] | Model-based, Given model

Job Scheduling Problem (JSP):
RL + LSTM [16] | Model-free, Actor-Critic

Vehicle Routing Problem (VRP):
REINFORCE + LSTM with attention [47] | Model-free, Policy-based
RL + LSTM [16] | Model-free, Policy-based
RL + GAT [36] | Model-free, Policy-based
RL + NN [43] | Model-free, Policy-based
RL + GAT [25] | Model-free, Actor-Critic

Global Routing:
DQN + MLP [40] | Model-free, Value-based

Highest Safe Rung (HSR):
AlphaZero: MCTS + CNN [71] | Model-based, Given model

Table 3: Classification of ML approaches for NP-hard combinatorial optimization by problem, method, and type.…”
Section: NP-hard Problem
confidence: 99%
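Most rows in the table above are "model-free, policy-based" methods built on the REINFORCE estimator. Its core update can be shown on a toy two-armed bandit (an illustrative sketch, unrelated to the specific cited architectures; names and hyperparameters are choices made here):

```python
import math
import random

def reinforce_bandit(rewards, steps=5000, lr=0.1, seed=0):
    """Minimal REINFORCE on a two-armed bandit: a softmax policy over two
    logits, updated with the score-function (policy-gradient) estimator
    grad log pi(a) = one_hot(a) - probs, scaled by the observed reward.
    Returns the final action probabilities."""
    rng = random.Random(seed)
    logits = [0.0, 0.0]
    probs = [0.5, 0.5]
    for _ in range(steps):
        # softmax policy (shifted by the max logit for numerical stability)
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        probs = [e / z for e in exps]
        # sample an action, observe its (deterministic, for simplicity) reward
        a = 0 if rng.random() < probs[0] else 1
        r = rewards[a]
        # policy-gradient ascent step on both logits
        for i in range(2):
            logits[i] += lr * r * ((1.0 if i == a else 0.0) - probs[i])
    return probs
```

With rewards of 0 for arm 0 and 1 for arm 1, the policy concentrates nearly all probability mass on arm 1; the neural approaches in the table apply the same update with network parameters in place of raw logits.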