2018
DOI: 10.1609/aaai.v32i1.12211
Learning Robust Search Strategies Using a Bandit-Based Approach

Abstract: Effective solving of constraint problems often requires choosing good or specific search heuristics. However, choosing or designing a good search heuristic is non-trivial and is often a manual process. In this paper, rather than manually choosing/designing search heuristics, we propose the use of bandit-based learning techniques to automatically select search heuristics. Our approach is online where the solver learns and selects from a set of heuristics during search. The goal is to obtain automatic search heu…
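As a rough illustration of the idea in the abstract — learning online which of a set of search heuristics to apply — here is a minimal UCB1-style selector in Python. This is a sketch under stated assumptions, not the paper's actual algorithm: the heuristic names, the reward signal, and the choice of UCB1 are all illustrative.

```python
import math
import random


class BanditHeuristicSelector:
    """Minimal UCB1-style selector over a fixed set of search heuristics.

    Illustrative sketch only: the heuristic names, the reward definition,
    and the use of UCB1 are assumptions, not the paper's exact method.
    """

    def __init__(self, heuristics):
        self.heuristics = heuristics            # candidate heuristics (arms)
        self.counts = [0] * len(heuristics)     # times each arm was played
        self.values = [0.0] * len(heuristics)   # running mean reward per arm
        self.total = 0

    def select(self):
        # Play every arm once before applying the UCB1 rule.
        for i, c in enumerate(self.counts):
            if c == 0:
                return i
        ucb = [
            self.values[i] + math.sqrt(2 * math.log(self.total) / self.counts[i])
            for i in range(len(self.heuristics))
        ]
        return max(range(len(self.heuristics)), key=lambda i: ucb[i])

    def update(self, i, reward):
        # Incremental running-mean update for the chosen arm.
        self.counts[i] += 1
        self.total += 1
        self.values[i] += (reward - self.values[i]) / self.counts[i]


# Toy usage: the reward stands in for some measure of search progress
# (e.g., a function of pruning achieved); here it is simply simulated.
selector = BanditHeuristicSelector(["dom/wdeg", "activity", "impact"])
for _ in range(100):
    arm = selector.select()
    reward = random.random()   # placeholder for a solver-derived reward
    selector.update(arm, reward)
```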

Cited by 9 publications (4 citation statements)
References 16 publications (22 reference statements)
“…The table also shows the true ranks of predicted values. In general we see that the first three predicted ranks are close to the true first rank based on the average size of the dataset. For example, for DS on RCPSP the 1st predicted rank has true rank 706, which is 706/41730 ⋅ 100 ≈ 1.7% from the top rank.…”
Section: Evaluating Deepified Heuristics
confidence: 63%
“…Various variable ordering heuristics have been designed by human experts [2][3][4]. Recent work also acquires dedicated heuristics using machine learning (ML), or learns which of a given set of heuristics to use [5][6][7][8]. However, both classical and learned heuristics are based on the current search node.…”
Section: Introduction
confidence: 99%
“…Rewards updated through ERWA were also used to adaptively select a backtracking strategy in [Bac+15]. Furthermore, MAB frameworks were used to select a search heuristic among a set of candidate ones at each node of the search tree in [XY18] or at each restart in [Wat+20; Kor+22]. Simple bandit-driven perturbation strategies to incorporate random choices in constraint solving with restarts were also introduced and evaluated in [PW20].…”
Section: Minimax Optimal Strategy In the Stochastic Case (MOSS) [Ab09]
confidence: 99%
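The ERWA (exponential recency-weighted average) updates mentioned in the statement above use a constant step size, so older rewards decay geometrically and recent observations dominate the estimate. The sketch below shows that update rule; the step size and reward values are illustrative assumptions, not taken from the cited works.

```python
def erwa_update(q, reward, alpha=0.1):
    """Exponential recency-weighted average: constant step-size update.

    q      -- current value estimate for the arm
    reward -- newly observed reward
    alpha  -- constant step size (0 < alpha <= 1); illustrative default

    With constant alpha, a reward observed k steps ago carries weight
    alpha * (1 - alpha) ** k, so recent rewards dominate the estimate.
    """
    return q + alpha * (reward - q)


# Toy usage: the estimate tracks a drifting reward signal.
q = 0.0
for r in [1.0, 1.0, 0.0, 1.0, 0.0, 0.0]:
    q = erwa_update(q, r)
print(round(q, 3))
```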
“…This question naturally calls for a "bandit" approach, as recently advocated in [Xia and Yap, 2018; Wattez et al., 2020]. Multi-armed bandit problems are sequential decision tasks in which the learning algorithm has access to a set of arms, and observes the reward for the chosen arm after each trial.…”
Section: Introduction
confidence: 99%
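To make the bandit setting described in the statement above concrete — one arm chosen per trial, and only that arm's reward observed — here is a minimal epsilon-greedy loop. The arm reward probabilities and the epsilon value are made up for the example and are not drawn from any of the cited papers.

```python
import random


def epsilon_greedy(n_trials=1000, epsilon=0.1):
    # Three arms with hidden Bernoulli success probabilities (assumed values).
    probs = [0.2, 0.5, 0.8]
    counts = [0] * len(probs)
    values = [0.0] * len(probs)

    for _ in range(n_trials):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if random.random() < epsilon:
            arm = random.randrange(len(probs))
        else:
            arm = max(range(len(probs)), key=lambda i: values[i])
        # Only the chosen arm's reward is observed.
        reward = 1.0 if random.random() < probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]

    return values


print(epsilon_greedy())
```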