It has been widely observed that there is no single "dominant" SAT solver; instead, different solvers perform best on different instances. Rather than following the traditional approach of choosing the best solver for a given class of instances, we advocate making this decision online on a per-instance basis. Building on previous work, we describe SATzilla, an automated approach for constructing per-instance algorithm portfolios for SAT that use so-called empirical hardness models to choose among their constituent solvers. This approach takes as input a distribution of problem instances and a set of component solvers, and constructs a portfolio that optimizes a given objective function (such as mean runtime, percentage of instances solved, or score in a competition). The excellent performance of SATzilla was independently verified in the 2007 SAT Competition, where our SATzilla07 solvers won three gold, one silver and one bronze medal. In this article, we go well beyond SATzilla07 by making the portfolio construction scalable and completely automated, and by improving it through the integration of local search solvers as candidate solvers, the prediction of performance score instead of runtime, and the use of hierarchical hardness models that take into account different types of SAT instances. We demonstrate the effectiveness of these new techniques through extensive experiments on data sets including instances from the most recent SAT competition.
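As a rough illustration of the selection mechanism described above, the sketch below fits one runtime model per solver and routes each instance to the solver with the lowest predicted runtime. The features, solver names, and plain least-squares models here are invented for illustration only; SATzilla itself uses much richer SAT features and more sophisticated regression models.

```python
# Sketch of per-instance algorithm selection in the spirit of SATzilla.
# Assumption: we use ordinary least squares on toy features purely for
# illustration; the actual system predicts log runtime from real SAT features.
import numpy as np

rng = np.random.default_rng(0)

# Toy training data: 200 instances, 3 features, 2 solvers whose (log) runtimes
# depend on the features differently -- so neither solver dominates.
X = rng.uniform(0, 1, size=(200, 3))
log_rt = {
    "solver_a": 2.0 * X[:, 0] + 0.1 * rng.normal(size=200),
    "solver_b": 2.0 * X[:, 1] + 0.1 * rng.normal(size=200),
}

# One empirical hardness model per solver: predict log runtime from features.
X1 = np.hstack([X, np.ones((len(X), 1))])          # add intercept column
models = {s: np.linalg.lstsq(X1, y, rcond=None)[0] for s, y in log_rt.items()}

def select_solver(features):
    """Pick the solver with the lowest predicted (log) runtime."""
    f1 = np.append(features, 1.0)
    preds = {s: float(f1 @ w) for s, w in models.items()}
    return min(preds, key=preds.get)

# An instance with large feature 0 should be routed away from solver_a.
print(select_solver(np.array([0.9, 0.1, 0.5])))    # expected: solver_b
print(select_solver(np.array([0.1, 0.9, 0.5])))    # expected: solver_a
```

The same scheme extends directly to other objectives: replace predicted runtime with predicted competition score and take the argmax instead.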
Perhaps surprisingly, it is possible to predict how long an algorithm will take to run on a previously unseen input, using machine learning techniques to build a model of the algorithm's runtime as a function of problem-specific instance features. Such models have important applications to algorithm analysis, portfolio-based algorithm selection, and the automatic configuration of parameterized algorithms. Over the past decade, a wide variety of techniques have been studied for building such models. Here, we describe extensions and improvements of existing models, new families of models, and, perhaps most importantly, a much more thorough treatment of algorithm parameters as model inputs. We also comprehensively describe new and existing features for predicting algorithm runtime for propositional satisfiability (SAT), travelling salesperson (TSP) and mixed integer programming (MIP) problems. We evaluate these innovations through the largest empirical analysis of its kind, comparing to a wide range of runtime modelling techniques from the literature. Our experiments consider 11 algorithms and 35 instance distributions; they also span a very wide range of SAT, MIP, and TSP instances, with the least structured having been generated uniformly at random and the most structured having emerged from real industrial applications. Overall, we demonstrate that our new models yield substantially better runtime predictions than previous approaches in terms of their generalization to new problem instances, to new algorithms from a parameterized space, and to both simultaneously.
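A minimal sketch of the core idea, with algorithm parameters treated as model inputs alongside instance features. Everything below is a stand-in: the data are synthetic, and a k-nearest-neighbour regressor is used only as a simple placeholder (the strongest models in this line of work are random forests).

```python
# Sketch of an empirical performance model over joint (instance feature,
# algorithm parameter) inputs. Assumptions: synthetic data and a k-NN
# regressor as a minimal stand-in for the real modelling techniques.
import math, random

random.seed(0)

def make_point():
    inst = [random.random() for _ in range(2)]   # instance features
    par = [random.random()]                      # one algorithm parameter
    # Synthetic log10 runtime: an interaction between instance and parameter.
    y = inst[0] + 2.0 * inst[1] * par[0]
    return inst + par, y

train = [make_point() for _ in range(300)]
test = [make_point() for _ in range(50)]

def knn_predict(x, k=5):
    """Average log runtime of the k nearest training points (Euclidean)."""
    near = sorted(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
    return sum(y for _, y in near[:k]) / k

rmse = math.sqrt(sum((knn_predict(x) - y) ** 2 for x, y in test) / len(test))
print(f"RMSE in log10 runtime: {rmse:.3f}")
```

Modelling runtime in log space, as here, is the standard choice because raw runtimes vary over orders of magnitude.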
[Results table omitted: extraction garbled a truncated table of per-model prediction-error and runtime statistics for solver–benchmark pairs (Minisat 2.0, CryptoMinisat, and SPEAR on HAND/RAND/INDU/SWV/IBM SAT sets; tnm and SAPS on RANDSAT; CPLEX, Gurobi, and SCIP on BIGMIX MIP instances); the column headers were lost.]
Abstract. It has been widely observed that there is no "dominant" SAT solver; instead, different solvers perform best on different instances. Rather than following the traditional approach of choosing the best solver for a given class of instances, we advocate making this decision online on a per-instance basis. Building on previous work, we describe SATzilla-07, a per-instance solver portfolio for SAT that uses so-called empirical hardness models to choose among its constituent solvers. We leverage new model-building techniques such as censored sampling and hierarchical hardness models, and demonstrate the effectiveness of our techniques by building a portfolio of state-of-the-art SAT solvers and evaluating it on several widely studied SAT data sets. Overall, we show that our portfolio significantly outperforms its constituent algorithms on every data set. Our approach has also proven effective in practice: in the 2007 SAT competition, SATzilla-07 won three gold medals, one silver, and one bronze; it is available online.
This paper presents a new supervised classification algorithm for remotely sensed hyperspectral images (HSIs) which integrates spectral and spatial information in a unified Bayesian framework. First, we formulate the HSI classification problem from a Bayesian perspective. Then, we adopt a convolutional neural network (CNN) to learn the posterior class distributions, using a patch-wise training strategy to better exploit the spatial information. Next, spatial information is further incorporated by placing a spatial smoothness prior on the labels. Finally, we iteratively update the CNN parameters using stochastic gradient descent and update the class labels of all pixel vectors using an α-expansion min-cut-based algorithm. Compared with other state-of-the-art methods, the proposed classification method achieves better performance on one synthetic data set and two benchmark HSI data sets in a number of experimental settings.
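The iterative label update described above can be sketched as follows. Note the assumptions: the paper optimizes the labels with α-expansion graph cuts, whereas this sketch uses iterated conditional modes (ICM) as a much simpler stand-in, and the "CNN posteriors" are synthetic.

```python
# Sketch of combining per-pixel class posteriors with a spatial smoothness
# (Potts) prior. Assumption: ICM replaces the paper's alpha-expansion
# min-cut optimizer, and the posterior map below is synthetic.
import numpy as np

rng = np.random.default_rng(1)

H, W, C = 8, 8, 2
# Synthetic "CNN" posteriors: left half favours class 0, right half class 1,
# plus noise so that some pixels are initially mislabelled.
post = np.full((H, W, C), 0.5)
post[:, : W // 2, 0] += 0.3
post[:, W // 2 :, 1] += 0.3
post += rng.normal(0, 0.25, size=post.shape)
post = np.clip(post, 1e-6, None)

beta = 1.0                      # strength of the smoothness prior
labels = post.argmax(axis=2)    # initial labels: per-pixel MAP, no prior

def icm_pass(labels):
    """One sweep of ICM: each pixel takes the label minimizing its energy."""
    new = labels.copy()
    for i in range(H):
        for j in range(W):
            best, best_e = None, None
            for c in range(C):
                e = -np.log(post[i, j, c])                 # data term
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H and 0 <= nj < W:
                        e += beta * (new[ni, nj] != c)     # Potts term
                if best_e is None or e < best_e:
                    best, best_e = c, e
            new[i, j] = best
    return new

for _ in range(5):
    labels = icm_pass(labels)
print(labels)
```

The smoothness prior cleans up isolated noise-driven labels; α-expansion plays the same role in the paper but with much stronger optimality guarantees.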
Abstract. Portfolio-based methods exploit the complementary strengths of a set of algorithms and, as evidenced in recent competitions, represent the state of the art for solving many NP-hard problems, including SAT. In this work, we argue that a state-of-the-art method for constructing portfolio-based algorithm selectors, SATzilla, also gives rise to an automated method for quantifying the importance of each of a set of available solvers. We entered a substantially improved version of SATzilla in the inaugural "analysis track" of the 2011 SAT competition, and draw two main conclusions from the results we obtained. First, automatically constructed portfolios of sequential, non-portfolio competition entries perform substantially better than the winners of all three sequential categories. Second, and more importantly, a detailed analysis of these portfolios yields valuable insights into the nature of successful solver designs in the different categories. For example, we show that the solvers contributing most to SATzilla were often not the overall best-performing solvers, but rather solvers that exploit novel solution strategies to solve instances that would otherwise remain unsolved.
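The solver-importance analysis described above can be illustrated with a drop-one comparison against the portfolio's virtual best solver (VBS). The runtime matrix below is made up; the point it illustrates is that a solver's standalone performance and its marginal contribution can diverge sharply.

```python
# Sketch of drop-one solver-importance analysis. Assumption: the runtime
# matrix is invented; "niche" is constructed to be weak standalone but
# essential to the portfolio.
runtimes = {
    "fast_generalist": [10, 12, 11, 300, 280],
    "steady":          [15, 14, 16, 250, 260],
    "niche":           [200, 210, 190, 5, 6],
}

def vbs_total(solvers):
    """Total runtime of the virtual best solver over the given solver set."""
    n = len(next(iter(runtimes.values())))
    return sum(min(runtimes[s][i] for s in solvers) for i in range(n))

all_solvers = set(runtimes)
for s in sorted(all_solvers):
    standalone = sum(runtimes[s])
    marginal = vbs_total(all_solvers - {s}) - vbs_total(all_solvers)
    print(f"{s:16s} standalone total = {standalone:4d}s, "
          f"VBS degradation if dropped = {marginal:4d}s")
```

Here "niche" has the worst standalone total on most instances, yet dropping it degrades the VBS far more than dropping either generalist, mirroring the paper's finding that the biggest contributors are often not the best overall solvers.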
Collaborative filtering recommender systems (CFRSs) are key components of successful e-commerce systems. However, their openness makes CFRSs highly vulnerable to profile-injection ("shilling") attacks. Moreover, since the number of attack profiles is far smaller than the number of genuine users, conventional supervised detection methods can be too "dull" to handle such imbalanced classification. In this paper, we improve detection performance in two ways. First, we extract well-designed features from user profiles based on the statistical properties of diverse attack models, making the hard classification task easier to perform. Second, drawing on the general ideas of re-scale Boosting (RBoosting) [29] and AdaBoost [9, 10], we apply a variant of AdaBoost, called re-scale AdaBoost (RAdaBoost), as our detection method on the extracted features. RBoosting has been shown, theoretically and experimentally, to outperform classical Boosting [17], and its numerical convergence is provably near-optimal among Boosting-type algorithms; with an appropriately selected parameter, RBoosting is therefore comparable to the optimal Boosting-type algorithm. AdaBoost [9, 10] itself is one of the most popular ensemble learning paradigms and has been shown to be very effective in practice in some hard scenarios [13].
Typically, AdaBoost employs a re-weighted loss function that gradually increases the emphasis (weights) on misclassified examples (i.e., the attackers of concern) and can distinctly improve predictive performance on difficult data sets. Thus, with the help of the re-scale operator, RAdaBoost can be used in conjunction with many other types of learning algorithms (weak learners) to improve performance in "shilling" attack detection. Finally, a series of experiments on the MovieLens-100K data set demonstrates that RAdaBoost outperforms conventional classification techniques such as SVM, kNN and the original (non-rescaled) AdaBoost in terms of classification error, detection rate and false alarm rate.
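The re-weighting step described above is the heart of AdaBoost and can be sketched on toy data. Note this is plain AdaBoost with decision stumps on invented 1-D data, not the paper's RAdaBoost variant; the re-scale operator is omitted.

```python
# Minimal AdaBoost sketch illustrating the re-weighting step: misclassified
# points gain weight each round, forcing later weak learners to focus on them.
# Assumptions: toy 1-D data and threshold stumps; not the RAdaBoost variant.
import math

# Label is +1 for x > 0.5, except one "hard" negative at x = 0.7.
data = [(0.1, -1), (0.2, -1), (0.3, -1), (0.7, -1),
        (0.6, +1), (0.8, +1), (0.9, +1), (0.95, +1)]
w = [1.0 / len(data)] * len(data)
ensemble = []   # list of (alpha, threshold, sign)

def stump_predict(x, threshold, sign):
    return sign if x > threshold else -sign

for _ in range(10):
    # Find the stump minimizing the weighted training error.
    best = None
    for threshold in [x for x, _ in data]:
        for sign in (+1, -1):
            err = sum(wi for wi, (x, y) in zip(w, data)
                      if stump_predict(x, threshold, sign) != y)
            if best is None or err < best[0]:
                best = (err, threshold, sign)
    err, threshold, sign = best
    err = min(max(err, 1e-10), 1 - 1e-10)
    alpha = 0.5 * math.log((1 - err) / err)
    ensemble.append((alpha, threshold, sign))
    # Re-weighting step: weights of misclassified points grow, others shrink.
    w = [wi * math.exp(-alpha * y * stump_predict(x, threshold, sign))
         for wi, (x, y) in zip(w, data)]
    z = sum(w)
    w = [wi / z for wi in w]

def predict(x):
    s = sum(a * stump_predict(x, t, sg) for a, t, sg in ensemble)
    return +1 if s > 0 else -1

acc = sum(predict(x) == y for x, y in data) / len(data)
print(f"ensemble training accuracy after 10 rounds: {acc:.2f}")
```

No single stump can fit the hard negative at 0.7; the growing weight on that point is what drives the ensemble to handle it, which is exactly the behaviour the abstract relies on for rare attack profiles.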