Dolores Romero Morales scite author profile

Data Mining techniques often ask for the resolution of optimization problems. Supervised Classification, and, in particular, Support Vector Machines, can be seen as a paradigmatic instance. In this paper, some links between Mathematical Optimization methods and Supervised Classification are emphasized. It is shown that many different areas of Mathematical Optimization play a central role in off-the-shelf Supervised Classification methods. Moreover, Mathematical Optimization turns out to be extremely useful to address important issues in Classification, such as identifying relevant variables, improving the interpretability of classifiers or dealing with vagueness/noise in the data.

Mathematical optimization in classification and regression trees

Carrizosa

Molero-Río

Morales

2021

TOP

Classification and regression trees, as well as their variants, are off-the-shelf methods in Machine Learning. In this paper, we review recent contributions within the Continuous Optimization and the Mixed-Integer Linear Optimization paradigms to develop novel formulations in this research area. We compare those in terms of the nature of the decision variables and the constraints required, as well as the optimization algorithms proposed. We illustrate how these powerful formulations enhance the flexibility of tree models, being better suited to incorporate desirable properties such as cost-sensitivity, explainability, and fairness, and to deal with complex data, such as functional data.

Forecasting cancellation rates for services booking revenue management using data mining

Morales

Wang

2010

European Journal of Operational Research

Sparsity in optimal randomized classification trees

Blanquero

Carrizosa

Molero-Río

et al. 2020

European Journal of Operational Research

Decision trees are popular Classification and Regression tools and, when small-sized, easy to interpret. Traditionally, a greedy approach has been used to build the trees, yielding a very fast training process; however, controlling sparsity (a proxy for interpretability) is challenging. In recent studies, optimal decision trees, where all decisions are optimized simultaneously, have shown a better learning performance, especially when oblique cuts are implemented. In this paper, we propose a continuous optimization approach to build sparse optimal classification trees, based on oblique cuts, with the aim of using fewer predictor variables in the cuts as well as along the whole tree. Both types of sparsity, namely local and global, are modeled by means of regularizations with polyhedral norms. The computational experience reported supports the usefulness of our methodology. In all our data sets, local and global sparsity can be improved without harming classification accuracy. Unlike greedy approaches, our ability to easily trade in some of our classification accuracy for a gain in global sparsity is shown.
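
The two notions of sparsity in the abstract above can be illustrated with a minimal sketch. It assumes the oblique-cut coefficients are stored in a matrix `A` with one row per predictor and one column per branch node; this layout and all function names are hypothetical and not taken from the paper. The local term uses the ℓ1 norm and the global term sums, per predictor, the ℓ∞ norm across nodes. Both are polyhedral norms, but the abstract does not say which ones the authors use, so this particular choice is an assumption.

```python
import numpy as np

def local_sparsity_penalty(A):
    """l1 penalty: sum of |a_jt| over predictors j and branch nodes t.

    Pushes individual coefficients to zero, so each cut uses few predictors.
    """
    return float(np.abs(A).sum())

def global_sparsity_penalty(A):
    """Sum over predictors of the largest |a_jt| across nodes (l-inf per row).

    A row contributes zero only if the predictor is unused in every cut,
    so this term encourages dropping predictors from the whole tree.
    """
    return float(np.abs(A).max(axis=1).sum())

def regularized_objective(loss, A, lam_local, lam_global):
    """Training loss plus the two weighted sparsity penalties (assumed form)."""
    return (loss
            + lam_local * local_sparsity_penalty(A)
            + lam_global * global_sparsity_penalty(A))

def predictors_used(A, tol=1e-8):
    """Number of predictors with a nonzero coefficient in at least one cut."""
    return int((np.abs(A).max(axis=1) > tol).sum())

# Illustrative coefficients: 3 predictors (rows), 3 branch nodes (columns).
A = np.array([[0.5,  0.0, -0.25],
              [0.0,  0.0,  0.0],
              [1.0, -2.0,  0.0]])

print(local_sparsity_penalty(A), global_sparsity_penalty(A), predictors_used(A))
```

Raising `lam_global` relative to `lam_local` in this sketch penalizes predictors that appear anywhere in the tree, which mirrors the trade-off between accuracy and global sparsity that the abstract describes.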

A class of greedy algorithms for the generalized assignment problem

Romeijn

Morales

2000

Discrete Applied Mathematics

Optimal randomized classification trees

Blanquero

Carrizosa

Molero-Río

et al. 2021

Computers & Operations Research

Integrated Lot Sizing in Serial Supply Chains with Production Capacities

et al. 2005

We consider a model for a serial supply chain in which production, inventory, and transportation decisions are integrated in the presence of production capacities and concave cost functions. The model we study generalizes the uncapacitated serial single-item multilevel economic lot-sizing model by adding stationary production capacities at the manufacturer level. We present algorithms with a running time that is polynomial in the planning horizon when all cost functions are concave. In addition, we consider different transportation and inventory holding cost structures that yield improved running times: inventory holding cost functions that are linear and transportation cost functions that are either linear, or are concave with a fixed-charge structure. In the latter case, we make the additional common and reasonable assumption that the variable transportation and inventory costs are such that holding inventories at higher levels in the supply chain is more attractive from a variable cost perspective. While the running times of the algorithms are exponential in the number of levels in the supply chain in the general concave cost case, the running times are remarkably insensitive to the number of levels for the other two cost structures.

Keywords: lot sizing, integration of production planning and transportation, dynamic programming, polynomial time algorithms

Binarized Support Vector Machines

Carrizosa

Martín-Barragán

Morales

2010

INFORMS Journal on Computing

The widely used Support Vector Machine (SVM) method has been shown to yield very good results in Supervised Classification problems. Other methods such as Classification Trees have become more popular among practitioners than SVM thanks to their interpretability, which is an important issue in Data Mining. In this work, we propose an SVM-based method that automatically detects the most important predictor variables, and the role they play in the classifier. In particular, the proposed method is able to detect those values and intervals which are critical for the classification. The method involves the optimization of a Linear Programming problem with a large number of decision variables. The numerical experience reported shows that a rather direct use of the standard Column-Generation strategy leads to a classification method which, in terms of classification ability, is competitive against the standard linear SVM and Classification Trees. Moreover, the proposed method is robust, i.e., it is stable in the presence of outliers and invariant to change of scale or measurement units of the predictor variables. When the complexity of the classifier is an important issue, a wrapper feature selection method is applied, yielding simpler, still competitive, classifiers.

This work has been partially supported by projects MTM2005-09362-C03-01 of MEC, Spain, and FQM-329 of Junta de Andalucía, Spain.
