Mixed Integer Programming (MIP) solvers rely on an array of sophisticated heuristics developed over decades of research to solve large-scale MIP instances encountered in practice. Machine learning offers a way to construct better heuristics automatically from data by exploiting shared structure among instances. This paper applies learning to the two key sub-tasks of a MIP solver: generating a high-quality joint variable assignment, and bounding the gap in objective value between that assignment and an optimal one. Our approach constructs two corresponding neural network-based components, Neural Diving and Neural Branching, for use in a base MIP solver such as SCIP. Neural Diving learns a deep neural network to generate multiple partial assignments for the integer variables of a MIP; the resulting smaller MIPs over the unassigned variables are solved with SCIP to construct high-quality joint assignments. Neural Branching learns a deep neural network to make variable-selection decisions in branch-and-bound so as to bound the objective-value gap with a small tree. It does so by imitating a new variant of Full Strong Branching we propose that scales to large instances using GPUs. We evaluate our approach on six diverse real-world datasets, including two Google production datasets and MIPLIB, training separate neural networks on each. Most instances across all the datasets combined have 10³–10⁶ variables and constraints after presolve, significantly larger than in previous learning approaches. Comparing solvers with respect to primal-dual gap averaged over a held-out set of instances, the learning-augmented SCIP is 2× to 10× better on all datasets except one, on which it is 10⁵× better, at large time limits. To the best of our knowledge, ours is the first learning approach to demonstrate such large improvements over SCIP on both large-scale real-world application datasets and MIPLIB.
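The diving idea above can be sketched in a few lines. This is a minimal illustration only, with several assumptions: a hypothetical predictor that outputs per-variable probabilities (standing in for the paper's neural network), a fixed confidence threshold for choosing which binary variables to assign, and a brute-force enumeration of the residual problem standing in for SCIP.

```python
from itertools import product

def solve_residual(c, A, b, fixed):
    """Brute-force the unfixed binaries of: max c.x s.t. A x <= b, x in {0,1}^n.
    `fixed` maps variable index -> assigned value (the partial assignment)."""
    n = len(c)
    free = [i for i in range(n) if i not in fixed]
    best_val, best_x = None, None
    for bits in product([0, 1], repeat=len(free)):
        x = [0] * n
        for i, v in fixed.items():
            x[i] = v
        for i, v in zip(free, bits):
            x[i] = v
        # Keep the candidate only if every constraint row is satisfied.
        if all(sum(A[r][i] * x[i] for i in range(n)) <= b[r] for r in range(len(b))):
            val = sum(c[i] * x[i] for i in range(n))
            if best_val is None or val > best_val:
                best_val, best_x = val, x
    return best_val, best_x

def neural_dive(c, A, b, probs, threshold=0.9):
    """Fix the variables the (hypothetical) model is confident about,
    then hand the much smaller residual problem to an exact solver."""
    fixed = {i: int(p > 0.5) for i, p in enumerate(probs) if max(p, 1 - p) > threshold}
    return solve_residual(c, A, b, fixed)

# Toy knapsack: values [5, 4, 3], weights [4, 3, 2], capacity 6.
# The predictor is confident only about variable 0, so only it gets fixed.
best, x = neural_dive([5, 4, 3], [[4, 3, 2]], [6], probs=[0.95, 0.1, 0.6])
print(best, x)  # -> 8 [1, 0, 1]
```

Fixing confident variables shrinks the search space exponentially; in the paper multiple partial assignments are generated and their residual MIPs are solved in parallel, which this single-shot sketch omits.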
Robustness to distribution shifts is critical for deploying machine learning models in the real world. Despite this necessity, there has been little work on defining the underlying mechanisms that cause these shifts or on evaluating the robustness of algorithms across multiple, different distribution shifts. To this end, we introduce a framework that enables fine-grained analysis of various distribution shifts. We provide a holistic analysis of current state-of-the-art methods by evaluating 19 distinct methods grouped into five categories across both synthetic and real-world datasets. Overall, we train more than 85K models. Our experimental framework can be easily extended to include new methods, shifts, and datasets. We find, unlike previous work (Gulrajani & Lopez-Paz, 2021), that progress has been made over a standard ERM baseline; in particular, pretraining and augmentations (learned or heuristic) offer large gains in many cases. However, the best methods are not consistent across different datasets and shifts.
Domain generalization remains a ubiquitous challenge for machine learning in healthcare. Model performance in real-world conditions might be lower than expected due to discrepancies between the data encountered in deployment environments and the datasets used for model development. Under-representation of some groups or conditions during model development is a common cause of this phenomenon, and it can have serious implications: it can exacerbate bias against groups, individuals, or conditions and propagate unintended harms in their care. This challenge is often not readily addressed by targeted data acquisition and labelling by expert clinicians, which can be prohibitively expensive or practically impossible due to the rarity of diseases, conditions, or available clinical expertise. We hypothesize that advances in generative artificial intelligence may help mitigate this unmet need in a steerable fashion, algorithmically enriching our training dataset with synthetic examples that address shortfalls of underrepresented conditions or subgroups. We show that generative models can automatically learn realistic augmentations from data in a label-efficient manner. In particular, we leverage the higher abundance of unlabelled data to model the underlying distribution of different conditions and subgroups for an imaging modality. By conditioning generative models on appropriate labels (e.g., diagnostic labels and/or sensitive attribute labels), we can steer the distribution of synthetic examples according to specific requirements. We demonstrate that these learned augmentations make models more robust and statistically fair both in- and out-of-distribution.
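The steering step described above can be sketched as a simple rebalancing loop. This is an illustrative assumption-laden sketch: `sample_fn` is a hypothetical stand-in for the conditional generative model (the paper's diffusion-based generators are not reproduced here), and only (label, group) cells already present in the data are topped up.

```python
from collections import Counter

def enrich(dataset, sample_fn, target_per_cell):
    """Top up each (label, group) cell of `dataset` to `target_per_cell` examples.

    dataset:   list of (image, label, group) triples
    sample_fn: hypothetical conditional generator, sample_fn(label, group) -> image
    """
    counts = Counter((y, g) for _, y, g in dataset)
    synthetic = []
    for (y, g), n in counts.items():
        # Generate only as many synthetic examples as the cell is short.
        for _ in range(max(0, target_per_cell - n)):
            synthetic.append((sample_fn(y, g), y, g))
    return dataset + synthetic

# Toy dataset: group "B" and label 1 are underrepresented.
data = [("img", 0, "A")] * 3 + [("img", 0, "B")] + [("img", 1, "A")] * 2
augmented = enrich(data, lambda y, g: f"synthetic_{y}_{g}", target_per_cell=3)
```

After enrichment every observed (label, group) cell holds the same number of examples, which is the mechanism by which the synthetic data can reduce fairness gaps downstream.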
To evaluate the generality of our approach, we study three distinct medical imaging contexts of varying difficulty: (i) histopathology images from a publicly available and widely adopted generalization benchmark, (ii) chest X-rays from publicly available clinical datasets, and (iii) dermatology images characterized by complex shifts and imaging conditions. The latter constitutes a particularly unstructured domain with various challenges. Two of these imaging modalities further require operating at high resolution, which necessitates developing faithful super-resolution techniques to recover fine details of each health condition. Complementing real training samples with synthetic ones improves the robustness of models in all three medical tasks and increases fairness by improving the accuracy of clinical diagnosis within underrepresented groups. Our proposed approach leads to stark improvements out-of-distribution across modalities: a 7.7% prediction accuracy improvement in histopathology, a 5.2% improvement in chest radiology with a 44.6% lower fairness gap, and a striking 63.5% improvement in high-risk sensitivity for dermatology with a 7.5× reduction in fairness gap.