State-of-the-art decision tree methods apply heuristics recursively to create each split in isolation, which may not capture the underlying characteristics of the dataset well. The optimal decision tree problem attempts to resolve this by creating the entire decision tree at once to achieve global optimality. In the last 25 years, algorithmic advances in integer optimization coupled with hardware improvements have resulted in an astonishing 800 billion factor speedup in mixed-integer optimization (MIO). Motivated by this speedup, we present optimal classification trees, a novel formulation of the decision tree problem using modern MIO techniques that yields the optimal decision tree for axis-aligned splits. We also show the richness of this MIO formulation by adapting it to give optimal classification trees with hyperplanes, which generate optimal decision trees with multivariate splits. Synthetic tests demonstrate that these methods recover the true decision tree more closely than heuristics, refuting the notion that optimal methods overfit the training data. We comprehensively benchmark these methods on a sample of 53 datasets from the UCI machine learning repository. We establish that these MIO methods are practically solvable on real-world datasets with sizes in the 1000s, and give average absolute improvements in out-of-sample accuracy over CART of 1-2% and 3-5% for the univariate and multivariate cases, respectively. Furthermore, we identify that optimal classification trees are likely to outperform CART by 1.2-1.3% in situations where the CART accuracy is high and we have sufficient training data, while the multivariate version outperforms CART by 4-7% when the CART accuracy or dimension of the dataset is low.
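As an illustration of what "creating each split in isolation" means, here is a minimal sketch of a greedy, CART-style search for one axis-aligned split. It is not the authors' MIO formulation. The function names `gini` and `best_axis_aligned_split` are hypothetical, and the choice of Gini impurity and midpoint thresholds is an assumption made for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_axis_aligned_split(X, y):
    """Greedily pick the single (feature, threshold) pair that minimizes
    the weighted Gini impurity of the two child nodes.

    X is a list of numeric feature rows, y the matching class labels.
    Returns (feature_index, threshold, weighted_impurity), or None if
    no feature has two distinct values to split between.
    """
    n = len(y)
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y[i] for i in range(n) if X[i][j] < t]
            right = [y[i] for i in range(n) if X[i][j] >= t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best


if __name__ == "__main__":
    X = [[0, 5], [1, 5], [2, 5], [3, 5]]
    y = [0, 0, 1, 1]
    print(best_axis_aligned_split(X, y))
```

A heuristic tree builder would call such a routine recursively on each child node, never revisiting earlier splits. The optimal-tree approach described in the abstract instead chooses all splits jointly.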
POTTER is a highly accurate and user-friendly ES risk calculator with the potential to continuously improve accuracy with ongoing machine learning. POTTER might prove useful as a tool for bedside preoperative counseling of ES patients and families.
Motivated by the fact that there may be inaccuracies in features and labels of training data, we apply robust optimization techniques to study in a principled way the uncertainty in data features and labels in classification problems and obtain robust formulations for the three most widely used classification methods: support vector machines, logistic regression, and decision trees. We show that adding robustness does not materially change the complexity of the problem and that all robust counterparts can be solved in practical computational times. We demonstrate the advantage of these robust formulations over regularized and nominal methods in synthetic data experiments, and we show that our robust classification methods offer improved out-of-sample accuracy. Furthermore, we run large-scale computational experiments across a sample of 75 data sets from the University of California Irvine Machine Learning Repository and show that adding robustness to any of the three nonregularized classification methods improves the accuracy in the majority of the data sets. We observe the most significant gains for robust classification methods on high-dimensional and difficult classification problems, with an average improvement in out-of-sample accuracy of robust versus nominal problems of 5.3% for support vector machines, 4.0% for logistic regression, and 1.3% for decision trees.
Motivated by personalized decision making, given observational data involving features, assigned treatments or prescriptions, and outcomes, we propose a tree-based algorithm called optimal prescriptive tree (OPT) that uses either constant or linear models in the leaves of the tree to predict the counterfactuals and assign optimal treatments to new samples. We propose an objective function that balances optimality and accuracy. OPTs are interpretable and highly scalable, accommodate multiple treatments, and provide high-quality prescriptions. We report results involving synthetic and real data that show that OPTs either outperform or are comparable with several state-of-the-art methods. Given their combination of interpretability, scalability, generalizability, and performance, OPTs are an attractive alternative for personalized decision making in a variety of areas, such as online advertising and personalized medicine.
Purpose: With rapidly evolving treatment options in cancer, the complexity in the clinical decision-making process for oncologists represents a growing challenge magnified by oncologists’ disposition toward intuition-based assessment of treatment risks and overall mortality. Given the unmet need for accurate prognostication with meaningful clinical rationale, we developed a highly interpretable prediction tool to identify patients with high mortality risk before the start of treatment regimens. Methods: We obtained electronic health record data between 2004 and 2014 from a large national cancer center and extracted 401 predictors, including demographics, diagnosis, gene mutations, treatment history, comorbidities, resource utilization, vital signs, and laboratory test results. We built an actionable tool using novel developments in modern machine learning to predict 60-, 90- and 180-day mortality from the start of an anticancer regimen. The model was validated in unseen data against benchmark models. Results: We identified 23,983 patients who initiated 46,646 anticancer treatment lines, with a median survival of 514 days. Our proposed prediction models achieved significantly higher estimation quality in unseen data (area under the curve, 0.83 to 0.86) compared with benchmark models. We identified key predictors of mortality, such as change in weight and albumin levels. The results are presented in an interactive and interpretable tool (www.oncomortality.com). Conclusion: Our fully transparent prediction model was able to distinguish with high precision between highest- and lowest-risk patients. Given the rich data available in electronic health records and advances in machine learning methods, this tool can have significant implications for value-based shared decision making at the point of care and personalized goals-of-care management to catalyze practice reforms.
BACKGROUND: Classic risk assessment tools often treat patients' risk factors as linear and additive. Clinical reality suggests that the presence of certain risk factors can alter the impact of other factors; in other words, risk modeling is not linear. We aimed to use artificial intelligence (AI) technology to design and validate a nonlinear risk calculator for trauma patients. METHODS: A novel, interpretable AI technology called Optimal Classification Trees (OCTs) was used in an 80:20 derivation/validation split of the 2010 to 2016 American College of Surgeons Trauma Quality Improvement Program database. Demographics, emergency department vital signs, comorbidities, and injury characteristics (e.g., severity, mechanism) of all blunt and penetrating trauma patients 18 years or older were used to develop, train, then validate OCT algorithms to predict in-hospital mortality and complications (e.g., acute kidney injury, acute respiratory distress syndrome, deep vein thrombosis, pulmonary embolism, sepsis). A smartphone application was created as the algorithm's interactive and user-friendly interface. Performance was measured using the c-statistic methodology. RESULTS: A total of 934,053 patients were included (747,249 derivation; 186,804 validation). The median age was 51 years, 37% were women, 90.5% had blunt trauma, and the median Injury Severity Score was 11. Comprehensive OCT algorithms were developed for blunt and penetrating trauma, and the interactive smartphone application, Trauma Outcome Predictor (TOP), was created, where the answer to one question unfolds the subsequent one. Trauma Outcome Predictor accurately predicted mortality in penetrating injury (c-statistics: 0.95 derivation, 0.94 validation) and blunt injury (c-statistics: 0.89 derivation, 0.88 validation). The validation c-statistics for predicting complications ranged between 0.69 and 0.84.
CONCLUSION: We suggest TOP as an AI-based, interpretable, accurate, and nonlinear risk calculator for predicting outcome in trauma patients. Trauma Outcome Predictor can prove useful for bedside counseling of critically injured trauma patients and their families, and for benchmarking the quality of trauma care.
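For readers unfamiliar with the c-statistic reported above, the following is a minimal sketch of how it can be computed for a binary outcome: the fraction of (event, non-event) pairs in which the event case received the higher predicted risk, with ties counted as half. The function name `c_statistic` and the toy data are hypothetical; this is not the study's code, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def c_statistic(y_true, y_score):
    """Concordance (c-statistic) for binary outcomes.

    y_true holds 0/1 outcome labels and y_score the predicted risks.
    Returns the share of (positive, negative) pairs ranked correctly,
    counting tied scores as half-concordant.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


if __name__ == "__main__":
    outcomes = [0, 0, 1, 1]          # toy labels, illustration only
    risks = [0.1, 0.4, 0.35, 0.8]    # toy predicted risks
    print(c_statistic(outcomes, risks))
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation of events from non-events.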
Giant-cell tumor of the bone (GCTB) is a rare neoplasm that affects young adults. The tumor is generally benign but sometimes can be locally aggressive. There are no standardized approaches to the treatment of GCTB. Recently, the RANKL inhibitor denosumab has shown activity in this tumor type. We present the case of a young female who presented with locally advanced disease and was successfully managed with the neoadjuvant use of denosumab allowing for surgical resection of the tumor that was previously deemed unresectable. Following surgery, the patient is being managed with continued use of denosumab as ‘maintenance,’ and she continues to be free of disease. Our case highlights a novel approach for the management of locally advanced and aggressive giant cell tumor of the bone.
IMPORTANCE: Computed tomographic (CT) scanning is the standard for the rapid diagnosis of intracranial injury, but it is costly and exposes patients to ionizing radiation. The Pediatric Emergency Care Applied Research Network (PECARN) rules for identifying children with minor head trauma who are at very low risk of clinically important traumatic brain injury (ciTBI) are widely used to triage CT imaging. OBJECTIVE: To examine whether optimal classification trees (OCTs), which are novel machine-learning classifiers, improve on PECARN rules' predictive accuracy. DESIGN, SETTING, AND PARTICIPANTS: A secondary analysis of prospective, publicly available data on emergency department visits for head trauma used by the PECARN group to develop their tool was conducted to derive OCT-based prediction rules for ciTBI in a development cohort and compare their predictive performance vs the PECARN rules in a validation cohort among children who were younger than 2 years and 2 years or older. Data on 42 412 children with head trauma and without severely altered mental status who were examined between