Jakob Richter scite author profile

Machine-learning algorithms have gained popularity in recent years in the field of ecological modeling due to their promising results in predictive performance of classification problems. While the application of such algorithms has been highly predictors. Results show that GAM and Random Forest (RF) (mean AUROC estimates 0.708 and 0.699) outperform all other methods in predictive accuracy. The effect of hyperparameter tuning saturates at around 50 iterations for this data set. The AUROC differences between the bias-reduced (spatial cross-validation) and overoptimistic (non-spatial cross-validation) performance estimates of the GAM and RF are 0.167 (24%) and 0.213 (30%), respectively. It is recommended to also use spatial partitioning for cross-validation hyperparameter tuning of spatial data. The models developed in this study enhance the detection of Diplodia sapinea in the Basque Country compared to previous studies.
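To illustrate the distinction the abstract draws between spatial and non-spatial cross-validation, here is a minimal sketch of the two partitioning schemes. It is not the partitioning method used in the study: the function names `random_folds` and `spatial_folds` are hypothetical, and cutting the study area into contiguous strips along one coordinate is a simplifying assumption standing in for a real spatial partitioning.

```python
import random


def random_folds(n_points, k, seed=0):
    """Non-spatial partitioning: shuffle indices and deal them into k folds."""
    rng = random.Random(seed)
    idx = list(range(n_points))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def spatial_folds(coords, k):
    """Simplified spatial partitioning (assumption): sort points by their
    x-coordinate and cut them into k contiguous strips, so that each test
    fold is spatially separated from most of its training data."""
    order = sorted(range(len(coords)), key=lambda i: coords[i][0])
    size, rem = divmod(len(order), k)
    folds, start = [], 0
    for f in range(k):
        stop = start + size + (1 if f < rem else 0)
        folds.append(order[start:stop])
        start = stop
    return folds


# Hypothetical sample locations (x, y); real data would carry coordinates
# alongside the response and the predictors.
rng = random.Random(42)
coords = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(50)]

for name, folds in [("non-spatial", random_folds(len(coords), 5)),
                    ("spatial", spatial_folds(coords, 5))]:
    # Print the x-range covered by each test fold.
    ranges = [(round(min(coords[i][0] for i in f)),
               round(max(coords[i][0] for i in f))) for f in folds]
    print(name, ranges)
```

With random folds, every test fold spans nearly the whole x-range, so test points sit next to training points. With the strips, each test fold covers its own region, which is the property that makes a spatial performance estimate less optimistic.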

Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges

Bischl

Binder

Lang

et al. 2023

WIREs Data Min & Knowl

104

Most machine learning algorithms are configured by a set of hyperparameters whose values must be carefully chosen and which often considerably impact performance. To avoid a time-consuming and irreproducible manual process of trial-and-error to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods, for example, based on resampling error estimation for supervised machine learning, can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods, from simple techniques such as grid or random search to more advanced methods like evolution strategies, Bayesian optimization, Hyperband, and racing. This work gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with machine learning pipelines, runtime improvements, and parallelization.
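As a rough illustration of the simplest method the abstract names, random search driven by a resampling error estimate, the following sketch tunes a single hyperparameter. Everything in it is an assumption made for the example rather than code from the paper: the toy k-nearest-neighbour regressor, the synthetic data, the search range of 1 to 30, and the function names `knn_predict`, `cv_error`, and `random_search`.

```python
import math
import random


def knn_predict(train, x, k):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def cv_error(data, k, n_folds=5):
    """Resampling error estimate: mean squared error over n_folds CV folds."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    total, count = 0.0, 0
    for i, test in enumerate(folds):
        # Train on all folds except the one held out for testing.
        train = [p for j, f in enumerate(folds) if j != i for p in f]
        for x, y in test:
            total += (knn_predict(train, x, k) - y) ** 2
            count += 1
    return total / count


def random_search(data, budget, seed=0):
    """Sample `budget` configurations at random and keep the best one."""
    rng = random.Random(seed)
    best_k, best_err = None, math.inf
    for _ in range(budget):
        k = rng.randint(1, 30)  # hypothetical search space for k
        err = cv_error(data, k)
        if err < best_err:
            best_k, best_err = k, err
    return best_k, best_err


# Synthetic 1-D regression data: a noisy sine curve.
data_rng = random.Random(1)
data = [(x, math.sin(x) + data_rng.gauss(0, 0.3))
        for x in (data_rng.uniform(0, 6) for _ in range(100))]

best_k, best_err = random_search(data, budget=20)
print(best_k, best_err)
```

The more advanced methods listed in the abstract differ mainly in how the next configuration is proposed and how much budget each evaluation receives. The outer loop of propose, estimate by resampling, and keep the best stays the same.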

Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges

Bischl

Binder

Lang

et al. 2021

Preprint

Most machine learning algorithms are configured by one or several hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time-consuming and unreproducible manual trial-and-error process to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods, e.g., based on resampling error estimation for supervised machine learning, can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods such as grid or random search, evolution strategies, Bayesian optimization, Hyperband, and racing. It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization.

Improving adaptive seamless designs through Bayesian optimization

2022

We propose to use Bayesian optimization (BO) to improve the efficiency of the design selection process in clinical trials. BO is a method to optimize expensive black‐box functions, by using a regression as a surrogate to guide the search. In clinical trials, planning test procedures and sample sizes is a crucial task. A common goal is to maximize the test power, given a set of treatments, corresponding effect sizes, and a total number of samples. From a wide range of possible designs, we aim to select the best one in a short time to allow quick decisions. The standard approach to simulate the power for each single design can become too time consuming. When the number of possible designs becomes very large, either large computational resources are required or an exhaustive exploration of all possible designs takes too long. Here, we propose to use BO to quickly find a clinical trial design with high power from a large number of candidate designs. We demonstrate the effectiveness of our approach by optimizing the power of adaptive seamless designs for different sets of treatment effect sizes. Comparing BO with an exhaustive evaluation of all candidate designs shows that BO finds competitive designs in a fraction of the time.

Faster Model-Based Optimization Through Resource-Aware Scheduling Strategies

Richter

Kotthaus

Bischl

et al. 2016

Model-based optimization of subgroup weights for survival analysis

Richter

Madjar

Rahnenführer

2019

Motivation: To obtain a reliable prediction model for a specific cancer subgroup or cohort is often difficult due to limited sample size and, in survival analysis, due to potentially high censoring rates. Sometimes similar data from other patient subgroups are available, e.g. from other clinical centers. Simple pooling of all subgroups can decrease the variance of the predicted parameters of the prediction models, but also increase the bias due to heterogeneity between the cohorts. A promising compromise is to identify those subgroups with a similar relationship between covariates and target variable and then include only these for model building.

Results: We propose a subgroup-based weighted likelihood approach for survival prediction with high-dimensional genetic covariates. When predicting survival for a specific subgroup, for every other subgroup an individual weight determines the strength with which its observations enter into model building. MBO (model-based optimization) can be used to quickly find a good prediction model in the presence of a large number of hyperparameters. We use MBO to identify the best model for survival prediction of a specific subgroup by optimizing the weights for additional subgroups for a Cox model. The approach is evaluated on a set of lung cancer cohorts with gene expression measurements. The resulting models have competitive prediction quality, and they reflect the similarity of the corresponding cancer subgroups, with both weights close to 0 and close to 1 and medium weights.

Availability and implementation: mlrMBO is implemented as an R-package and is freely available at http://github.com/mlr-org/mlrMBO.

Model-based optimization with concept drifts

Richter

Shi

Chen

et al. 2020

Jakob Richter

mlr3: A modern object-oriented machine learning framework in R

Hyperparameter tuning and performance assessment of statistical and machine-learning algorithms using spatial data
