Purpose: Metamodels are simplified approximations of more complex models that can be used as surrogates for the original models. Challenges in using metamodels for policy analysis arise when there are multiple correlated outputs of interest. We develop a framework for metamodeling with policy simulations to accommodate multivariate outcomes. Methods: We combine two algorithm adaptation methods (multitarget stacking and regression chain with maximum correlation) with different base learners including linear regression (LR), elastic net (EE) with second-order terms, Gaussian process regression (GPR), random forests (RFs), and neural networks. We optimize integrated models using variable selection and hyperparameter tuning. We compare the accuracy, efficiency, and interpretability of different approaches. As an example application, we develop metamodels to emulate a microsimulation model of testing and treatment strategies for hepatitis C in correctional settings. Results: Output variables from the simulation model were correlated (average ρ = 0.58). Without multioutput algorithm adaptation methods, in-sample fit (measured by R²) ranged from 0.881 for LR to 0.987 for GPR. The multioutput algorithm adaptation method increased R² by an average of 0.002 across base learners. Variable selection and hyperparameter tuning increased R² by 0.009. Simpler models such as LR, EE, and RF required minimal training and prediction time. LR and EE had advantages in model interpretability, and we considered methods for improving the interpretability of other models. Conclusions: In our example application, the choice of base learner had the largest impact on R²; multioutput algorithm adaptation and variable selection and hyperparameter tuning had a modest impact. Although advantages and disadvantages of specific learning algorithms may vary across different modeling applications, our framework for metamodeling in policy analyses with multivariate outcomes has broad applicability to decision analysis in health and medicine.
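To make the regression chain idea in the abstract above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a plain least-squares base learner, synthetic data with two correlated outputs, and a chain order supplied by the caller; the abstract does not spell out the maximum-correlation ordering rule, so none is implemented here. All function names are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def fit_chain(X, Y, order):
    """Fit one model per output; each later link also sees the earlier outputs."""
    models = []
    X_aug = X
    for j in order:
        models.append((j, fit_linear(X_aug, Y[:, j])))
        # During training, the observed output j becomes a feature for the next link.
        X_aug = np.column_stack([X_aug, Y[:, j]])
    return models

def predict_chain(models, X):
    """Predict outputs in chain order, feeding each prediction to the next link."""
    Y_hat = np.zeros((len(X), len(models)))
    X_aug = X
    for j, coef in models:
        Y_hat[:, j] = predict_linear(coef, X_aug)
        X_aug = np.column_stack([X_aug, Y_hat[:, j]])
    return Y_hat

def r_squared(Y, Y_hat):
    """In-sample R^2, computed separately for each output column."""
    ss_res = ((Y - Y_hat) ** 2).sum(axis=0)
    ss_tot = ((Y - Y.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for simulation inputs and two correlated outputs (assumed data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
shared = rng.normal(scale=0.1, size=200)  # noise shared by both outputs
Y = np.column_stack([
    2.0 * X[:, 0] - X[:, 1] + shared,
    0.5 * X[:, 0] + X[:, 2] + shared,
])

models = fit_chain(X, Y, order=[0, 1])
Y_hat = predict_chain(models, X)
r2 = r_squared(Y, Y_hat)
print(r2)
```

Swapping `fit_linear` and `predict_linear` for another base learner would leave the chain logic unchanged, which is the sense in which the adaptation method and the base learner can be combined independently.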
Eradicating hunger and malnutrition is a key development goal of the 21st century. We address the problem of optimally identifying seed varieties to reliably increase crop yield within a risk-sensitive decision making framework. Specifically, we introduce a novel hierarchical machine learning mechanism for predicting crop yield (the yield of different seed varieties of the same crop). We integrate this prediction mechanism with a weather forecasting model, and propose three different approaches for decision making under uncertainty to select seed varieties for planting so as to balance yield maximization and risk. We apply our model to the problem of soybean variety selection given in the 2016 Syngenta Crop Challenge. Our prediction model achieves a median absolute error of 3.74 bushels per acre and thus provides good estimates for input into the decision models. Our decision models identify the selection of soybean varieties that appropriately balance yield and risk as a function of the farmer's risk aversion level. More generally, our models support farmers in decision making about which seed varieties to plant.
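The trade-off between yield and risk described above can be illustrated with a simple mean-minus-penalty score. This is only a sketch of one common risk-adjustment rule, not one of the three approaches the authors propose; the variety names, scenario yields, and the `risk_aversion` parameter are all hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical predicted yields (bushels per acre) for three made-up varieties
# under five weather scenarios. These numbers are illustrative, not from the paper.
scenario_yields = {
    "variety_A": [58.0, 61.0, 60.0, 59.0, 62.0],
    "variety_B": [72.0, 45.0, 74.0, 48.0, 76.0],
    "variety_C": [63.0, 60.0, 64.0, 59.0, 64.0],
}

def risk_adjusted_score(yields, risk_aversion):
    """Mean yield penalized by its standard deviation across scenarios."""
    return mean(yields) - risk_aversion * pstdev(yields)

def select_variety(yields_by_variety, risk_aversion):
    """Return the variety with the highest risk-adjusted score."""
    return max(
        yields_by_variety,
        key=lambda name: risk_adjusted_score(yields_by_variety[name], risk_aversion),
    )

for level in (0.0, 1.0, 5.0):
    print(level, select_variety(scenario_yields, level))
```

With these made-up numbers, a risk-neutral grower picks the high-mean, high-variance variety, while increasing `risk_aversion` shifts the choice toward steadier varieties.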
The cost of testing can be a substantial contributor to hepatitis C virus (HCV) elimination program costs in many low- and middle-income countries such as Georgia, resulting in the need for innovative and cost-effective strategies for testing. Our objective was to investigate the most cost-effective testing pathways for scaling-up HCV testing in Georgia. We developed a Markov-based model with a lifetime horizon that simulates the natural history of HCV, and the cost of detection and treatment of HCV. We then created an interactive online tool that uses results from the Markov-based model to evaluate the cost-effectiveness of different HCV testing pathways. We compared the current standard-of-care (SoC) testing pathway and four innovative testing pathways for Georgia. The SoC testing was cost-saving compared to no testing, but all four new HCV testing pathways further increased QALYs and decreased costs. The pathway with the highest patient follow-up, due to on-site testing, resulted in the highest discounted QALYs (123 QALY more than the SoC) and lowest costs ($127,052 less than the SoC) per 10,000 persons screened. The current testing algorithm in Georgia can be replaced with a new pathway that is more effective while being cost-saving.
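As a rough illustration of how a Markov cohort model yields discounted QALYs and costs for competing pathways, consider the sketch below. It is not the authors' model: the three states, the transition probabilities, utilities, costs, discount rate, and the `build_strategy` helper are all assumptions chosen for brevity, and testing costs are left out entirely.

```python
STATES = ("infected", "cured", "dead")
UTILITIES = [0.75, 0.95, 0.0]   # assumed quality-of-life weight per state
START = [1.0, 0.0, 0.0]         # the whole cohort starts infected

def build_strategy(p_cure):
    """Return (transition matrix, per-cycle state costs) for an assumed annual cure probability."""
    p_die_infected, p_die_cured = 0.03, 0.01  # assumed annual mortality
    transition = [
        [1.0 - p_cure - p_die_infected, p_cure, p_die_infected],
        [0.0, 1.0 - p_die_cured, p_die_cured],
        [0.0, 0.0, 1.0],
    ]
    # Infected-state cost: assumed disease management plus expected treatment spending.
    cycle_costs = [500.0 + p_cure * 1000.0, 50.0, 0.0]
    return transition, cycle_costs

def run_cohort(transition, utilities, cycle_costs, start, n_cycles, discount_rate=0.03):
    """Return (discounted QALYs, discounted costs) per person."""
    dist = list(start)
    n = len(dist)
    qalys = 0.0
    costs = 0.0
    for t in range(n_cycles):
        weight = 1.0 / (1.0 + discount_rate) ** t
        qalys += weight * sum(d * u for d, u in zip(dist, utilities))
        costs += weight * sum(d * c for d, c in zip(dist, cycle_costs))
        # Advance the cohort one cycle: new_dist[j] = sum_i dist[i] * P[i][j].
        dist = [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return qalys, costs

# Two hypothetical pathways that differ only in how many infected people reach cure.
base_transition, base_costs = build_strategy(p_cure=0.10)
new_transition, new_costs = build_strategy(p_cure=0.30)

q_base, c_base = run_cohort(base_transition, UTILITIES, base_costs, START, n_cycles=50)
q_new, c_new = run_cohort(new_transition, UTILITIES, new_costs, START, n_cycles=50)

# Incremental results scaled to 10,000 persons, the reporting unit used in the abstract.
delta_qalys = (q_new - q_base) * 10_000
delta_costs = (c_new - c_base) * 10_000
print(delta_qalys, delta_costs)
```

A pathway is described as cost-saving and more effective when the incremental QALYs are positive and the incremental costs are negative; whether that holds in this sketch depends entirely on the assumed inputs.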