Predicting electricity prices and demand is a central problem for the energy market industry. To improve the accuracy of any predictive model, a prior variable importance analysis is highly advisable. In this paper, we propose an alternative framework for assessing variable importance in multivariate response scenarios. It is based on the permutation importance technique and applies the conditional inference trees algorithm together with a ϕ-divergence measure. Our solution was tested on simulated examples as well as on a real case, in which we assessed and ranked the most relevant predictors for the price and demand of electricity jointly in the Spanish market. In most cases, the new method outperforms two recently proposed techniques, the Intervention Prediction Measure (IPM) and Sequential Multi-Response Feature Selection (SMuRFS). For the electricity market case, we identified the most relevant predictors among pollutant, renewable, calendar and lagged-price variables for the joint response of demand and price, and we also showed the effectiveness of the proposed multivariate response method when compared with the univariate response analysis.
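The permutation-plus-divergence idea described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: a linear least-squares model stands in for the conditional inference tree, the per-sample joint error is taken as the Euclidean norm of the residual vector, the ϕ-divergence is the Pearson chi-square divergence, and the errors are binned into smoothed histograms. All function names (`chi2_divergence`, `smoothed_histogram`, `divergence_importance`) are hypothetical.

```python
import numpy as np


def chi2_divergence(p, q):
    """Pearson chi-square divergence sum((p - q)^2 / q), i.e. the
    phi-divergence with phi(t) = (t - 1)^2. Assumes q has no zero entries."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum((p - q) ** 2 / q))


def smoothed_histogram(values, edges, pseudo_count=0.5):
    """Bin proportions with a small pseudo-count so that no bin is empty."""
    counts, _ = np.histogram(values, bins=edges)
    smoothed = counts + pseudo_count
    return smoothed / smoothed.sum()


def divergence_importance(predict, X, Y, n_bins=10, n_repeats=20, seed=0):
    """Score each column of X by how far the distribution of joint prediction
    errors moves when that column is permuted (averaged over n_repeats)."""
    rng = np.random.default_rng(seed)
    # Assumption: one error per sample, the norm of its residual vector.
    base_err = np.linalg.norm(Y - predict(X), axis=1)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            perm_err = np.linalg.norm(Y - predict(X_perm), axis=1)
            # Shared bin edges so both histograms are comparable.
            edges = np.histogram_bin_edges(
                np.concatenate([base_err, perm_err]), bins=n_bins
            )
            p = smoothed_histogram(perm_err, edges)
            q = smoothed_histogram(base_err, edges)
            scores[j] += chi2_divergence(p, q)
    return scores / n_repeats


# Simulated data: two responses driven by x0 and x1; x2 and x3 are pure noise.
rng = np.random.default_rng(42)
n = 400
X = rng.normal(size=(n, 4))
noise = 0.5 * rng.normal(size=(n, 2))
Y = np.column_stack([3.0 * X[:, 0], 2.0 * X[:, 0] - 2.0 * X[:, 1]]) + noise

# Stand-in base learner: multi-response least squares with an intercept.
design = np.column_stack([X, np.ones(n)])
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)


def predict(Z):
    return np.column_stack([Z, np.ones(len(Z))]) @ coef


# For brevity the errors are computed on the training data; a held-out or
# out-of-bag sample would be the more careful choice.
scores = divergence_importance(predict, X, Y)
ranking = np.argsort(scores)[::-1]  # column indices, most important first
```

Because `divergence_importance` only needs a `predict` callable, any fitted multi-response model, including a tree-based one, could be passed in place of the linear stand-in.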
Decision-making using machine learning requires a deep understanding of the model under analysis. Variable importance analysis provides the tools to assess the importance of input variables when dealing with complex interactions, making the machine learning model more interpretable and computationally more efficient. In classification problems with imbalanced datasets, this task is even more challenging. In this article, we present two variable importance techniques: a nonparametric solution, called mh-χ², and a parametric method based on Global Sensitivity Analysis (GSA). The mh-χ² method employs a multivariate continuous response framework to deal with the multiclass classification problem. Based on the permutation importance framework, the proposed mh-χ² algorithm captures the dissimilarities between the distributions of misclassification errors generated by the base learner, a Conditional Inference Tree, before and after permuting the values of the input variable under analysis. The GSA solution is based on the covariance decomposition methodology for multivariate output models. Both solutions are assessed in a comparative study of several Random Forest-based techniques, with emphasis on the multiclass classification problem under different imbalanced scenarios. We apply the proposed techniques to two real application cases: first, to quantify the importance of the 35 companies listed in the Spanish market index IBEX35 on the economic, political and social uncertainties reflected in economic newspapers in Spain during the first four months of 2020 due to the COVID-19 pandemic; and second, to assess the impact of energy factors on the occurrence of price spikes in the Spanish electricity market.
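The abstract does not state the covariance decomposition it relies on. As an assumption about which variant is meant, one common formulation in the multivariate sensitivity-analysis literature takes a model $Y = f(X_1, \dots, X_d) \in \mathbb{R}^k$ with independent inputs, decomposes the output covariance matrix into contributions from subsets of inputs, and normalizes by the trace to obtain a scalar first-order index for each input:

```latex
% Assumed setting: Y = f(X_1, ..., X_d) in R^k with independent inputs.
\Sigma = \operatorname{Cov}(Y)
       = \sum_{\emptyset \neq u \subseteq \{1,\dots,d\}} C_u,
\qquad
C_{\{i\}} = \operatorname{Cov}\!\big(\mathbb{E}[Y \mid X_i]\big),
\qquad
S_i = \frac{\operatorname{Tr}\big(C_{\{i\}}\big)}{\operatorname{Tr}(\Sigma)} .
```

Under this formulation, $S_i$ is the share of the total output variance, summed over all $k$ response components, that is explained by $X_i$ alone; for $k = 1$ it reduces to the usual first-order Sobol index.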