2020
DOI: 10.1109/access.2020.3008416
Variable Importance Analysis in Imbalanced Datasets: A New Approach

Abstract: Decision-making using machine learning requires a deep understanding of the model under analysis. Variable importance analysis provides the tools to assess the importance of input variables when dealing with complex interactions, making the machine learning model more interpretable and computationally more efficient. In classification problems with imbalanced datasets, this task is even more challenging. In this article, we present two variable importance techniques, a nonparametric solution, called mh-χ², a…

Cited by 8 publications (4 citation statements)
References 87 publications
“…RF is a nonparametric ML approach for multiclass classification and regression problems [95]. The RF is a well-recognized and hugely successful bagging-type decision-tree ensemble.…”
Section: Random Forests
confidence: 99%
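The bagging idea behind random forests described in the statement above can be illustrated with a toy ensemble. This is a hedged sketch in pure Python — bootstrap resamples of one-feature "stumps" combined by majority vote — not the cited paper's implementation; a real random forest grows full trees and samples features at every split:

```python
import random
from collections import Counter

def train_stump(X, y, feature):
    # Threshold at the feature's mean; each side predicts its majority label.
    mean = sum(row[feature] for row in X) / len(X)
    left  = [t for row, t in zip(X, y) if row[feature] <= mean]
    right = [t for row, t in zip(X, y) if row[feature] > mean]
    vote = lambda labels: Counter(labels).most_common(1)[0][0] if labels else 0
    lv, rv = vote(left), vote(right)
    return lambda row: lv if row[feature] <= mean else rv

def bagged_ensemble(X, y, n_estimators=25, seed=0):
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_estimators):
        # Bootstrap resample: draw n rows with replacement.
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        feature = rng.randrange(len(X[0]))   # pick a random feature per stump
        stumps.append(train_stump(Xb, yb, feature))
    def predict(row):
        # Majority vote over all stumps.
        return Counter(s(row) for s in stumps).most_common(1)[0][0]
    return predict

# Separable toy data: class 1 when both (correlated) features are large.
X = [[x / 10, x / 5] for x in range(20)]
y = [1 if row[0] > 0.95 else 0 for row in X]
model = bagged_ensemble(X, y)
```

Averaging many high-variance learners trained on resampled data is what makes bagging-type ensembles like RF robust.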
“…One of the essential outputs of a machine learning algorithm is variable importance (VI) [1]–[2]. This VI indicates the importance of the predictor variables.…”
Section: Introduction
confidence: 99%
“…Since machine learning accuracy depends heavily on this stage, achieving good accuracy on imbalanced datasets becomes a challenging task. The variable importance measurement techniques described in [10] address imbalanced datasets and show that the proposed method outperforms alternatives. To achieve high-dimensional selection consistency in decision-tree algorithms, researchers have presented a model selection algorithm named DSTUMP that performs well in nonlinear additive model settings [10].…”
Section: Introduction
confidence: 99%
“…Combining importance with prior-knowledge parameters to select features has been shown to improve performance when applied to a soft-measuring model, as stated in [18]. Similarly, the researchers in [10] propose a permutation-based, dissimilarity-based framework that computes variable importance from the distribution of misclassification errors. In the area of image classification, researchers have proposed a model-agnostic method for quantifying variable importance based on game theory and the Shapley value metric [19].…”
Section: Introduction
confidence: 99%
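The permutation-based idea mentioned above can be sketched generically. This is a plain permutation-importance illustration, not the dissimilarity/misclassification-distribution procedure of [10] itself: shuffle one feature column at a time and record how much the misclassification error rises.

```python
import random

def error_rate(predict, X, y):
    # Fraction of misclassified rows.
    return sum(predict(x) != t for x, t in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, n_repeats=30, seed=0):
    rng = random.Random(seed)
    base = error_rate(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        deltas = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)                      # break feature-target link
            Xp = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            deltas.append(error_rate(predict, Xp, y) - base)
        importances.append(sum(deltas) / n_repeats)  # mean error increase
    return importances

# Toy check: the label depends only on feature 0, so shuffling feature 0
# should raise the error while shuffling feature 1 should not.
X = [[i % 2, random.Random(i).random()] for i in range(200)]
y = [row[0] for row in X]
predict = lambda row: row[0]          # perfect model using feature 0 only
imp = permutation_importance(predict, X, y)
```

An importance near zero means the model's predictions are insensitive to that feature, which is the intuition any permutation-based scheme builds on.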