“…The problem has continued to draw increasing attention from researchers and practitioners alike [7,8,10,18,21–28]. Although choosing a subset of the original features is a combinatorial problem, many suboptimal heuristics have been put forward and used in various domains, including chi-squared based feature subset selection [7,8,10], analysis of variance (ANOVA) [7,8,10], mutual information [7,23,29], and information gain [18,25–27]. Many studies have shown that feature selection approaches that select a good feature subset significantly reduce processing complexity by eliminating unimportant features and enhance the performance of the learning models [24,30].…”
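
As a minimal illustration of one such filter heuristic, the following Python sketch scores each discrete feature by its empirical mutual information with the class label and keeps the k highest-scoring features. Scoring features one at a time avoids searching all 2^n − 1 non-empty subsets of n features, at the cost of ignoring interactions between features. The function names, the toy data, and the choice of k are hypothetical and are not taken from the cited works.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of feature values
    py = Counter(ys)            # marginal counts of labels
    pxy = Counter(zip(xs, ys))  # joint counts
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log(c * n / (px[x] * py[y]))
    return mi


def select_top_k(columns, labels, k):
    """Rank features by mutual information with the labels; keep the top k."""
    scores = {name: mutual_information(values, labels)
              for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: three binary features and a binary class label.
labels = [0, 0, 1, 1, 0, 0, 1, 1]
columns = {
    "f_noise":   [0, 1, 0, 1, 0, 1, 0, 1],  # statistically independent of the label
    "f_partial": [0, 0, 1, 1, 0, 0, 1, 0],  # agrees with the label on 7 of 8 samples
    "f_copy":    [0, 0, 1, 1, 0, 0, 1, 1],  # identical to the label
}

selected, scores = select_top_k(columns, labels, k=2)
print(selected)
```

On this toy data the independent feature receives a score of zero and is discarded, while the two informative features are retained. Chi-squared or ANOVA F-scores could be substituted for the scoring function without changing the ranking-and-truncation structure.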