“…For wrapper approaches, many existing works used only the classification performance as the fitness function [11], [134], [135], [137], [138], [139], [140], [142], [160], which led to relatively large feature subsets. However, most of the fitness functions used different ways to combine both the classification performance and the number of features into a single fitness function [70], [136], [141], [180], [148], [181]. However, it is difficult to determine in advance the optimal balance between them without a priori knowledge.…”