Brain-Computer Interfaces pose a very challenging classification task because of small numbers of instances, large numbers of features, non-stationarity, and low signal-to-noise ratios. Feature selection (FS) is a promising way to mitigate these effects. Wrapper FS methods are typically found to outperform filter FS methods, but their reliance on cross-validation accuracy can be misleading because of overfitting. This paper proposes a filter-wrapper hybrid based on Iterated Local Search and Mutual Information, and shows that it can provide more reliable solutions, that is, solutions that generalise better to unseen data. This study further contributes comparisons over multiple datasets, which has been uncommon in the literature.