Imbalanced learning is an active topic in the data mining and machine learning communities. Data-level, algorithm-level, and ensemble solutions are the three main families of methods proposed thus far to address it. To alleviate the problems of data explosion and feature selection in the multilayer perceptron based on simultaneous two-sample representation (S2SMLP), this paper first exploits spectral clustering to select majority samples and thereby construct a smaller training set for the classifier. We partition all majority samples into clusters via spectral clustering, extract a different number of representative samples from each cluster according to the cluster's size and its average distance to the minority class, and then construct the classifier's training set by combining these extracted majority samples with all minority samples. Second, we propose a novel feature selection method based on a pairwise sample-distance constraint: it considers the class labels of paired samples and selects the features that push two similar samples closer together and pull two dissimilar samples farther apart. Finally, we conduct extensive experiments on 44 two-class imbalanced datasets and four high-dimensional DNA microarray datasets. The experimental results demonstrate that the proposed algorithms outperform several state-of-the-art algorithms in terms of F-measure, G-mean, and AUC.

INDEX TERMS Multilayer perceptron, under-sampling, spectral clustering, imbalance learning, feature selection, information gain

I. INTRODUCTION

Classification is one of the central topics in machine learning. Its main task is to learn a classification model from training data and predict the labels of unknown samples.
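The cluster-based under-sampling step summarized above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the paper's exact procedure: it assumes the spectral-clustering labels for the majority class are already computed, and it uses one plausible per-cluster quota (proportional to cluster size and inversely proportional to the cluster's average distance to the minority class); the representatives within each cluster are drawn at random here.

```python
import numpy as np

def undersample_by_clusters(X_maj, labels, X_min, target_size, seed=0):
    """Select about `target_size` majority samples, drawing from each
    cluster in proportion to its size and its closeness to the minority
    class (a hypothetical weighting, for illustration only)."""
    rng = np.random.default_rng(seed)
    clusters = np.unique(labels)
    sizes = np.array([(labels == c).sum() for c in clusters])
    # average Euclidean distance from each cluster to the minority class
    avg_dist = np.array([
        np.linalg.norm(X_maj[labels == c][:, None, :] - X_min[None, :, :],
                       axis=2).mean()
        for c in clusters
    ])
    # larger clusters and clusters closer to the minority class
    # contribute more representatives
    weights = sizes / avg_dist
    weights = weights / weights.sum()
    quotas = np.maximum(1, np.round(weights * target_size).astype(int))
    picked = []
    for c, q in zip(clusters, quotas):
        idx = np.flatnonzero(labels == c)
        picked.append(rng.choice(idx, size=min(q, idx.size), replace=False))
    keep = np.concatenate(picked)
    # final training set: sampled majority plus all minority samples
    X_train = np.vstack([X_maj[keep], X_min])
    y_train = np.concatenate([np.zeros(keep.size), np.ones(len(X_min))])
    return X_train, y_train
```

In the paper itself the clustering is spectral and the representatives are chosen deliberately rather than at random; the sketch only shows how the sampled majority subset and the full minority set are combined into the reduced training set.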
To date, many classification models have been proposed and are widely used in real-world applications; for example, naive Bayes (NB), logistic regression (LR), and support vector machines (SVM) have been successfully employed in spam recognition, bank loan credit scoring, and network rumor recognition, respectively. The success of traditional classification models usually rests on the assumption that the classes in the original dataset are balanced [1], but this assumption does not always hold. In other words, a dataset may suffer from the class-imbalance problem [2]: one class contains a large number of samples and is called the majority class (or negative class), while the other class has very few samples and is called the minority class (or positive class). The imbalance ratio IR is defined as follows.