Feature selection is an important problem for pattern classifier systems. Compared with unsupervised feature selection methods, supervised feature selection approaches perform better when sufficient labeled training samples are available. In practice, however, usually only a few labeled data are available, since obtaining class labels is expensive, while unlabeled data can be collected easily. In this case, directly applying existing supervised feature selection algorithms may fail, because the data distribution cannot be accurately estimated from only a few labeled samples. In this paper, we therefore introduce a semi-supervised feature selection method, called Semi_Fisher Score, which simultaneously exploits all labeled and unlabeled samples to improve the performance of the classical Fisher Score. Experiments on four UCI datasets using three different classifiers (KNN, RBFNN, and C4.5) show the effectiveness of our algorithm.
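The abstract does not give the formulas, but the classical Fisher Score that Semi_Fisher Score builds on is a standard supervised criterion: it ranks each feature by its between-class scatter divided by its within-class scatter. A minimal sketch (the function name and the small stabilizing constant are my own choices, not from the paper):

```python
import numpy as np

def fisher_score(X, y):
    """Classical (supervised) Fisher Score per feature.

    X : (n_samples, n_features) labeled data
    y : (n_samples,) class labels
    Returns one score per feature; higher = more discriminative.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        nc = Xc.shape[0]
        # between-class scatter: class mean vs. overall mean
        between += nc * (Xc.mean(axis=0) - overall_mean) ** 2
        # within-class scatter: variance inside the class
        within += nc * Xc.var(axis=0)
    return between / (within + 1e-12)  # small constant avoids division by zero

# Toy example: feature 0 separates the classes, feature 1 is noise
X = np.array([[0.0, 1.0], [0.1, -1.0], [5.0, 0.9], [5.1, -0.8]])
y = np.array([0, 0, 1, 1])
scores = fisher_score(X, y)
```

With only a handful of labeled rows, the class means and variances above become unreliable, which is exactly the weakness the semi-supervised extension targets by also drawing on unlabeled samples.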
In recent years, face recognition has increasingly been treated as a cost-sensitive learning problem, and many cost-sensitive classifiers have been proposed. However, insufficient attention has been paid to cost-sensitive dimensionality reduction, especially cost-sensitive semi-supervised dimensionality reduction. To the best of our knowledge, cost-sensitive semi-supervised discriminant analysis (CS3DA) may be the first such work. CS3DA first uses sparse representation to infer a soft label for each unlabeled sample and then learns the projection direction by incorporating misclassification costs into both labeled and unlabeled data. Although CS3DA reduces the misclassification loss, it has two major drawbacks: 1) sparsity is not an intrinsic property of face recognition, and therefore sparse approximations may not deliver the desired robustness or performance; and 2) CS3DA is not proven to satisfy the minimal misclassification loss criterion. In this paper, we embed pairwise costs in semi-supervised discriminant analysis (PCSDA) for face recognition. PCSDA first uses a simple l2 approach to predict the labels of unlabeled data, and then learns the projection direction by embedding pairwise costs into both labeled and unlabeled data. Compared with CS3DA, PCSDA has three major advantages: 1) the l2 approach is more accurate and robust than sparse representation for face recognition; 2) we prove that CS3DA approximates the pairwise Bayesian risk only when the classes are balanced and the face data sets contain no outliers; and 3) PCSDA approximates the pairwise Bayesian risk while accounting for class imbalance and outliers in face recognition. Hence, the projection direction obtained with PCSDA is more discriminative and immune to outliers and the class imbalance problem. The experimental results on the AR, PIE, ORL, and extended Yale B data sets demonstrate the effectiveness of PCSDA.
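The abstract contrasts sparse representation with a "simple l2 approach" for labeling unlabeled samples. One common l2-based scheme in face recognition is to represent a query as an l2-regularized linear combination of the labeled samples and assign the class with the smallest per-class reconstruction residual. The sketch below follows that collaborative-representation-style idea; the exact formulation used in PCSDA may differ, and the function name and regularization value are illustrative assumptions:

```python
import numpy as np

def l2_label_prediction(X_labeled, y_labeled, x, lam=0.01):
    """Predict a label for x via an l2-regularized linear representation
    over all labeled samples (illustrative sketch, not the paper's exact model).

    X_labeled : (n_samples, n_features) labeled data
    y_labeled : (n_samples,) class labels
    x         : (n_features,) query sample
    """
    D = np.asarray(X_labeled, dtype=float).T  # columns are labeled samples
    # Ridge (l2) solution: alpha = (D^T D + lam I)^{-1} D^T x
    alpha = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ x)
    residuals = {}
    for c in np.unique(y_labeled):
        # keep only the coefficients belonging to class c
        alpha_c = np.where(y_labeled == c, alpha, 0.0)
        residuals[c] = np.linalg.norm(x - D @ alpha_c)
    # assign the class whose samples reconstruct x best
    return min(residuals, key=residuals.get)

# Toy example: two classes along nearly orthogonal directions
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
y = np.array([0, 0, 1, 1])
pred = l2_label_prediction(X, y, np.array([0.95, 0.05]))
```

Unlike an l1 (sparse) solver, this closed-form ridge step needs only one linear solve, which is one reason l2 schemes are often cited as faster and more stable for face data.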