Collaborative filtering recommender systems (CFRSs) are key components of successful e-commerce systems. Because of their openness, however, CFRSs are highly vulnerable to attacks. Moreover, since attack profiles are far fewer than genuine user profiles, conventional detection methods based on supervised learning can be too "dull" to handle such an imbalanced classification problem. In this paper, we improve detection performance in two respects. First, we extract well-designed features from user profiles based on the statistical properties of diverse attack models, which makes the hard classification task easier. Second, drawing on the general idea of re-scale Boosting (RBoosting) and AdaBoost, we apply a variant of AdaBoost, called re-scale AdaBoost (RAdaBoost), as our detection method on the extracted features. RAdaBoost is comparable to the optimal Boosting-type algorithm and can effectively improve performance in some hard scenarios. Finally, a series of experiments on the MovieLens-100K data set demonstrates that RAdaBoost outperforms classical techniques such as SVM, kNN and AdaBoost.
2[29] and AdaBoost [9, 10], we apply a variant of the Boosting algorithm, called re-scale AdaBoost (RAdaBoost), as our detection method on the extracted features. RBoosting has been shown, both theoretically and experimentally, to outperform the classical Boosting algorithm [17]. Furthermore, the numerical convergence of RBoosting has been shown to be theoretically near optimal among all variants of Boosting-type algorithms. This means that, if the parameter is appropriately selected, RBoosting is comparable to the optimal Boosting-type algorithm. AdaBoost [9, 10], in turn, is one of the most popular ensemble paradigms and has been shown to be very effective in practice in some hard scenarios [13]. Typically, AdaBoost employs a re-weighted loss function that gradually increases the emphasis (or weights) on misclassified samples (here, the attackers of interest) and can markedly improve predictive performance on a difficult data set. Thus, with the help of the re-scale operator, RAdaBoost can be used in conjunction with many other types of learning algorithms (or weak learners) to improve performance in "shilling" attack detection. Finally, a series of experiments on the MovieLens-100K data set demonstrates that RAdaBoost outperforms conventional classification techniques such as SVM, kNN and the original AdaBoost without re-scaling, in terms of classification error, detection rate and false alarm rate. The experimental results show that RAdaBoost can effectively improve detection performance.
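To make the combination of re-weighting and re-scaling concrete, the following is a minimal sketch, not the exact procedure evaluated in this paper. It assumes labels in {-1, +1} with +1 marking an attack profile, decision stumps as weak learners, and a shrinkage factor of 1 - 2/(k + u) applied to the accumulated ensemble before each round; the function names, the parameter u and this particular form of the factor are illustrative assumptions. The three reported measures are computed with their usual definitions.

```python
import numpy as np

def stump_predict(X, j, thr, pol):
    """Predict +1 where pol * (X[:, j] - thr) >= 0, otherwise -1."""
    return np.where(pol * (X[:, j] - thr) >= 0, 1, -1)

def fit_stump(X, y, w):
    """Exhaustively find the stump with the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)  # (weighted error, feature, threshold, polarity)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                err = np.sum(w[stump_predict(X, j, thr, pol) != y])
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def fit_rescaled_adaboost(X, y, n_rounds=20, u=10):
    """Return a list of (coefficient, feature, threshold, polarity) tuples."""
    F = np.zeros(len(y))  # accumulated ensemble score on the training set
    model = []
    for k in range(1, n_rounds + 1):
        shrink = 1.0 - 2.0 / (k + u)  # assumed re-scale factor (needs u > 1)
        F *= shrink                   # shrink the current ensemble ...
        model = [(c * shrink, j, t, p) for c, j, t, p in model]
        w = np.exp(-y * F)            # ... then re-weight as in AdaBoost
        w /= w.sum()
        err, j, thr, pol = fit_stump(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)   # avoid log(0) on perfect stumps
        beta = 0.5 * np.log((1 - err) / err)   # exponential-loss step size
        F += beta * stump_predict(X, j, thr, pol)
        model.append((beta, j, thr, pol))
    return model

def predict(model, X):
    """Sign of the weighted vote of all stumps."""
    score = sum(c * stump_predict(X, j, t, p) for c, j, t, p in model)
    return np.where(score >= 0, 1, -1)

def detection_metrics(y_true, y_pred):
    """Classification error, detection rate and false alarm rate."""
    attackers = y_true == 1
    genuine = ~attackers
    return {
        "error": float(np.mean(y_pred != y_true)),
        "detection_rate": float(np.mean(y_pred[attackers] == 1)),
        "false_alarm_rate": float(np.mean(y_pred[genuine] == 1)),
    }
```

In this sketch the only difference from plain AdaBoost is the `shrink` step: if it is fixed to 1.0, the loop reduces to the standard algorithm, which makes the two variants easy to compare on the same extracted features.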