Online auctions have become one of the most convenient venues for fraud because of the large amounts of money traded every day. Shill bidding is the predominant form of auction fraud, and it is also the most difficult to detect because it so closely resembles normal bidding behavior. Furthermore, shill bidding leaves behind no apparent evidence and is a relatively easy way to cheat innocent buyers. Our goal is to develop a classification model that efficiently differentiates between legitimate bidders and shill bidders. Our study uses a real training dataset, but the data are unlabeled. We therefore first label the shill bidding samples by combining a robust hierarchical clustering technique with a semi-automated labeling approach. Since shill bidding datasets are imbalanced, we then assess advanced over-sampling, under-sampling and hybrid-sampling methods and compare their performance across several classification algorithms. The optimal shill bidding classifier achieves a high detection rate and a low misclassification rate for fraudulent activities.
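To illustrate why sampling matters for imbalanced shill bidding data, the sketch below shows plain random over-sampling, the simplest baseline of the over-sampling family. It is not one of the advanced methods the abstract evaluates, which it does not name. The function name `random_oversample`, the toy feature tuples and the label encoding (1 = shill, 0 = normal) are assumptions made for illustration only.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes
    until every class is as large as the largest one."""
    rng = random.Random(seed)

    # Group samples by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    target = max(len(xs) for xs in by_class.values())

    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw (with replacement) as many extra samples as the class is missing.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Hypothetical toy data: two made-up features per bidder.
X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.3), (0.3, 0.2), (0.25, 0.1), (0.9, 0.8)]
y = [0, 0, 0, 0, 0, 1]  # 1 = shill bidder (minority), 0 = normal bidder

X_bal, y_bal = random_oversample(X, y)
counts = Counter(y_bal)
print(counts)
```

Random duplication balances the class counts but adds no new information, which is one reason more elaborate over-sampling, under-sampling and hybrid schemes are worth comparing.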
For detecting malicious bidding activities in e-auctions, this study develops a chunk-based incremental learning framework that can operate in real-world auction settings. The self-adaptive framework first classifies incoming bidder chunks to counter fraud in each auction and take necessary actions. The fraud classifier is then adjusted with confident bidders' labels validated via bidder verification and one-class classification. Based on real fraud data produced from commercial auctions, we conduct an extensive experimental study wherein the classifier is adapted incrementally using only relevant bidding data while evaluating the subsequent adjusted models' detection and misclassification rates. We also compare our classifier with static learning and learning without data relevancy.
KEYWORDS: chunk-based incremental learning, fraud detection, imbalanced data, incremental memory model, incremental SGD, one-class SVM
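The classify-then-adjust loop described above can be sketched with a small logistic-regression model updated by stochastic gradient descent one chunk at a time. This is a minimal sketch under stated assumptions, not the authors' framework: the class `IncrementalSGDClassifier`, the learning rate, the epoch count and the toy chunks are all hypothetical, and the bidder verification and one-class classification steps are replaced by the assumption that each chunk arrives with already validated labels.

```python
import math


class IncrementalSGDClassifier:
    """Logistic regression trained incrementally, one chunk at a time."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() numerically safe
        return 1.0 / (1.0 + math.exp(-z))

    def predict(self, x):
        return 1 if self.predict_proba(x) >= 0.5 else 0

    def partial_fit(self, chunk_x, chunk_y, epochs=50):
        """Update the existing weights using only the new chunk."""
        for _ in range(epochs):
            for x, y in zip(chunk_x, chunk_y):
                err = self.predict_proba(x) - y  # gradient of the log loss w.r.t. z
                self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
                self.b -= self.lr * err


# Hypothetical chunks of (features, validated label); 1 = fraud, 0 = normal.
chunks = [
    ([(0.1, 0.2), (0.9, 0.8), (0.2, 0.1), (0.8, 0.9)], [0, 1, 0, 1]),
    ([(0.15, 0.1), (0.85, 0.9), (0.1, 0.1), (0.9, 0.9)], [0, 1, 0, 1]),
]

model = IncrementalSGDClassifier(n_features=2)
for chunk_x, chunk_y in chunks:
    # Step 1: classify the incoming chunk with the current model.
    predictions = [model.predict(x) for x in chunk_x]
    # Step 2: adjust the model with the chunk's validated labels.
    model.partial_fit(chunk_x, chunk_y)

print(model.predict((0.95, 0.9)), model.predict((0.05, 0.1)))
```

Because `partial_fit` starts from the current weights rather than retraining from scratch, earlier chunks do not need to be kept in memory, which is the main practical difference from static learning.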