Nowadays, almost every task involving Web traversal and information retrieval relies on Web robots. Web robots are software programs that automatically traverse the Web's hypertext structure. They have proliferated alongside the growth of the Web and are valuable not only to the large search engines but also to many specialized services such as investment portals and competitive intelligence tools. While many Web robots serve useful purposes, some have recently been linked to fraudulent activities. Click fraud, the act of generating illegitimate clicks, is one of them. This paper details the architecture and functionality of Smart ClickBot, a sophisticated software bot designed to commit click fraud. It was first detected and reported in March 2010 by NetMosaics Inc., a provider of real-time click fraud detection and prevention solutions. We discuss the machine learning algorithms used to identify all clicks exhibiting Smart ClickBot-like patterns. We constructed a Bayesian classifier that automatically classifies server log data as Smart ClickBot traffic or not. We also introduce a benchmark data set for Smart ClickBot. We disclose the results of our investigation of this bot to educate the security research community and to provide information about the novel aspects of the attack.
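To make the idea of a Bayesian classifier over server log data concrete, the following is a minimal sketch, not the classifier described in the paper. It assumes a categorical naive Bayes model with Laplace smoothing; the feature names (`user_agent`, `referrer`, `interval`), their values, and the toy training records are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesLogClassifier:
    """Categorical naive Bayes with Laplace smoothing (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        # (label, feature name) -> Counter of values observed for that label
        self.value_counts = defaultdict(Counter)
        # feature name -> set of all values seen during training
        self.vocab = defaultdict(set)

    def fit(self, records, labels):
        for record, label in zip(records, labels):
            self.class_counts[label] += 1
            for name, value in record.items():
                self.value_counts[(label, name)][value] += 1
                self.vocab[name].add(value)
        return self

    def log_posteriors(self, record):
        """Return unnormalized log posterior scores, one per class."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            for name, value in record.items():
                counts = self.value_counts[(label, name)]
                # The extra +1 reserves probability mass for unseen values.
                denom = sum(counts.values()) + self.alpha * (len(self.vocab[name]) + 1)
                score += math.log((counts[value] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, record):
        scores = self.log_posteriors(record)
        return max(scores, key=scores.get)


# Hypothetical, hand-made log records; not taken from the benchmark data set.
records = [
    {"user_agent": "browser", "referrer": "search", "interval": "slow"},
    {"user_agent": "browser", "referrer": "direct", "interval": "slow"},
    {"user_agent": "browser", "referrer": "search", "interval": "fast"},
    {"user_agent": "script", "referrer": "none", "interval": "fast"},
    {"user_agent": "browser", "referrer": "none", "interval": "fast"},
    {"user_agent": "script", "referrer": "none", "interval": "fast"},
]
labels = ["human", "human", "human", "bot", "bot", "bot"]

model = NaiveBayesLogClassifier().fit(records, labels)
suspect = {"user_agent": "script", "referrer": "none", "interval": "fast"}
print(model.predict(suspect))
```

In practice the features would be derived from real server log fields and the labels from a curated data set; the sketch only shows how per-feature likelihoods and a class prior combine into a decision.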
Multi-sensor data fusion has been an area of intense recent research and development activity. The concept has been applied to numerous fields, and new applications are constantly being explored. The multi-sensor Collaborative Click Fraud Detection and Prevention (CCFDP) system can be viewed as an evidence fusion problem. In this paper we detail the multi-level data fusion mechanism used in CCFDP for real-time click fraud detection and prevention. Prevention mechanisms are based on blocking suspicious traffic by IP, referrer, city, country, ISP, etc. Our system maintains an online database of these suspicious parameters. We tested the system with real-world data from an actual ad campaign; the results show that multi-level data fusion improves the quality of click fraud analysis.
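As a rough illustration of evidence fusion feeding a blocklist, the sketch below combines per-sensor evidence with Dempster's rule of combination and blocks a click's IP and referrer once the fused belief in fraud crosses a threshold. The abstract does not state which fusion rule CCFDP uses, so Dempster's rule is an assumption here, as are the `Blocklist` class, the 0.8 threshold, the sensor mass values, and the in-memory sets standing in for the online database.

```python
from functools import reduce


def combine(m1, m2):
    """Dempster's rule on the frame {fraud, legit}.

    Each mass function is a dict with keys 'fraud', 'legit', and 'unknown',
    where 'unknown' is the mass assigned to the whole frame.
    """
    conflict = m1["fraud"] * m2["legit"] + m1["legit"] * m2["fraud"]
    if conflict >= 1.0:
        raise ValueError("sensors are in total conflict; cannot combine")
    norm = 1.0 - conflict
    fraud = (m1["fraud"] * m2["fraud"] + m1["fraud"] * m2["unknown"]
             + m1["unknown"] * m2["fraud"]) / norm
    legit = (m1["legit"] * m2["legit"] + m1["legit"] * m2["unknown"]
             + m1["unknown"] * m2["legit"]) / norm
    unknown = m1["unknown"] * m2["unknown"] / norm
    return {"fraud": fraud, "legit": legit, "unknown": unknown}


def fuse(masses):
    """Fold any number of sensor mass functions into one."""
    return reduce(combine, masses)


class Blocklist:
    """Hypothetical in-memory stand-in for a database of suspicious parameters."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.blocked = {"ip": set(), "referrer": set()}

    def observe(self, click, masses):
        """Fuse the evidence for one click and block its parameters if suspicious."""
        belief = fuse(masses)["fraud"]
        if belief >= self.threshold:
            for key in self.blocked:
                self.blocked[key].add(click[key])
        return belief

    def is_blocked(self, click):
        return any(click[key] in values for key, values in self.blocked.items())


# Illustrative evidence from two hypothetical sensors for the same click.
sensor_a = {"fraud": 0.6, "legit": 0.1, "unknown": 0.3}
sensor_b = {"fraud": 0.7, "legit": 0.1, "unknown": 0.2}

blocklist = Blocklist(threshold=0.8)
click = {"ip": "203.0.113.7", "referrer": "ads.example"}
belief = blocklist.observe(click, [sensor_a, sensor_b])
print(round(belief, 3), blocklist.is_blocked(click))
```

Neither sensor alone reaches the threshold in this example, but the fused belief does, which is the kind of gain that combining evidence across sensors is meant to provide.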