2009
DOI: 10.1007/978-3-642-02490-0_66

An Evaluation of Machine Learning-Based Methods for Detection of Phishing Sites

Abstract: In this paper, we evaluate the performance of machine learning-based methods for detection of phishing sites. In our previous work [1], we attempted to employ a machine learning technique to improve the detection accuracy. Our preliminary evaluation showed that the AdaBoost-based detection method can achieve higher detection accuracy than the traditional detection method. Here, we evaluate the performance of 9 machine learning techniques including AdaBoost, Bagging, Support Vector Machines, Classification …

Cited by 64 publications (41 citation statements)
References 5 publications
“…We have two reasons of employing AdaBoost. One is that it had performed better in our previous comparative study [5], where it demonstrated the lowest error rate, the highest f1 measure, and the highest AUC of the AdaBoost-based detection method, as mentioned in Section 2. The other is that we expect AdaBoost to cover each user's weak points. Assuming that a user's trust decision can be treated as a classifier, AdaBoost would cover users' weak points by assigning high weights to heuristics that can correctly judge a site that the user is likely to misjudge.…”
Section: Theoretical Background
Citation type: mentioning
confidence: 88%
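The weighting idea in the statement above can be illustrated with a minimal sketch of discrete AdaBoost, in which each weak classifier is a single heuristic. Everything in it is an assumption made for illustration: the function names `train_adaboost` and `predict`, the encoding of heuristic outputs and labels as +1 (phishing) and -1 (legitimate), and the six-site toy data. It is not the cited authors' implementation and does not use their heuristics or data.

```python
import math

def train_adaboost(X, y, rounds=3):
    """Discrete AdaBoost where every weak learner is one heuristic column.

    X: list of heuristic-output vectors with entries in {-1, +1} (hypothetical encoding).
    y: list of labels in {-1, +1}, where +1 means phishing.
    Returns a list of (alpha, heuristic_index, polarity) tuples.
    """
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    model = []
    for _ in range(rounds):
        # Pick the heuristic (and polarity) with the lowest weighted error.
        best = None
        for j in range(len(X[0])):
            for polarity in (1, -1):
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if polarity * xi[j] != yi)
                if best is None or err < best[0]:
                    best = (err, j, polarity)
        err, j, polarity = best
        if err >= 0.5:  # no heuristic is better than chance; stop early
            break
        err = max(err, 1e-10)  # avoid division by zero for a perfect heuristic
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, polarity))
        # Increase the weight of misjudged sites, decrease the rest, renormalise.
        w = [wi * math.exp(-alpha * yi * polarity * xi[j])
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Weighted vote of the selected heuristics; +1 means phishing."""
    score = sum(alpha * polarity * x[j] for alpha, j, polarity in model)
    return 1 if score >= 0 else -1

# Toy data: three hypothetical heuristics, each wrong on a different phishing site.
X = [[1, 1, -1], [1, -1, 1], [-1, 1, 1],
     [-1, -1, -1], [-1, -1, -1], [-1, -1, -1]]
y = [1, 1, 1, -1, -1, -1]
model = train_adaboost(X, y, rounds=3)
predictions = [predict(model, x) for x in X]
```

In this toy setting no single heuristic classifies every site correctly, but after each round the misjudged site gains weight, so the next round favours a heuristic that handles it. The combined vote then labels all six toy sites correctly.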
“…Our previous work [5] employed nine machine learning techniques for detecting phishing sites. By employing eight heuristics presented by CANTINA, we analyzed 3000 URLs, consisting of 1500 legitimate sites and the same number of phishing sites, reported on PhishTank.com [6] from November 2007 to February 2008.…”
Section: Detection Methods For Phishing Sites
Citation type: mentioning
confidence: 99%
“…In contrast to the blacklist method, a heuristic based solution can recognize freshly created phishing websites in real time (Miyamoto, Hazeyama and Kadobayashi 2008). The effectiveness of the heuristic based methods, sometimes called features-based methods, depends on picking a set of discriminative features that could help in distinguishing the type of website (Guang, Jason, et al 2011).…”
Section: Technical Solution
Citation type: mentioning
confidence: 99%
“…In 2010, a survey presented in [7] aimed to evaluate the performance of machine-learning-based-detection-methods including: "AdaBoost, Bagging, SVM, Classification and Regression Trees, Logistic Regression, Random Forests, NN, Naive Bayes and Bayesian Additive Regression Trees" showed that 7 out of 9 of machine-learning-based-detection-methods outperformed CANTINA in predicting phishing websites those are: "AdaBoost, Bagging, Logistic Regression, Random Forests, Neural Networks, Naive Bayes and Bayesian Additive Regression Trees". A dataset consisting of 1500 phishing websites and 1500 legitimate websites used in the experiments.…”
Section: Tf-idf (Term Frequency-inverse Document Frequency)
Citation type: mentioning
confidence: 99%
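The statement above does not say by which measure the methods outperformed CANTINA; another citing statement names error rate, f1 measure, and AUC. As a rough sketch of the first two, the following computes them from true and predicted labels. The helper names (`confusion_counts`, `error_rate`, `f1_measure`) and the eight toy labels are hypothetical and are not results from the paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives, treating `positive` as phishing."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def error_rate(y_true, y_pred):
    """Fraction of sites that were misclassified."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (fp + fn) / (tp + fp + tn + fn)

def f1_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for the phishing class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy labels (1 = phishing, 0 = legitimate); made up for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
err = error_rate(y_true, y_pred)
f1 = f1_measure(y_true, y_pred)
```

On a balanced set such as the 1500 phishing and 1500 legitimate sites described above, error rate and f1 tend to tell a similar story; they diverge more when one class dominates.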
“…In contrast to the blacklist method, a heuristic-based solution can recognize freshly created phishing websites in real-time [7]. The effectiveness of the heuristic-based methods, sometimes called features-based methods, depends on picking a set of discriminative features that could help in distinguishing the type of website [8].…”
Section: Introduction
Citation type: mentioning
confidence: 99%