2020 International Joint Conference on Neural Networks (IJCNN) 2020
DOI: 10.1109/ijcnn48605.2020.9207707

HTMLPhish: Enabling Phishing Web Page Detection by Applying Deep Learning Techniques on HTML Analysis

Abstract: Recently, the development and implementation of phishing attacks require little technical skills and costs. This uprising has led to an ever-growing number of phishing attacks on the World Wide Web. Consequently, proactive techniques to fight phishing attacks have become extremely necessary. In this paper, we propose HTMLPhish, a deep learning based data-driven end-to-end automatic phishing web page classification approach. Specifically, HTMLPhish receives the content of the HTML document of a web page and empl…

Cited by 47 publications (23 citation statements)
References 19 publications (23 reference statements)

“…Although both models have the HTML feature extraction process, the presented model is not using any URL feature extraction with the use of expert knowledge, which is another benefit getting over the benchmarked model. The latest approach introduced to the phishing area is the HTMLPhish (Opara et al, 2019). It achieved the detection accuracy of 97.2%, and that accuracy is also low compared to proposed model accuracy.…”
Section: Results (mentioning)
confidence: 97%
“…As a solution for this manual feature extraction, deep learning techniques were tried out to implement automated feature extraction processes in the past. HTMLPhish (Opara et al, 2019) was such an attempt that used Recurrent Neural Network (RNN) to automated feature extraction process from HTML pages. It used only HTML pages in the detection process and achieved 97.2% detection accuracy.…”
Section: Software-based Detection (mentioning)
confidence: 99%
“…Opara et al [10] proposed the use of characters embedding and string embedding techniques to represent features of each HTML, then this representation is used as input to a Convolutional Neural Network (CNN) in order to model semantic dependencies. They collect their own data from Alexa and Phishtank, reporting two sets of data, the first one with 23000 legitimate websites and 2300 phishing websites used for training, and the second one with 24000 legitimate websites and 2400 phishing websites used for testing, these datasets are not available.…”
Section: Automatic Features (mentioning)
confidence: 99%
“…The URL and HTML strings are tokenized using a character corpus that includes punctuation marks, then, this tokenized data is processed into a character embedding matrix. They use the datasets presented in their previous work, [10], reporting an accuracy of 98.00% and an F1 score of 98.00%.…”
Section: Automatic Features (mentioning)
confidence: 99%
“…Many studies focus on the detection of desktop malicious webpages [7‐10]. These existing solutions can effectively detect the malicious web pages on the desktop devices.…”
Section: Introduction (mentioning)
confidence: 99%