2021
DOI: 10.1016/j.cose.2021.102372

Phishing websites detection via CNN and multi-head self-attention on imbalanced datasets

citations
Cited by 47 publications
(15 citation statements)
references
References 18 publications
“…Ultimately, the authors of [13] attempted to boost that detection accuracy rate through a blended approach of DNN and features weighting algorithms like genetic algorithm (GA) to classify phish websites by their most exploiting features. While researchers of [14], applied a multi-headed and self-attentional CNN on an imbalanced dataset throughout a generative adversarial network (GAN) with a large number of URL features. However, their work fell short of fixing the length of examined URL strings among other URL features.…”
Section: Literature Review (mentioning)
confidence: 99%
“…The results gave a 97 % accuracy for proposed model. Different from the other researchers, the authors in [19] produced a phishing URL to balance the dataset with the GAN. They created a dataset that contained 68,030 legitimate URLs and 12,003 phishing URLs from PhishTank.…”
Section: Related Work (mentioning)
confidence: 99%
“…However, current learning-based methods tend to model the entire request message as streaming data [3][4][5][6][7][8][9][28][29][30][31][32][33][34], causing the individual presence of a sensitive path or malicious payload to be regarded as the prevailing decision-making factor. As existing methods neglect the implicit processing syntax and scenario-related characteristics, they cannot estimate the attack feasibility when conducting detection on captured malicious requests, which might incur further massive numbers of false alerts, especially during real-world deployment.…”
Section: HTTP Request Structure (mentioning)
confidence: 99%