2014
DOI: 10.1007/978-3-319-12160-4_27

Semantic Feature Selection for Text with Application to Phishing Email Detection

Cited by 23 publications
(18 citation statements)
References 19 publications
“…The best phishing emails are hard to detect and tend to avoid detection measures, and smart attackers send these emails to people who are easy targets (e.g., non IT people). In the literature [16,11,14,7,9,13,15], nobody could detect 100% of phishing emails, this shows the limit of the current automatic detection systems. Hence, it is better to include the user either in the detection process, or at least by just sending him a warning to make him pay more attention [10,5].…”
Section: Hypothesis; citation type: mentioning
confidence: 99%
“…This work differs from theirs in several respects: we consider many other classifiers than they do and we analyze more lexical features, including the character distributions of the URLs. More on phishing email detection can be found in [24,23].…”
Section: Related Research; citation type: mentioning
confidence: 99%
“…We notice from this table that most authors use relatively balanced ratios (e.g., 4 to 6 or 4.5 to 5.5). Examples of research in the literature that used unbalanced datasets with different ratios are [161], [165], [167], [169], [175].…”
Section: Dataset Properties 1) Dataset Sources and Availability; citation type: mentioning
confidence: 99%
“…A few papers also leverage semantics to increase the robustness of the features and increase the performance of classifiers. The following papers [161], [162], [171], [190], [192] reported the use of WordNet 26 to enrich the textual features. Wordnet is a large lexical dataset for English and is used to identify the semantic relationships of the tokens used as features.…”
Section: F Selected Email Detection Literature; citation type: mentioning
confidence: 99%