2011 IEEE International Conference on Information Reuse & Integration
DOI: 10.1109/iri.2011.6009529

Information extraction from spam emails using stylistic and semantic features to identify spammers

Cited by 11 publications (15 citation statements)
References 3 publications (4 reference statements)

“…First, the Trfr(Term frequency) and Indfr(Inverse Document frequency) for the top n most frequent words used in the dataset and second, the count of the top n bigrams used in the dataset, where n is the number that is decided based upon the cutoff of the minimum frequency count. Trfr & Indfr is a statistical measure that can be used to represent the importance of a term in a document [1]. It first remove all the stop words from the emails and then Trfr-Indfr can be calculated.…”
Section: Semantic Parameters (mentioning)
confidence: 99%
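
The feature extraction described in the statement above (stop-word removal, term frequency and inverse document frequency for frequent words, and counts of frequent bigrams) can be sketched roughly as follows. This is a minimal illustration and not the cited paper's implementation: the stop-word list, the whitespace tokenizer, the function names, the `min_count` cutoff, and the sample emails are all assumptions made for the example, and the weighting uses the common tf * log(N / df) form, which may differ from the authors' variant.

```python
import math
from collections import Counter

# Toy stop-word list (assumption; a real system would use a fuller list).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}


def tokenize(text):
    """Lowercase, split on whitespace, keep alphabetic non-stop words."""
    return [w for w in text.lower().split()
            if w.isalpha() and w not in STOP_WORDS]


def frequent_terms(docs_tokens, min_count):
    """Terms whose total count across the dataset meets the cutoff."""
    counts = Counter(t for doc in docs_tokens for t in doc)
    return sorted(t for t, c in counts.items() if c >= min_count)


def tf_idf(docs_tokens, vocab):
    """One {term: weight} dict per document, weight = tf * log(N / df)."""
    n_docs = len(docs_tokens)
    # Document frequency: number of documents containing each term.
    df = {t: sum(1 for doc in docs_tokens if t in doc) for t in vocab}
    rows = []
    for doc in docs_tokens:
        tf = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        rows.append({t: (tf[t] / total) * math.log(n_docs / df[t])
                     for t in vocab})
    return rows


def frequent_bigrams(docs_tokens, min_count):
    """Counts of adjacent word pairs that meet the cutoff."""
    counts = Counter(pair for doc in docs_tokens
                     for pair in zip(doc, doc[1:]))
    return {bg: c for bg, c in counts.items() if c >= min_count}


# Hypothetical sample emails, for illustration only.
emails = [
    "Buy cheap pills online",
    "Cheap pills for sale online",
    "Meeting agenda for the project",
]
docs = [tokenize(e) for e in emails]
vocab = frequent_terms(docs, min_count=2)
weights = tf_idf(docs, vocab)
bigrams = frequent_bigrams(docs, min_count=2)
```

Here the frequency cutoff plays the role of choosing n: only terms and bigrams that occur at least `min_count` times in the dataset become features.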
“…It is required to measure cluster quality. The purity percentage is evaluated by following equation [1]:…”
Section: Purity (mentioning)
confidence: 99%
“…Various approaches to feature extraction have been researched, with mixed results. Recent developments suggest that both semantic and statistical features can be used to cluster email text [10], and this technique is explored further in this paper.…”
Section: Definitions and Problem Statement (mentioning)
confidence: 99%