2018
DOI: 10.1016/j.ins.2017.10.049

Concept drift in e-mail datasets: An empirical study with practical implications

Cited by 28 publications
(15 citation statements)
references
References 25 publications
“…The calculation details will be introduced as follows. Note that the usage of word stemming (Step 3) would cause a reduction in the performance of classification techniques owing to the generation of over‐stemming and under‐stemming errors [41]. The adoption of word ontologies such as WordNet or other alternatives can substitute for the usage of word stemming [42, 43].…”
Section: Effective Two‐phase Hybrid Spam Filtering Methods Integrati… | mentioning
confidence: 99%
“…Naïve Bayes algorithm [16–34] is simple and based on probability theorem. It [23] is a statistical classifier also known as the Naive Bayesian classifier that can be used for classification of data.…”
Section: Naïve Bayes (Nb) | mentioning
confidence: 99%
“…The performance measures [19–38] play a very important role in checking the robustness of a model. We have calculated accuracy, sensitivity, specificity, precision, F-score and ROC curve using different parameters of the confusion matrix.…”
Section: Performance Measures | mentioning
confidence: 99%
“…The concept drift phenomenon may occur in various real-world data and corresponding applications (Žliobaite, Pechenizkiy & Gama, 2016): computer systems or networks, through network intrusion detection, where new techniques and methods may appear (Liu et al, 2017; Mukkavilli & Shetty, 2012); industry, when dynamic data streams are produced by sensors in production equipment and machines (Lin et al, 2019; Zenisek, Holzinger & Affenzeller, 2019); marketing and management, when users change their buying behavior and their preferences (Black & Hickey, 2003; Chiang, Wang & Chu, 2013; Lo et al, 2018); medical data, e.g., in the case of antibiotic resistance (Stiglic & Kokol, 2011; Tsymbal et al, 2006); social networks, when users change their behavior and generated content (Lifna & Vijayalakshmi, 2015; Li et al, 2016); spam categorization, where spam keywords can change over time (Delany et al, 2005; Ruano-Ordás, Fdez-Riverola & Méndez, 2018).…”
Section: Introduction | mentioning
confidence: 99%
“…spam categorization, where spam keywords can change over time (Delany et al, 2005; Ruano-Ordás, Fdez-Riverola & Méndez, 2018).…”
Section: Introduction | mentioning
confidence: 99%