2014
DOI: 10.1007/978-3-662-44415-3_5

Poisoning Complete-Linkage Hierarchical Clustering

Abstract: Clustering algorithms are largely adopted in security applications as a vehicle to detect malicious activities, although little attention has been paid to preventing deliberate attacks from subverting the clustering process itself. Recent work has introduced a methodology for the security analysis of data clustering in adversarial settings, aimed at identifying potential attacks against clustering algorithms and evaluating their impact. The authors have shown that single-linkage hierarchical clustering ca…

Citations: cited by 41 publications (44 citation statements)
References: 14 publications (35 reference statements)
“…For instance, it has been shown that it is possible to gradually poison a spam filter, an intrusion detection system, and even a biometric verification system (in general, a classification algorithm) by exploiting update mechanisms that enable the adversary to manipulate some of the training data [5][6][7][8][9][10][11][12][13]; and that the detection of malicious samples by linear and even some classes of non-linear classifiers can be evaded with few targeted manipulations that reflect a proper change in their feature values [14,13,15,16,17]. Recently, poisoning and evasion attacks against clustering algorithms have also been formalized to show that malware clustering approaches can be significantly vulnerable to well-crafted attacks [18,19].…”
Section: Introduction (mentioning)
confidence: 99%
“…4, 14, 15. Given the central role of HC in cyber-security it is critical to understand and design around the vulnerabilities of hierarchical clustering methods. An interesting set of articles by Biggio et al 4,16,17 highlights a major vulnerability in HC: sensitivity to poisoning attacks. Biggio et al 4 emphasizes the centrality of clustering of malware families in the identification of common characteristics and the design of suitable countermeasures.…”
Section: Introduction (mentioning)
confidence: 99%
“…4 Their analysis provides convincing poisoning schemes against other existing algorithms. 16,17 In this paper, we test the hypothesis that, by employing natural entropy based diversity measures developed here, one could counter poisoning attacks against the SLHC method using a fairly simple reactive control mechanism based on allowing only small variations in the above measures for each time step. The idea is that large variations of the measure may only occur as a result of very coarse-grained alterations in the underlying hierarchy.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, regardless the behavioral features being used, all these proposals suffer from the same vulnerability: clustering algorithms have not been originally devised to deal with data from an adversary. As outlined in recent work [4,9], this may allow an attacker to devise carefully-crafted attacks that can significantly compromise the clustering process itself, and invalidate subsequent analyses.…”
Section: Introduction (mentioning)
confidence: 99%