Poisoning Complete-Linkage Hierarchical Clustering

Biggio, Battista; Bulò, Samuel Rota; Pillai, Ignazio; Mura, M.; Mequanint, Eyasu Zemene; Pelillo, Marcello; Roli, Fabio

doi:10.1007/978-3-662-44415-3_5

Cited by 41 publications

(44 citation statements)

References 14 publications

(35 reference statements)

“…For instance, it has been shown that it is possible to gradually poison a spam filter, an intrusion detection system, and even a biometric verification system (in general, a classification algorithm) by exploiting update mechanisms that enable the adversary to manipulate some of the training data [5][6][7][8][9][10][11][12][13]; and that the detection of malicious samples by linear and even some classes of non-linear classifiers can be evaded with few targeted manipulations that reflect a proper change in their feature values [14,13,15,16,17]. Recently, poisoning and evasion attacks against clustering algorithms have also been formalized to show that malware clustering approaches can be significantly vulnerable to well-crafted attacks [18,19].…”

Section: Introduction (mentioning)

confidence: 99%

Support vector machines under adversarial label contamination

et al. 2015

Self Cite

“…4, 14, 15. Given the central role of HC in cyber-security it is critical to understand and design around the vulnerabilities of hierarchical clustering methods. An interesting set of articles by Biggio et al 4,16,17 highlights a major vulnerability in HC: sensitivity to poisoning attacks. Biggio et al 4 emphasizes the centrality of clustering of malware families in the identification of common characteristics and the design of suitable countermeasures.…”

Section: Introduction (mentioning)

confidence: 99%

“…4 Their analysis provides convincing poisoning schemes against other existing algorithms. 16,17 In this paper, we test the hypothesis that, by employing natural entropy based diversity measures developed here, one could counter poisoning attacks against the SLHC method using a fairly simple reactive control mechanism based on allowing only small variations in the above measures for each time step. The idea is that large variations of the measure may only occur as a result of very coarse-grained alterations in the underlying hierarchy.…”

Section: Introduction (mentioning)

confidence: 99%

Detecting poisoning attacks on hierarchical malware classification systems

et al. 2017

Anti-virus software based on unsupervised hierarchical clustering (HC) of malware samples has been shown to be vulnerable to poisoning attacks. In this kind of attack, a malicious player degrades anti-virus performance by submitting to the database samples specifically designed to collapse the classification hierarchy utilized by the anti-virus (and constructed through HC) or otherwise deform it in a way that would render it useless. Though each poisoning attack needs to be tailored to the particular HC scheme deployed, existing research seems to indicate that no particular HC method by itself is immune. We present results on applying a new notion of entropy for combinatorial dendrograms to the problem of controlling the influx of samples into the database and deflecting poisoning attacks. In a nutshell, effective and tractable measures of change in hierarchy complexity are derived from this notion, enabling on-the-fly flagging and rejection of potentially damaging samples. The information-theoretic underpinnings of these measures ensure their indifference to which particular poisoning algorithm is being used by the attacker, rendering them particularly attractive in this setting.
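The flag-and-reject idea in this abstract can be illustrated with a minimal Python sketch. It is not the paper's method: the dendrogram entropy defined there is replaced by a stand-in (the Shannon entropy of cluster sizes after cutting a single-linkage hierarchy at a fixed height), and the names `single_linkage_clusters`, `size_entropy`, `accept_sample`, `cut` and `max_delta` are hypothetical, chosen only for this illustration.

```python
import math
from itertools import combinations


def single_linkage_clusters(points, cut):
    """Return sorted cluster sizes after cutting a single-linkage hierarchy at `cut`.

    Two points share a cluster when a chain of points with consecutive
    distances <= cut connects them (union-find over close pairs).
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= cut:
            parent[find(i)] = find(j)

    sizes = {}
    for i in range(len(points)):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values())


def size_entropy(sizes):
    """Shannon entropy (bits) of the cluster-size distribution.

    Stand-in complexity measure, NOT the entropy defined in the paper.
    """
    total = sum(sizes)
    return -sum((s / total) * math.log2(s / total) for s in sizes)


def accept_sample(points, candidate, cut, max_delta):
    """Accept `candidate` only if it changes the stand-in measure by at most `max_delta`."""
    before = size_entropy(single_linkage_clusters(points, cut))
    after = size_entropy(single_linkage_clusters(points + [candidate], cut))
    return abs(after - before) <= max_delta


# Two well-separated groups of three points each (made-up data).
data = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (2.0, 0.0), (2.5, 0.0), (2.0, 0.5)]
bridge = (1.25, 0.0)   # lies between the groups and would merge them
benign = (0.25, 0.25)  # lies inside the first group
print(accept_sample(data, bridge, cut=1.2, max_delta=0.5))
print(accept_sample(data, benign, cut=1.2, max_delta=0.5))
```

In this toy setting the bridging point collapses the two groups into one cluster, so the measure drops sharply and the sample is rejected, while the point inside an existing group barely changes it. The threshold `max_delta` is an arbitrary choice here and would need tuning in any real deployment.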

“…However, regardless the behavioral features being used, all these proposals suffer from the same vulnerability: clustering algorithms have not been originally devised to deal with data from an adversary. As outlined in recent work [4,9], this may allow an attacker to devise carefully-crafted attacks that can significantly compromise the clustering process itself, and invalidate subsequent analyses.…”

Section: Introduction (mentioning)

confidence: 99%

Poisoning behavioral malware clustering

Biggio, Rieck, Ariu, et al. 2014

Proceedings of the 2014 Workshop on Artificial Intelligent and Security Workshop

Self Cite

121

Clustering algorithms have become a popular tool in computer security to analyze the behavior of malware variants, identify novel malware families, and generate signatures for antivirus systems. However, the suitability of clustering algorithms for security-sensitive settings has been recently questioned by showing that they can be significantly compromised if an attacker can exercise some control over the input data. In this paper, we revisit this problem by focusing on behavioral malware clustering approaches, and investigate whether and to what extent an attacker may be able to subvert these approaches through a careful injection of samples with poisoning behavior. To this end, we present a case study on Malheur, an open-source tool for behavioral malware clustering. Our experiments not only demonstrate that this tool is vulnerable to poisoning attacks, but also that it can be significantly compromised even if the attacker can only inject a very small percentage of attacks into the input data. As a remedy, we discuss possible countermeasures and highlight the need for more secure clustering algorithms.
