A Maximum Entropy Framework for Semisupervised and Active Learning With Unknown and Label-Scarce Classes

Qiu, Zhicong; Miller, David J.; Kesidis, George

doi:10.1109/tnnls.2016.2514401

Cited by 27 publications

(18 citation statements)

References 22 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…2. We develop a novel unsupervised AD that goes well beyond [16], i) modeling the joint density of a deep layer using highly suitable null hypothesis density models (matched in particular to non-negative support for RELU layers); ii) exploiting multiple DNN layers; iii) leveraging a "source" and "destination" class concept, source class uncertainty, the class confusion matrix, comprehensive low-order density modelling [38], and DNN weight information in constructing a novel decision statistic grounded in the Kullback-Leibler divergence.…”

Section: Contributions Of This Workmentioning

confidence: 99%

When Not to Classify: Anomaly Detection of Attacks (ADA) on DNN Classifiers at Test Time

2019

Self Cite

View full text Add to dashboard Cite

A significant threat to the recent, wide deployment of machine learning-based systems, including deep neural networks (DNNs), is adversarial learning attacks. The main focus here is on evasion attacks against (DNN-based) classifiers at test time. While much work has focused on devising attacks that make perturbations to a test pattern (e.g., an image) which are humanimperceptible and yet still induce a change in the classifier's decision, until recently there has been relative paucity of work in defending against such attacks. Some works robustify the classifier to make correct decisions on perturbed patterns. This is an important objective for some applications involving evasion attacks and for "natural adversary" scenarios. However, we analyze the possible evasion attack mechanisms and show that, in some important cases, when the image has been attacked, correctly classifying it has no utility: i) when the image to be 1 This work was supported in part by a gift from Cisco and a grant from the DDDAS program at AFOSR. arXiv:1712.06646v2 [cs.LG] 28 Jun 2018 attacked is (even arbitrarily) selected from the attacker's cache; ii) when the sole recipient of the classifier's decision is the attacker. Moreover, in some application domains and scenarios it is highly actionable to detect the attack irrespective of correctly classifying in the face of it (with classification still performed if no attack is detected). We hypothesize that, even if humanimperceptible, adversarial perturbations are machine-detectable. We propose a purely unsupervised anomaly detector (AD) that, unlike previous works: i) models the joint density of a deep layer using highly suitable null hypothesis density models (matched in particular to the nonnegative support for RELU layers); ii) exploits multiple DNN layers; iii) leverages a "source" and "destination" class concept, source class uncertainty, the class confusion matrix, and DNN weight information in constructing a novel decision statistic grounded in the Kullback-Leibler divergence. Tested on MNIST and CIFAR-10 image databases under three prominent attack strategies, our approach outperforms previous detection methods, achieving strong ROC AUC detection accuracy on two attacks and better accuracy than recently reported for a variety of methods on the strongest (CW) attack. We also evaluate a fully white box attack on our system. Finally, we evaluate other important measures such as classification accuracy versus detection rate and multiple performance measures versus attack strength.1. Layer and Null Model Choices: [16] chose l = L − 1, the penultimate layer of the DNN,

show abstract

Section: Contributions Of This Workmentioning

confidence: 99%

When Not to Classify: Anomaly Detection of Attacks (ADA) on DNN Classifiers at Test Time

2019

Self Cite

View full text Add to dashboard Cite

show abstract

“…Mixed strategies: Choose the sample by MEU (or random sampling) with probability p; otherwise by uncertainty sampling with probability 1 − p. Note that a non-zero proportion for uncertainty sampling is warranted because uncertainty sampling is a very good mechanism for discovering unknown classes that may be latently present in T u [18]. At the same time, using uncertainty sampling plays into the hands of the attacker.…”

Section: B Sample Selection Criteria For Active Learningmentioning

confidence: 99%

“…Huge false positive rates ensued when as little as 5% of training data consisted of these contrived emails. [12] considered active learning (AL), a promising framework for security applications, as the classifier can adapt to track evolving threats and also because oracle labeling may discover novel classes [18], [19] that may be zero-day threats. [12] demonstrated, using SVMs, that if an adversary "salts" the unlabeled data batch in a biased fashion near the current decision boundary (where AL seeks to choose samples for labeling), one can induce classifier degradation -each adversarial sample was chosen such that, if labeled, it will decrease accuracy the most.…”

Section: Introductionmentioning

confidence: 99%

“…Also, particularly in IDS settings with unknown unknowns, the full complement of classes is a priori unknown and one may discover new classes in an unlabeled or semisupervised data batch [13]. This is especially true in an AL context, starting from few labeled examples [18]. Finally, many security applications in fact involve authentication, not classification per se - [14], [3] assume the problem is classification.…”

Section: Introductionmentioning

confidence: 99%

See 1 more Smart Citation

Adversarial learning: A critical review and active learning study

Miller

Qiu

et al. 2017

2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)

View full text Add to dashboard Cite

show abstract

“…e greatest challenge for AL methods is identifying the most informative samples so that the maximum prediction accuracy can be achieved. A number of sample-selection criteria have then been applied to this task, including (1) query-by-committee (QBC), in which several distinct classifiers are used and the selected samples are those with the largest difference between the labels predicted by different classifiers [9][10][11]; (2) margin uncertainty sampling, wherein the samples are selected according to the maximum uncertainty based on their respective distances from the classification boundaries [12,13]; (3) max-entropy sampling, which uses entropy as the uncertainty measure via probabilistic modeling [14,15]; and (4) diversity sampling, which prefers selecting representative samples [16].…”

Section: Introductionmentioning

confidence: 99%

Double-Criteria Active Learning for Multiclass Brain-Computer Interfaces

She

Chen

Luo

et al. 2020

Computational Intelligence and Neuroscience

View full text Add to dashboard Cite

Recent technological advances have enabled researchers to collect large amounts of electroencephalography (EEG) signals in labeled and unlabeled datasets. It is expensive and time consuming to collect labeled EEG data for use in brain-computer interface (BCI) systems, however. In this paper, a novel active learning method is proposed to minimize the amount of labeled, subject-specific EEG data required for effective classifier training, by combining measures of uncertainty and representativeness within an extreme learning machine (ELM). Following this approach, an ELM classifier was first used to select a relatively large batch of unlabeled examples, whose uncertainty was measured through the best-versus-second-best (BvSB) strategy. The diversity of each sample was then measured between the limited labeled training data and previously selected unlabeled samples, and similarity is measured among the previously selected samples. Finally, a tradeoff parameter is introduced to control the balance between informative and representative samples, and these samples are then used to construct a powerful ELM classifier. Extensive experiments were conducted using benchmark and multiclass motor imagery EEG datasets to evaluate the efficacy of the proposed method. Experimental results show that the performance of the new algorithm exceeds or matches those of several state-of-the-art active learning algorithms. It is thereby shown that the proposed method improves classifier performance and reduces the need for training samples in BCI applications.

show abstract

A Maximum Entropy Framework for Semisupervised and Active Learning With Unknown and Label-Scarce Classes

Cited by 27 publications

References 22 publications

When Not to Classify: Anomaly Detection of Attacks (ADA) on DNN Classifiers at Test Time

When Not to Classify: Anomaly Detection of Attacks (ADA) on DNN Classifiers at Test Time

Adversarial learning: A critical review and active learning study

Double-Criteria Active Learning for Multiclass Brain-Computer Interfaces

Contact Info

Product

Resources

About