This paper considers the problem of feature selection for composite hypothesis testing: the goal is to select, from m candidate features, the r features that are relevant for distinguishing the null hypothesis from the composite alternative. The training data consist of L sequences of observations, each of which is an n-sample sequence drawn from one distribution in the alternative hypothesis. What is the fundamental limit for successful feature selection? Are there algorithms that achieve this limit? We investigate this problem in a small-sample, high-dimensional setting with n = o(m), and obtain a tight pair of achievability and converse results: (i) there exists a function f(L, n, r, m) such that if f(L, n, r, m) ↓ 0, then no asymptotically consistent feature selection algorithm exists; (ii) we propose a feature selection algorithm that is asymptotically consistent whenever f(L, n, r, m) ↑ ∞.
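As a concrete illustration of the data model described above, the following sketch generates synthetic training data with the stated shape: L sequences, each an n-sample sequence of m features drawn from one member of a composite alternative. The choice of Gaussian distributions with mean shifts on the r relevant features is purely an assumption made here for concreteness; the paper's setting is more general, and this is not the paper's algorithm or generative model.

```python
import numpy as np

rng = np.random.default_rng(0)

m, r, n, L = 100, 5, 10, 20   # n = o(m): small-sample, high-dimensional regime
relevant = rng.choice(m, size=r, replace=False)  # indices of the r relevant features

# Each training sequence: n samples of m features, drawn from one distribution
# in the composite alternative. Here (hypothetically) each alternative applies
# a random mean shift to the relevant features only; the null is N(0, 1) on all.
sequences = []
for _ in range(L):
    mu = np.zeros(m)
    mu[relevant] = rng.uniform(0.5, 1.5, size=r)  # shift only relevant features
    sequences.append(rng.normal(loc=mu, size=(n, m)))

X = np.stack(sequences)   # training tensor of shape (L, n, m)
print(X.shape)
```

A consistent feature selection algorithm, in this notation, maps X to an estimated index set that equals `relevant` with probability tending to one as the problem size grows.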