Classification of Homogeneous Data With Large Alphabets

Kelly, Benjamin G.; Wagner, Aaron B.; Tularak, Thitidej; Viswanath, Pramod

doi:10.1109/tit.2012.2222343

Cited by 22 publications

(20 citation statements)

References 36 publications

(35 reference statements)

“…These results do not provide general tools to approximate the false alarm probability in the finite sample setting except in the case of uniform null distribution. Numerous other similar examples of asymptotically optimal hypothesis tests are found in literature (see, e.g., [3], [7]-[13]). Nevertheless, in a practical experiment involving such hypothesis tests, one has access only to a finite number of observations, and the metric of practical interest is the actual error probability with a finite number of samples rather than the error exponent or asymptotic consistency.…”

Section: Introduction (mentioning)

confidence: 65%

Weak Convergence Analysis of Asymptotically Optimal Hypothesis Tests

Unnikrishnan

Huang

2016

IEEE Trans. Inform. Theory

In recent years solutions to various hypothesis testing problems in the asymptotic setting have been proposed using results from large deviations theory. Such tests are optimal in terms of appropriately defined error-exponents. For the practitioner, however, error probabilities in the finite sample size setting are more important. In this paper we show how results on weak convergence of the test statistic can be used to obtain better approximations for the error probabilities in the finite sample size setting. While this technique is popular among statisticians for common tests, we demonstrate its applicability for several recently proposed asymptotically optimal tests, including tests for robust goodness of fit, homogeneity tests, outlier hypothesis testing, and graphical model estimation.

“…Substituting this into (39) leads to (40) When , we obtain Substituting this into (39) leads to (41) It follows from the bounds (40), (41) and that . Thus, the denominator of (39) satisfies Substituting this into (39) leads to Consequently, To obtain a refined approximation, let , which implies (42) An approximation for will be obtained: since , we have that the numerator and denominator in the summand of (39) satisfy Thus, Substituting this and (42) into (39) leads to which gives (43) The integration in (38) is now carried out along the closed contour given by :…”

Section: A) Approximation To the Logarithmic Moment Generating Functi (mentioning)

confidence: 99%

“…To obtain tight bounds, we use a technique similar to the expurgating method in [40]. The distributions used in proving the bounds are constructed using the mixing of indistinguishable distributions method (see e.g., [5], [41]). …”

Section: Overview Of the Approach (mentioning)

confidence: 99%

“…We use the mixing of indistinguishable distributions method previously used in proving hardness results for composite and hypothesis testing problems [3], [5], [41]. First, construct a collection of distributions so that for each distribution , the likelihood ratio has a simple expression.…”

Section: B Sketch Of the Proofs For Theorems 1 And (mentioning)

confidence: 99%

“…For arbitrary nonuniform null distribution, one possible approach is to choose a different weight. As the results in [3], [41], [43], and [44] imply, the key is to analyze large probability and small probability symbols. A unified result on the large deviations for separable statistics for both large and small probability symbols would serve as a basis for choosing the weight.…”

Section: Nonuniform Null Distribution (mentioning)

confidence: 99%

Generalized Error Exponents for Small Sample Universal Hypothesis Testing

Huang

Meyn

2013

IEEE Trans. Inform. Theory

The small sample universal hypothesis testing problem is investigated in this paper, in which the number of samples is smaller than the number of possible outcomes . The goal of this paper is to find an appropriate criterion to analyze statistical tests in this setting. A suitable model for analysis is the high-dimensional model in which both and increase to infinity, and . A new performance criterion based on large deviations analysis is proposed and it generalizes the classical error exponent applicable for large sample problems (in which ). This generalized error exponent criterion provides insights that are not available from asymptotic consistency or central limit theorem analysis. The following results are established for the uniform null distribution: 1) The best achievable probability of error decays as for some . 2) A class of tests based on separable statistics, including the coincidence-based test, attains the optimal generalized error exponents. 3) Pearson's chi-square test has a zero generalized error exponent and thus its probability of error is asymptotically larger than the optimal test. Index Terms: Bahadur efficiency, Chernoff efficiency, error exponent, hypothesis testing, large alphabet, large deviations, separable statistic, small sample.
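As a rough illustration of the separable statistics named in this abstract, the sketch below computes two of them for a uniform null over an alphabet of size `m`. It rests on assumptions not stated in the abstract: the coincidence-based statistic is taken to be the number of symbols that appear exactly once (one common form in the literature), and the function names `symbol_counts`, `singleton_count`, and `pearson_chi_square_uniform` are hypothetical. Thresholds and error exponents are not modeled.

```python
def symbol_counts(samples, m):
    """Count how often each symbol 0..m-1 occurs in the sample."""
    counts = [0] * m
    for s in samples:
        counts[s] += 1
    return counts


def singleton_count(samples, m):
    """Assumed coincidence-type statistic: number of symbols seen exactly once.

    Under a uniform null with few samples, most observed symbols are
    singletons, so an unusually low value hints at a non-uniform source.
    """
    return sum(1 for c in symbol_counts(samples, m) if c == 1)


def pearson_chi_square_uniform(samples, m):
    """Pearson's chi-square statistic against the uniform distribution on m symbols."""
    n = len(samples)
    expected = n / m  # expected count per symbol under the uniform null
    return sum((c - expected) ** 2 / expected for c in symbol_counts(samples, m))


if __name__ == "__main__":
    samples = [0, 1, 1, 3]  # n = 4 samples from an alphabet of size m = 4
    print(singleton_count(samples, 4))
    print(pearson_chi_square_uniform(samples, 4))
```

Both statistics are separable in the sense that each is a sum over symbols of a function of that symbol's count alone.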

Universal Neyman–Pearson classification with a partially known hypothesis

Boroumand

Fàbregas

2024

Information and Inference: A Journal of the IMA

We propose a universal classifier for binary Neyman–Pearson classification where the null distribution is known, while only a training sequence is available for the alternative distribution. The proposed classifier interpolates between Hoeffding’s classifier and the likelihood ratio test and attains the same error probability prefactor as the likelihood ratio test, i.e. the same prefactor as if both distributions were known. In addition, like Hoeffding’s universal hypothesis test, the proposed classifier is shown to attain the optimal error exponent tradeoff attained by the likelihood ratio test whenever the ratio of training to observation samples exceeds a certain value. We propose a lower bound and an upper bound to the optimal training to observation ratio. In addition, we propose a sequential classifier that attains the optimal error exponent tradeoff.
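For orientation, the classical Hoeffding test that this abstract uses as a reference point can be sketched as follows: reject the known null distribution when the Kullback-Leibler divergence from the empirical distribution of the observations to the null exceeds a threshold. This is only the textbook baseline, not the interpolating classifier the paper proposes; the function names and the threshold value in the usage lines are hypothetical.

```python
import math


def empirical_distribution(samples, m):
    """Relative frequency of each symbol 0..m-1 in a non-empty sample."""
    n = len(samples)
    freqs = [0.0] * m
    for s in samples:
        freqs[s] += 1.0 / n
    return freqs


def kl_divergence(p, q):
    """D(p || q) in nats, with the convention 0 * log(0 / q) = 0."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf  # p puts mass where q has none
            total += pi * math.log(pi / qi)
    return total


def hoeffding_rejects_null(samples, null_dist, threshold):
    """Return True when the empirical distribution is too far from the null."""
    p_hat = empirical_distribution(samples, len(null_dist))
    return kl_divergence(p_hat, null_dist) > threshold


if __name__ == "__main__":
    null_dist = [0.5, 0.5]
    print(hoeffding_rejects_null([0, 0, 1, 1], null_dist, threshold=0.5))
    print(hoeffding_rejects_null([0, 0, 0, 0], null_dist, threshold=0.5))
```

The test needs no knowledge of the alternative distribution, which is why it serves as a baseline for settings where the alternative is known only through training data.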
