2019 · Preprint
DOI: 10.48550/arxiv.1906.03794

The Broad Optimality of Profile Maximum Likelihood

Abstract: We study three fundamental statistical-learning problems: distribution estimation, property estimation, and property testing. We establish the profile maximum likelihood (PML) estimator as the first unified sample-optimal approach to a wide range of learning tasks. In particular, for every alphabet size k and desired accuracy ε: Distribution estimation: Under ℓ₁ distance, PML yields optimal Θ(k/(ε² log k)) sample complexity for sorted-distribution estimation, and a PML-based estimator empirically outperforms th…
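For context, "sorted-distribution estimation" asks to recover only the multiset of probability values, not the labeled distribution, so error is measured by the ℓ₁ distance between the two sorted probability vectors. A minimal sketch of that distance (the helper name is illustrative, not from the paper):

```python
import numpy as np

def sorted_l1(p, q):
    """l1 distance between sorted probability vectors: the error
    metric for sorted-distribution (label-free) estimation."""
    m = max(len(p), len(q))
    # Sort ascending and zero-pad the shorter support on the left
    # so both vectors have equal length and stay sorted.
    ps = np.pad(np.sort(p), (m - len(p), 0))
    qs = np.pad(np.sort(q), (m - len(q), 0))
    return float(np.abs(ps - qs).sum())

# Relabeled versions of the same distribution are at distance 0.
print(sorted_l1([0.5, 0.3, 0.2], [0.2, 0.5, 0.3]))  # 0.0
```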

Cited by 7 publications (14 citation statements) · References 61 publications

Citation statements (ordered by relevance):
“…Correct asymptotics: For most of the properties considered in the paper, even the naive empirical-frequency estimator is sample-optimal in the large-sample regime (termed the "simple regime" in [37]), where the number of samples n far exceeds the alphabet size k. The interesting regime, addressed in numerous recent publications [17,18,20,21,23,34,36,38], is where n and k are comparable, e.g., differing by at most a logarithmic factor. In this range, n is sufficiently small that sophisticated techniques can help, yet not so small that nothing can be estimated.…”
Section: Implications and New Results
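As an illustration of the naive estimator mentioned above, here is the empirical-frequency plug-in for Shannon entropy; per the quoted statement, it is sample-optimal only in the large-sample regime n ≫ k. A sketch, not code from either paper:

```python
import math
from collections import Counter

def empirical_entropy(sample):
    """Plug-in estimator: compute empirical frequencies and
    evaluate Shannon entropy (in nats) at that distribution."""
    n = len(sample)
    counts = Counter(sample)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# The plug-in is biased downward when n is only comparable to k.
print(empirical_entropy("abracadabra"))  # ~1.41 nats
```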
See 1 more Smart Citation
“…Correct asymptotic For most of the properties considered in the paper, even the naive empiricalfrequency estimator is sample-optimal in the large-sample regime (termed "simple regime" in [37]) where the number of samples n far exceeds the alphabet size k. The interesting regime, addressed in numerous recent publications [17,18,21,20,23,34,36,38], is where n and k are comparable, e.g., differing by at most a logarithmic factor. In this range, n is sufficiently small that sophisticated techniques can help, yet not too small that nothing can be estimated.…”
Section: Implications and New Resultsmentioning
confidence: 99%
“…Recently, the work of [2] showed that the profile maximum likelihood (PML) estimator [28], an estimator that finds a distribution maximizing the probability of observing the multiset of empirical frequencies, is sample-optimal for estimating entropy, distance to uniformity, and normalized support size and coverage. After the initial submission of the current work, paper [20] showed that the PML approach and its near-linear-time computable variant [5] are sample-optimal for any property that is symmetric, additive, and appropriately Lipschitz, including the four properties just mentioned. This establishes the PML estimator as the first universally sample-optimal plug-in approach for estimating symmetric properties.…”
Section: Existing Methods
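To make the PML definition quoted above concrete, here is a tiny brute-force sketch: it computes the probability of the observed profile (the multiset of multiplicities) exactly by enumerating all length-n sequences, then grid-searches over distributions for one maximizing that probability. Purely illustrative and exponential-time, with support restricted to k symbols and a coarse grid; practical PML computation uses the approximate algorithms cited above:

```python
from collections import Counter
from itertools import product

def profile(sample):
    """Multiset of multiplicities, e.g. 'aabc' -> (1, 1, 2)."""
    return tuple(sorted(Counter(sample).values()))

def profile_prob(p, n, target):
    """Exact probability, under distribution p, that a length-n
    i.i.d. sample has profile equal to `target`."""
    total = 0.0
    for seq in product(range(len(p)), repeat=n):  # all k^n sequences
        if profile(seq) == target:
            prob = 1.0
            for x in seq:
                prob *= p[x]
            total += prob
    return total

def brute_force_pml(target, n, k, grid=20):
    """Coarse grid search over the k-simplex for a distribution
    maximizing the probability of the observed profile."""
    best, best_p = -1.0, None
    for cuts in product(range(1, grid), repeat=k - 1):
        w = sorted(cuts)
        parts = [w[0]] + [b - a for a, b in zip(w, w[1:])] + [grid - w[-1]]
        p = [c / grid for c in parts]
        pr = profile_prob(p, n, target)
        if pr > best:
            best, best_p = pr, p
    return best_p

# PML distribution (on the grid) for the profile of sample "aabc".
print(brute_force_pml(profile("aabc"), n=4, k=3))
```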
“…Note the dependency on ε in the above theorem and the approximation factor in Theorem 3.4 are strictly better than [CSS19], which is another efficient PML-based approach for universal symmetric property estimation; [CSS19] works when the error parameter ε ≥ 1/n^0.166. Recent work [HO19] further establishes the broad optimality of approximate PML: [HO19] shows optimality of the approximate-PML distribution-based estimator for other symmetric properties, such as sorted distribution estimation (under ℓ₁ distance) and α-Rényi entropy for non-integer α > 3/4, as well as a broad class of additive properties that are Lipschitz. [HO19] also provides a PML-based tester to test whether an unknown distribution is ε-far from a given distribution in ℓ₁ distance, achieving the optimal sample complexity up to logarithmic factors.…”
Section: Theorem 3.4
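For intuition about the testing task just quoted, here is a naive plug-in identity tester, not the PML-based tester from [HO19]: it accepts if the empirical distribution is ℓ₁-close to the reference. Names and the ε/2 threshold are illustrative:

```python
import random

def naive_l1_identity_test(sample, q, eps):
    """Accept 'sample was drawn from q' iff the empirical
    distribution is within eps/2 of q in l1 distance. This
    learn-then-compare rule needs about k/eps^2 samples, versus
    roughly sqrt(k)/eps^2 for optimal identity testers."""
    n = len(sample)
    emp = {}
    for x in sample:
        emp[x] = emp.get(x, 0) + 1
    support = set(q) | set(emp)
    l1 = sum(abs(emp.get(x, 0) / n - q.get(x, 0.0)) for x in support)
    return l1 <= eps / 2

q = {"a": 0.5, "b": 0.3, "c": 0.2}
sample = random.choices(list(q), weights=list(q.values()), k=2000)
print(naive_l1_identity_test(sample, q, eps=0.2))  # True w.h.p.
```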
“…Distribution property estimation literature most closely related to our work includes entropy estimation [19,20,21,22,23,24,25,26], support size estimation [21,23,27], Rényi entropy estimation [28,29,30], support coverage estimation [31,32], and divergence estimation [33,34].…”
Section: Introduction