2020
DOI: 10.48550/arxiv.2012.02901
Preprint
Near-Optimal Procedures for Model Discrimination with Non-Disclosure Properties

Abstract: Let θ₀, θ₁ ∈ ℝ^d be the population risk minimizers associated with some loss ℓ : ℝ^d × Z → ℝ and two distributions P₀, P₁ on Z. Our work is motivated by the following question: given i.i.d. samples from P₀ and P₁, what sample sizes are sufficient and necessary to distinguish between the two hypotheses θ* = θ₀ and θ* = θ₁ for a given θ* ∈ {θ₀, θ₁}? Making the first steps towards answering this question in full generality, we first consider the case of a well-specified linear model with squared loss. …
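The abstract's discrimination problem can be illustrated with a toy sketch. This is not the paper's near-optimal procedure; it is a naive baseline under assumptions of my own choosing (Gaussian design and noise, specific θ₀, θ₁, and sample size): given samples generated under θ* = θ₀, pick whichever candidate vector attains the smaller empirical squared loss.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 5, 200
theta0 = np.zeros(d)
theta1 = np.full(d, 0.5)  # separated from theta0 in l2 distance

# Well-specified linear model: y = <x, theta*> + noise, with theta* = theta0 here.
X = rng.standard_normal((n, d))
y = X @ theta0 + rng.standard_normal(n)

# Naive discriminator: choose the candidate with smaller empirical squared loss.
risk0 = np.mean((y - X @ theta0) ** 2)
risk1 = np.mean((y - X @ theta1) ** 2)
decision = 0 if risk0 < risk1 else 1
print(decision)
```

How large n must be for such a decision to be reliable, as a function of the separation ‖θ₁ − θ₀‖ and the dimension d, is exactly the kind of question the paper's sample-complexity bounds address.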

Cited by 1 publication (1 citation statement) | References 26 publications
“…Since this model has bounded data, our analysis could be applied in that setting; however, it concerns separation in ℓ₂ distance (separation in ℓ₁ distance exhibits considerably more involved behavior). Ostrovskii et al. (2020) consider a different type of two-sample testing problem, in a regression context, where the goal is to determine which of the two distributions has a given (known to the user) regression vector. They give a sharp bound on the minimum separation distance between the two regression vectors, including the role of the dimension, also exhibiting a difference from estimation rates.…”
Section: Relation to "Modern" and High-Dimensional Statistics (mentioning)
confidence: 99%