2020
DOI: 10.1007/s13171-020-00224-1
Design-Unbiased Statistical Learning in Survey Sampling

Abstract: Design-consistent model-assisted estimation has become the standard practice in survey sampling. However, design consistency remains to be established for many machine-learning techniques that can potentially be very powerful assisting models. We propose a subsampling Rao-Blackwell method, and develop a statistical learning theory for exactly design-unbiased estimation with the help of linear or non-linear prediction models. Our approach makes use of classic ideas from Statistical Science as well as the rapidl…

Cited by 4 publications (2 citation statements); References 18 publications (27 reference statements).
“…However, $\hat{\theta}^{(1)}$ is based on only one random split of $s_r = s_1 \cup s_2$, which leads to additional variance due to learning from $s_1$ instead of $s_r$. As proposed by Sanguiao-Sande and Zhang (2021), we can reduce the variance of $\hat{\theta}^{(1)}$ by applying the Monte Carlo Rao–Blackwell (RB) method. Our proposed estimator is then given by
$$ \hat{\theta}_{\mathrm{SRB}} = \frac{1}{K} \sum_{k=1}^{K} \hat{\theta}^{(k)}, $$
where $\hat{\theta}^{(k)}$ is the estimator () calculated from the $k$th Monte Carlo subsample $\left(s_1^{(k)}, s_2^{(k)}\right)$ such that $s_r = s_1^{(k)} \cup \dots$”
Section: Proposed Methods
confidence: 99%
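The averaging step quoted above can be sketched in a few lines of code. The following is a minimal illustration, not the authors' implementation: the "assisting model" fitted on $s_1$ is stood in for by a simple mean predictor (any linear or non-linear learner could take its place), and the per-split estimator follows the usual difference-estimator form of prediction plus mean residual on the held-out half. Function and parameter names are invented for this sketch.

```python
import numpy as np

def srb_estimate(y, K=100, rng=None):
    """Subsampling Rao-Blackwell (SRB) averaging, sketched.

    Repeats a random half-split s_r = s1 U s2 of the observed sample,
    "learns" on s1, estimates from s2, and averages the K resulting
    estimators theta^(k) to form theta_SRB.

    Illustrative assumption: the fitted model is a plain mean of y
    over s1, standing in for an arbitrary prediction model.
    """
    rng = np.random.default_rng(rng)
    n = len(y)
    estimates = []
    for _ in range(K):
        perm = rng.permutation(n)
        s1, s2 = perm[: n // 2], perm[n // 2:]  # one random split of s_r
        mu_hat = y[s1].mean()                   # model fitted on s1 only
        # per-split estimator: prediction corrected by mean residual on s2
        theta_k = mu_hat + (y[s2] - mu_hat).mean()
        estimates.append(theta_k)
    # theta_SRB = average of theta^(1), ..., theta^(K)
    return float(np.mean(estimates))
```

Averaging over the K Monte Carlo splits is exactly what removes the extra variance induced by learning from $s_1$ rather than the full $s_r$, while leaving the design-unbiasedness of each $\hat{\theta}^{(k)}$ intact.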
“…It is similar in spirit to the super learner proposed by Van der Laan et al. (2007), which aims to improve prediction by creating a weighted combination of many candidate outcome learners, but in a different manner from ours. Our use of the cell response model is a robust extension of the randomization-based approach to unbiased statistical learning proposed by Sanguiao-Sande and Zhang (2021), which applies a single assisting outcome model to the complete sample observations via the subsampling Rao–Blackwell method.…”
Section: Introduction
confidence: 99%