2020
DOI: 10.1111/insr.12434

Data Integration by Combining Big Data and Survey Sample Data for Finite Population Inference

Abstract: The statistical challenges in using big data to make valid statistical inferences about a finite population have been well documented in the literature. These challenges are due primarily to statistical bias arising from under‐coverage of the population of interest by the big data source and from measurement errors in the variables available in the data set. By stratifying the population into a big data stratum and a missing data stratum, we can estimate the missing data stratum by using a fully resp…

Cited by 30 publications (37 citation statements)
References 32 publications (32 reference statements)
“…These data sets contain the layers of raster and geospatial data. Kim and Tam (2020) have proposed a data integration estimator [17]. This is a classification technique with non-parametric and overlapping units which recognizes and corrects misclassification errors.…”
Section: Review Of Agriculture Sector (mentioning)
confidence: 99%
“…Since the sizes $N_B$ and $N_C = N - N_B$ are known, a more efficient post-stratified estimator is given by … Kim and Tam (2018) showed that with a simple random sample $A$, the post-stratified estimator achieves a large reduction in the design variance compared to the design-unbiased estimator $\hat{Y} = \sum_{i \in A} d_i y_i$ based only on the probability sample $A$. In particular, if the sampling fraction $f = n/N$ is small and the population variance $\sigma_y^2$ and the variance of the population units not belonging to $B$, denoted…”
Section: Study Variable Observed In Both Samples (mentioning)
confidence: 99%
“…where $N_B$, $N_C$ and $Y_B$ are known. The resulting calibration estimator $\sum_{i \in A} w_i y_i$ is identical to the post-stratified estimator $\hat{Y}_P$ (Kim and Tam 2018). However, the main advantage of the calibration approach is that it permits the inclusion of other calibration constraints, if available.…”
Section: Between the Design Weights $d_i$ And The Calibration Weights $w_i$ Subject To Calibration Constraints (mentioning)
confidence: 99%
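
The post-stratified estimator referred to in the citation statements above can be illustrated with a short sketch. The quoted text elides the formula itself, so the form used here is an assumption: the known big data total $Y_B$ plus $N_C$ times the design-weighted mean of the sampled units that fall outside $B$. The function and argument names are hypothetical and chosen only for illustration.

```python
def post_stratified_total(y, d, in_big, big_total, n_missing):
    """Sketch of a post-stratified estimate of a population total.

    y          -- study variable for each unit in the probability sample A
    d          -- design weights d_i for the same units
    in_big     -- True if the unit also belongs to the big data source B
    big_total  -- known total Y_B of the study variable over B
    n_missing  -- known size N_C = N - N_B of the stratum outside B
    """
    # Design-weighted sum and weight total over sampled units outside B.
    weighted_sum = sum(di * yi for yi, di, b in zip(y, d, in_big) if not b)
    weight_total = sum(di for di, b in zip(d, in_big) if not b)
    if weight_total == 0:
        raise ValueError("sample A contains no units outside the big data source")
    # Known total for B plus N_C times the estimated mean of the missing stratum.
    return big_total + n_missing * weighted_sum / weight_total


# Toy sample of four units with equal weights; two of them lie outside B.
estimate = post_stratified_total(
    y=[1.0, 2.0, 3.0, 4.0],
    d=[10.0, 10.0, 10.0, 10.0],
    in_big=[True, True, False, False],
    big_total=30.0,
    n_missing=20,
)
print(estimate)
```

Only the sampled units outside $B$ contribute to the estimated part, because the total over $B$ is taken as known rather than estimated.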