2015
DOI: 10.1016/j.spasta.2015.07.008

Bayesian marked point process modeling for generating fully synthetic public use data with point-referenced geography

Abstract: Many data stewards collect confidential data that include fine geography. When sharing these data with others, data stewards strive to disseminate data that are informative for a wide range of spatial and non-spatial analyses while simultaneously protecting the confidentiality of data subjects' identities and attributes. Typically, data stewards meet this challenge by coarsening the resolution of the released geography and, as needed, perturbing the confidential attributes. When done with high intensity, these…

Cited by 22 publications (38 citation statements)
References 40 publications (48 reference statements)
“…One approach to do so would be that of Quick et al. (2015), which uses log-Gaussian Cox processes (LGCPs) (Møller et al.) with an underlying spatial structure to model home addresses within a marked point process model. By doing so, the synthesizer of Quick et al. …”
Section: Discussion
confidence: 99%
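As a rough illustration of the LGCP-based location synthesis described in the statement above, the following sketch simulates point locations from a log-Gaussian Cox process on a gridded unit square. It is not the synthesizer of Quick et al. (2015); the grid resolution, mean, and covariance parameters are arbitrary assumptions.

```python
# Minimal sketch (not the authors' code): simulating synthetic point locations
# from a log-Gaussian Cox process (LGCP) on a unit-square grid.
# Grid size and Gaussian-process parameters below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Grid of cell centroids over the unit square
m = 20                                   # cells per side (assumption)
xs = (np.arange(m) + 0.5) / m
cx, cy = np.meshgrid(xs, xs)
cells = np.column_stack([cx.ravel(), cy.ravel()])

# Gaussian random field on the grid with exponential covariance
mu, sigma2, phi = 2.0, 1.0, 0.2          # illustrative GP parameters
d = np.linalg.norm(cells[:, None, :] - cells[None, :, :], axis=-1)
cov = sigma2 * np.exp(-d / phi)
gp = rng.multivariate_normal(np.full(m * m, mu), cov)

# LGCP: the intensity surface is the exponentiated field;
# the expected count in a cell is intensity times cell area
cell_area = 1.0 / (m * m)
lam = np.exp(gp) * cell_area
counts = rng.poisson(lam)

# Scatter synthetic "home addresses" uniformly within their cells
pts = []
for (x0, y0), n in zip(cells, counts):
    if n > 0:
        jitter = (rng.random((n, 2)) - 0.5) / m
        pts.append([x0, y0] + jitter)
synthetic_locations = np.vstack(pts) if pts else np.empty((0, 2))
print(synthetic_locations.shape)
```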
“…For the data in distribution (1), this would not only require generating synthetic n × 1 vectors Y^{†(l)}, but it would also require constructing a model from which to generate a collection of synthetic locations S^{†(l)} and any other individual-level attributes. As described in Wang and Reiter (2012) and Quick et al. (2015), however, approaches for generating fully synthetic point-referenced data sets can be quite computationally burdensome. Thus, in some instances, it may be attractive to take a partially synthetic approach in which only a collection of values or variables are replaced with imputed values (e.g.…”
Section: Methods For Statistical Disclosure Avoidance
confidence: 99%
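The contrast drawn above between fully and partially synthetic data can be illustrated with a small sketch: the locations S are released unchanged while only the sensitive attribute Y is replaced by draws from a model fitted to the confidential data. This is a toy linear-model illustration under assumed data, not the method of Wang and Reiter (2012) or Quick et al. (2015).

```python
# Minimal sketch (illustrative, not the cited methodology): a partially
# synthetic release that keeps locations S fixed and replaces only the
# sensitive attribute Y with model-based imputations.
import numpy as np

rng = np.random.default_rng(1)

# Confidential data: locations S (kept) and a sensitive attribute Y (replaced)
n = 500
S = rng.random((n, 2))                               # point-referenced locations
beta_true = np.array([1.0, 2.0, -1.0])
X = np.column_stack([np.ones(n), S])                 # simple location-based predictors
Y = X @ beta_true + rng.normal(0.0, 0.5, n)

# Fit a linear synthesis model for Y given S (ordinary least squares)
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ beta_hat
sigma_hat = resid.std(ddof=X.shape[1])

# Partially synthetic data set: original S, synthetic Y drawn from the fitted model
Y_syn = X @ beta_hat + rng.normal(0.0, sigma_hat, n)
release = np.column_stack([S, Y_syn])
print(release[:3])
```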
“…In order to evaluate both the disclosure risk and the utility of the suppressed data, we consider the use of synthetic data (e.g. Little; Kennickell; Reiter; Quick et al.). Specifically, we let $\theta = (\beta_0, Z^T, \phi^T, \sigma^2, \tau^2)^T$ and generate synthetic values, $Y_i^*$, using the posterior predictive distribution

$$
f(Y_i^* \mid Y) = \int f(Y_i^* \mid \theta, Y)\, f(\theta \mid Y)\, d\theta
= \int \mathrm{Pois}\!\left(Y_i^* \mid n_i \exp(\beta_0 + Z_i + \phi_i)\right) f(\beta_0, Z, \phi, \sigma^2, \tau^2 \mid Y)\, d\beta_0\, dZ\, d\phi\, d\sigma^2\, d\tau^2,
$$

based on the hierarchical model in .…”
Section: Methods
confidence: 99%
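A minimal sketch of the posterior predictive synthesis step in the quoted equation, assuming posterior draws of β0, Z, and φ are already available; here they are generated as placeholders rather than taken from an actual MCMC fit, and the array shapes are assumptions of this sketch.

```python
# Minimal sketch of posterior predictive synthesis for the Poisson model above:
# Y*_i ~ Pois(n_i * exp(beta0 + Z_i + phi_i)), averaging over posterior draws.
# The "posterior" samples below are placeholders; in practice they would come
# from an MCMC fit of the hierarchical model (an assumption of this sketch).
import numpy as np

rng = np.random.default_rng(2)

n_areas, n_draws = 50, 1000
n_i = rng.integers(100, 1000, size=n_areas)          # known exposures / population sizes

# Placeholder posterior draws of beta0, Z, phi (one row per draw)
beta0 = rng.normal(-4.0, 0.1, size=(n_draws, 1))
Z = rng.normal(0.0, 0.3, size=(n_draws, n_areas))
phi = rng.normal(0.0, 0.2, size=(n_draws, n_areas))

# For each posterior draw, sample synthetic counts from the Poisson likelihood;
# marginally, each row is a draw from the posterior predictive distribution.
rate = n_i * np.exp(beta0 + Z + phi)                 # shape (n_draws, n_areas)
Y_syn = rng.poisson(rate)

# One synthetic data set per row; several rows give multiply-imputed releases
print(Y_syn[0][:10])
```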
“…For example, a variance-covariance matrix might be used to generate new data that serve as a proxy for the original data. Research into 'spatial data synthesizers', such as Quick et al. (2015), would thus be enormously beneficial. The US Census Bureau has started publishing synthetic individual-level data based on highly sensitive administrative data from agencies like the IRS and the Social Security Administration (US Census Bureau, 2014).…”
Section: Reproducible Publications Using Workflow Models
confidence: 99%
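The variance-covariance idea mentioned above can be sketched as follows: synthetic records are drawn from a multivariate normal parameterized by the sample mean and covariance of the original data, so first and second moments are approximately preserved. This is a generic illustration of the proxy-data idea, not any particular spatial data synthesizer.

```python
# Minimal sketch of the covariance-based proxy idea: release synthetic records
# drawn from a multivariate normal with the original data's sample mean and
# variance-covariance matrix (an illustration, not a specific tool).
import numpy as np

rng = np.random.default_rng(3)

# Stand-in for a confidential numeric data matrix (rows = records)
original = rng.multivariate_normal([10.0, 50.0, 0.0],
                                   [[4.0, 1.0, 0.5],
                                    [1.0, 9.0, -0.3],
                                    [0.5, -0.3, 1.0]],
                                   size=200)

mean_hat = original.mean(axis=0)
cov_hat = np.cov(original, rowvar=False)

# Synthetic proxy data preserving the estimated mean and covariance structure
synthetic = rng.multivariate_normal(mean_hat, cov_hat, size=original.shape[0])
print(np.round(np.cov(synthetic, rowvar=False), 2))
```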