Statistical agencies and other institutions collect data under the promise to protect the confidentiality of respondents. When releasing microdata samples, the risk that records can be identified must be assessed. To this aim, a widely adopted approach is to isolate categorical variables key to the identification and analyze multi-way contingency tables of such variables. Common disclosure risk measures focus on sample unique cells in these tables and adopt parametric log-linear models as the standard statistical tools for the problem. Such models often have to deal with large and extremely sparse tables that pose a number of challenges to risk estimation. This paper proposes to overcome these problems by studying nonparametric alternatives based on Dirichlet process random effects. The main finding is that the inclusion of such random effects allows us to reduce considerably the number of fixed effects required to achieve reliable risk estimates. This is studied on applications to real data, suggesting, in particular, that our mixed models with main effects only produce roughly equivalent estimates compared to the all two-way interactions models, and are effective in defusing potential shortcomings of traditional log-linear models. This paper adopts a fully Bayesian approach that accounts for all sources of uncertainty, including that about the population frequencies, and supplies unconditional (posterior) variances and credible intervals.
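As a rough illustration of the "sample unique cells" that these risk measures focus on, the sketch below cross-classifies a few toy records by their key variables and lists the cells observed exactly once. The records, the variable names, and the helper `sample_unique_cells` are all hypothetical and chosen only for illustration; none of it is taken from the paper.

```python
from collections import Counter

def sample_unique_cells(records, key_vars):
    """Return the key-variable combinations (table cells) observed exactly once."""
    # Each record contributes to one cell of the multi-way contingency table.
    counts = Counter(tuple(rec[v] for v in key_vars) for rec in records)
    return [cell for cell, n in counts.items() if n == 1]

# Hypothetical microdata sample with three categorical key variables.
records = [
    {"sex": "F", "age_band": "30-39", "region": "North"},
    {"sex": "F", "age_band": "30-39", "region": "North"},
    {"sex": "M", "age_band": "60-69", "region": "South"},
    {"sex": "F", "age_band": "40-49", "region": "South"},
]

uniques = sample_unique_cells(records, ["sex", "age_band", "region"])
```

In realistic tables most possible cells are empty and many occupied cells are sample uniques, which is the sparsity that makes model-based risk estimation hard.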
When microdata files for research are released, it is possible that external users may attempt to breach confidentiality. For this reason most National Statistical Institutes apply some form of disclosure risk assessment and data protection. Risk assessment first requires a measure of disclosure risk to be defined. In this paper we build on previous work by Benedetti and Franconi (1998) to define a Bayesian hierarchical model for risk estimation. We follow a superpopulation approach similar to Bethlehem et al. (1990) and Rinott (2003). For each combination of values of the key variables we derive the posterior distribution of the population frequency given the observed sample frequency. Knowledge of this posterior distribution enables us to obtain suitable summaries that can be used to estimate the risk of disclosure. One such summary is the mean of the reciprocal of the population frequency, or Benedetti-Franconi risk, but we also investigate others such as the mode. We apply our approach to an artificial sample of the Italian 1991 Census data, drawn by means of a widely used sampling scheme. We report on results of this application and document the computational difficulties that we encountered. The risk estimates that we obtain are sensible, but suggest possible improvements and modifications to our methodology. We discuss these together with potential alternative strategies.
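To make the "mean of the reciprocal of the population frequency" concrete, here is a minimal sketch under a strong simplifying assumption that the abstract does not state: given a sample frequency f in a cell, the population frequency F is taken to follow a negative binomial posterior, P(F = h | f) = C(h-1, f-1) p^f (1-p)^(h-f) for h >= f, where p is a hypothetical known sampling probability for that cell. The hierarchical model described above is richer than this; the function names are illustrative only.

```python
import math

def nb_posterior_pmf(h, f, p):
    """P(F = h | f) under the ASSUMED negative binomial posterior (h >= f >= 1)."""
    return math.comb(h - 1, f - 1) * p**f * (1 - p) ** (h - f)

def reciprocal_risk(f, p, tol=1e-10, max_terms=100_000):
    """Approximate E[1/F | f] by summing the series until the leftover mass < tol."""
    total, mass = 0.0, 0.0
    for h in range(f, f + max_terms):
        w = nb_posterior_pmf(h, f, p)
        total += w / h   # contribution of F = h to the mean of 1/F
        mass += w        # posterior probability accumulated so far
        if 1.0 - mass < tol:
            break
    return total

def closed_form_unique(p):
    """E[1/F | f = 1] = p / (1 - p) * ln(1 / p) under the same assumption."""
    return p / (1 - p) * math.log(1 / p)

risk_unique = reciprocal_risk(1, 0.1)   # sample unique cell, assumed p = 0.1
risk_double = reciprocal_risk(2, 0.1)   # cell observed twice, same assumed p
```

Under this assumption the numerical sum for a sample unique agrees with the closed form, and the risk falls as the sample frequency grows, which matches the intuition that sample uniques carry the highest risk.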
Besides health and socio-economic status, the social relationships maintained in old age play an important role in shaping living conditions at older ages. In this part of life, the family represents the major framework in which interpersonal relationships are experienced. With population ageing, the family increasingly provides elderly care, especially where targeted public policies are lacking, and its protective effect on survival is expected. In this paper we investigate the complex role of family in mortality and contribute to the debate on the ideal living condition for elderly people, exploiting a nationwide integrated survey on private and collective households in France. Following a cohort of 16,263 individuals aged 55, we investigate the effect of family relations on survival. We question whether the lack of family care connected with the absence of active relationships with family members may be at least partly compensated by institutional settings. Estimates of life expectancy show that at the age of 60, people living in institutions live on average 10 years less than those living in private households, the gap decreasing with age. Cox proportional hazards models show a protective role of children on mortality. Having no children seems to be associated with a lower risk of death, but the effect is significant only for those declaring rare or no contacts with their children. Survival analysis also suggests that institutional living arrangements may be protective for the most fragile individuals, namely the severely disabled, isolated persons.