Semiparametric Inference for Nonmonotone Missing-Not-at-Random Data: The No Self-Censoring Model

Malinsky, Daniel; Shpitser, Ilya; Tchetgen Tchetgen, Eric J.

doi:10.1080/01621459.2020.1862669

Cited by 16 publications

(31 citation statements)

References 27 publications


“…Recent years have witnessed the development of several nonignorable models that one can readily estimate under some simplifying assumptions (Vansteelandt et al, 2007; Zhou et al, 2010; Sadinle and Reiter, 2017; Mohan and Pearl, 2018; Tchetgen et al, 2018; Bhattacharya et al, 2020; Nabi et al, 2020; Chen, 2020b; Malinsky et al, 2021). Although these developments are important, we believe that they do not obviate the types of sensitivity analysis that we and others have proposed.…”

Section: Discussion (mentioning)

confidence: 86%

Analysis of local sensitivity to nonignorability with missing outcomes and predictors

Chen

Heitjan

2021

Biometrics


The ISNI (index of sensitivity to local nonignorability) method quantifies local sensitivity of parametric inferences to nonignorable missingness in an outcome variable. Here we extend ISNI to the situations where both outcomes and predictors can be missing and where the missingness mechanism can be either parametric or semi‐parametric. We define the quantity MinNI (minimum nonignorability) to be an approximation to the norm of the smallest value of the transformed nonignorability that gives a nonnegligible displacement of the estimate of the parameter of interest. We illustrate our method in a complete data set from which we synthetically delete observations according to various patterns. We then apply the method to real‐data examples involving the normal linear model and conditional logistic regression.


“…If the terms in the proposed likelihoods are kept unrestricted, aside from necessary restrictions imposed by Lemma 1, the result yields a useful view on the tangent space of the corresponding Markov model, which is useful for deriving estimators based on influence functions that attain the semi-parametric efficiency bound. Indeed, a special case of the Chen decomposition for a particular class of Markov random fields has already been used to derive an efficient influence function in a missing data model [13].…”

Section: Discussion (mentioning)

confidence: 99%

The Lauritzen-Chen Likelihood For Graphical Models

Shpitser

2022

Preprint

Self Cite


Graphical models such as Markov random fields (MRFs) that are associated with undirected graphs, and Bayesian networks (BNs) that are associated with directed acyclic graphs, have proven to be a very popular approach for reasoning under uncertainty, prediction problems and causal inference.

Parametric MRF likelihoods are well-studied for Gaussian and categorical data. However, in more complicated parametric and semi-parametric settings, likelihoods specified via clique potential functions are generally not known to be congenial or non-redundant. Congenial and non-redundant DAG likelihoods are far simpler to specify in both parametric and semi-parametric settings by modeling Markov factors in the DAG factorization. However, DAG likelihoods specified in this way are not guaranteed to coincide in distinct DAGs within the same Markov equivalence class. This complicates likelihoods based model selection procedures for DAGs by "sneaking in" potentially unwarranted assumptions about edge orientations.

In this paper we link a density function decomposition due to Chen with the clique factorization of MRFs described by Lauritzen to provide a general likelihood for MRF models. The proposed likelihood is composed of variationally independent, and non-redundant closed form functionals of the observed data distribution, and is sufficiently general to apply to arbitrary parametric and semi-parametric models. We use an extension of our developments to give a general likelihood for DAG models that is guaranteed to coincide for all members of a Markov equivalence class. Our results have direct applications for model selection and semi-parametric inference.

Preliminaries

We first introduce necessary graphical modeling preliminaries. Graphs are assumed to have a vertex set V, and we will restrict attention to positive distributions.

Given any graph G, for S ⊆ V, an induced subgraph G_S of G is defined as the graph with a vertex set S and all edges in G connecting elements in S.

Given an undirected graph (UG) G, a clique C is a (possibly empty) subset of vertices in V that are pairwise connected in G. The set of all cliques in G is denoted by C(G), while the set of all maximal cliques is denoted by C(G). Note that, in general, neither […], where φ_C are potential functions which map values of C to real numbers. Potential functions are not necessarily normalized probabilities. Equivalently, […] where Z is a normalizing constant. If we restrict attention to positive distributions, an MRF model may be equivalently defined as the set of distributions p(v) that satisfy either the global or pairwise Markov property for G. The global Markov property for p(v) and a UG G states that for any disjoint subsets A, B, C of V, whenever all paths from A to B in G are intercepted by C, then A ⊥⊥ B | C in p(v). The pairwise Markov property for p(v) and G states that for any vertex pair A, B non-adjacent in G, A ⊥⊥ B | V \ {A, B} in p(v).

A joint distribution p(v) is in the Bayesian network (BN) model of a directed acyclic graph (DAG) […], where pa_G(V) are...
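For orientation, the clique factorization and the pairwise Markov property that the passage above refers to can be written in their standard textbook form. This is a sketch under assumed notation (the calligraphic clique set and the subscripted value v_C are not taken from the preprint, whose own displayed equations did not survive on this page), and the sum defining Z becomes an integral for continuous variables.

```latex
% Standard textbook form of an MRF; the notation \mathcal{C}(G) and v_C is assumed here.
\[
  p(v) \;=\; \frac{1}{Z} \prod_{C \in \mathcal{C}(G)} \phi_C(v_C),
  \qquad
  Z \;=\; \sum_{v} \prod_{C \in \mathcal{C}(G)} \phi_C(v_C),
\]
% Pairwise Markov property for an undirected graph G with vertex set V.
\[
  A \perp\!\!\!\perp B \mid V \setminus \{A, B\}
  \quad \text{for every pair } A, B \text{ non-adjacent in } G .
\]
```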


“…Of these methods, the large majority adopt the missing at random (MAR) assumption (Rubin, 1976), in which the probability that data are missing is assumed to depend only on observed data. While methods have been proposed towards estimation of a range of parameters under alternative sets of assumptions (Miao and Tchetgen Tchetgen, 2016; Malinsky et al, 2020), in practice the most commonly used methods assume MAR, and implement inverse-probability weighting (IPW) (Seaman and White, 2013), multiple imputation (Rubin, 2004), or doubly-robust methods (Robins et al, 1994; Tsiatis, 2007). In settings where investigators believe that MAR may not plausibly hold, the usual recommended course of action is to conduct a sensitivity analysis (e.g., see Robins et al (2000)), or to estimate bounds on the parameters of interest (e.g., see Manski (1990)).…”

Section: Introduction (mentioning)

confidence: 99%

Double sampling and semiparametric methods for informatively missing data

Levis

Mukherjee

Wang et al.

2022

Preprint


Missing data arise almost ubiquitously in applied settings, and can pose a substantial threat to the validity of statistical analyses. In the context of comparative effectiveness research, such as in large observational databases (e.g., those derived from electronic health records), outcomes may be missing not at random with respect to measured covariates. In this setting, we propose a double sampling method, in which outcomes are obtained via intensive follow-up on a subsample of subjects for whom data were initially missing. We describe assumptions under which the joint distribution of confounders, treatment, and outcome is identified under this design, and derive efficient estimators of the average treatment effect under a nonparametric model, as well as a model assuming outcomes were initially missing at random. We compare these in simulations to an approach that adaptively selects an estimator based on evidence of violation of the missing at random assumption. We also show that the proposed double sampling design can be extended to handle arbitrary coarsening mechanisms, and derive consistent, asymptotically normal, and nonparametric efficient estimators of any smooth full data functional of interest, and prove that these estimators often are multiply robust.
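The citing passage for this entry names inverse-probability weighting (IPW) as one of the common approaches under MAR. The following is a minimal illustrative sketch of that idea only, not the double sampling estimator of the preprint and not the no self-censoring method of the cited paper. The simulated data, the function name `ipw_mean`, and the assumption that the observation probability is known exactly are all hypothetical choices made for the example.

```python
import numpy as np

def ipw_mean(y, observed, propensity):
    """Normalized (Hajek-style) IPW estimate of E[Y] with some outcomes missing."""
    observed = np.asarray(observed, dtype=bool)
    weights = observed / np.asarray(propensity, dtype=float)
    # Missing outcomes carry weight zero; fill them so NaN * 0 does not propagate.
    y_filled = np.where(observed, np.asarray(y, dtype=float), 0.0)
    return np.sum(weights * y_filled) / np.sum(weights)

# Hypothetical simulated data: missingness depends only on the observed covariate x (MAR).
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
y_full = 1.0 + 2.0 * x + rng.normal(size=n)      # true E[Y] = 1
propensity = 1.0 / (1.0 + np.exp(-(0.5 + x)))    # P(observed | x), assumed known here
observed = rng.random(n) < propensity
y = np.where(observed, y_full, np.nan)

naive = np.nanmean(y)                  # complete-case mean, biased under this design
ipw = ipw_mean(y, observed, propensity)
print(f"complete-case: {naive:.3f}, IPW: {ipw:.3f}")
```

In practice the observation probability would be estimated from the data rather than known, and the weighting is only justified when missingness depends on observed quantities alone, which is exactly the assumption the cited paper relaxes.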
