This paper explores the usefulness of the multivariate skew-normal distribution in the context of graphical models. A slight extension of the family recently discussed by Azzalini & Dalla Valle (1996) and Azzalini & Capitanio (1999) is described, the main motivation being the additional property of closure under conditioning. After consideration of the main probabilistic features, the focus of the paper is on the construction of conditional independence graphs for skew-normal variables. Necessary and sufficient conditions for conditional independence are stated, and the admissible structures of a graph under restrictions on the univariate marginal distributions are studied. Finally, parameter estimation is considered. It is shown how the factorization of the likelihood function according to a graph can be rearranged in order to obtain a parameter-based factorization.
A fundamental research question is how much a variation in a covariate influences a binary response variable in a logistic regression model, both directly and through mediators. We derive the exact formula linking the parameters of marginal and conditional regression models with binary mediators when no conditional independence assumptions can be made. The formula has the appealing property of being the sum of terms that vanish whenever parameters of the conditional models vanish, thereby recovering well-known results as particular cases. It also makes it possible to quantify the distortion induced by the omission of some relevant covariates, opening the way to sensitivity analysis. In this case too, as the parameters of the conditional models are multiplied by terms that are always positive or bounded, the formula may be used to construct reasonable bounds on the parameters of interest. We assume that, conditionally on a set of covariates, the data-generating process can be represented by a Directed Acyclic Graph. We also show how the results presented here lead to the extension of path analysis to a system of binary random variables.
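As a rough numerical illustration of the marginal-versus-conditional distinction in the abstract above, the following Python sketch sums a logistic model for a binary response Y over a single binary mediator W. It works by direct enumeration and is not the closed-form formula the abstract refers to; the two conditional models, the coefficient names (`b0`, `bx`, `bw`, `g0`, `gx`) and the function names are assumptions introduced only for this example.

```python
import math


def expit(z):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def marginal_log_odds_ratio(b0, bx, bw, g0, gx):
    """Log odds ratio of Y for x=1 versus x=0 after summing out W.

    Assumed (hypothetical) conditional models:
        logit P(Y=1 | x, w) = b0 + bx*x + bw*w
        logit P(W=1 | x)    = g0 + gx*x
    """
    def p_y(x):
        p_w = expit(g0 + gx * x)  # P(W=1 | x)
        # Law of total probability over the two states of W.
        return p_w * expit(b0 + bx * x + bw) + (1.0 - p_w) * expit(b0 + bx * x)

    return logit(p_y(1)) - logit(p_y(0))


# If W has no effect on Y (bw = 0), the marginal effect equals bx.
print(marginal_log_odds_ratio(0.3, 0.7, 0.0, -0.2, 1.1))
# If W affects Y but is unrelated to x (gx = 0), the marginal effect
# of x is still smaller than the conditional effect bx.
print(marginal_log_odds_ratio(0.0, 1.0, 2.0, 0.0, 0.0))
```

In this toy setting the marginal effect of x coincides with the conditional coefficient when the mediator has no effect on Y, and vanishes when both the direct effect and the effect of x on W are zero. This is in line with the vanishing-terms property described in the abstract.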
Identifiability of parameters is an essential property for a statistical model to be useful in most settings. However, establishing parameter identifiability for Bayesian networks with hidden variables remains challenging. In the context of finite state spaces, we give algebraic arguments establishing identifiability of some special models on small directed acyclic graphs (DAGs). We also establish that, for fixed state spaces, generic identifiability of parameters depends only on the Markov equivalence class of the DAG. To illustrate the use of these results, we investigate identifiability for all binary Bayesian networks with up to five variables, one of which is hidden and parental to all observable ones. Surprisingly, some of these models have parameterizations that are generically 4-to-one, and not 2-to-one as label swapping of the hidden states would suggest. This leads to an interesting conflict in interpreting causal effects.
Summary. We present a model to estimate the size of an unknown population from a number of lists that applies when the assumptions of (a) homogeneity of capture probabilities of individuals and (b) marginal independence of lists are violated. This situation typically occurs in epidemiological studies, where the heterogeneity of individuals is severe and researchers cannot control the independence between sources of ascertainment. We discuss the situation when categorical covariates are available and the interest is not only in the total undercount, but also in the undercount within each stratum resulting from the cross-classification of the covariates. We also present several techniques for determining confidence intervals of the undercount within each stratum using the profile log-likelihood, thereby extending the work of Cormack (1992, Biometrics 48, 567-576).
When estimating regression models with missing outcomes, scientists usually have to rely either on a missing at random assumption (the missingness mechanism is independent of the outcome given the observed variables) or on exclusion restrictions (some of the covariates affecting the missingness mechanism do not affect the outcome). Both these hypotheses are controversial in applications since they are typically not testable from the data. The alternative, which we pursue here, is to derive identification sets (instead of point identification) for the parameters of interest when allowing for a missing not at random mechanism. The non-ignorability of this mechanism is quantified with a parameter. When the latter can be bounded with a priori information, a bounded identification set follows. Our approach allows the outcome to be continuous and unbounded and relaxes distributional assumptions. Estimation of the identification sets can be performed via ordinary least squares, and sampling variability can be incorporated, yielding uncertainty intervals with a coverage probability of at least (1 − α). Our work is motivated by a study on predictors of body mass index (BMI) change in middle-aged men, allowing us to identify possible predictors of BMI change even when assuming little about the missingness mechanism.
Background: Monitoring the incidence of bacterial meningitis is important to plan and evaluate preventive policies. The study's aim was to estimate the incidence of bacterial meningitis by aetiological agent in the period 2001–2005 in Lazio, Italy (5.3 mln inhabitants). Methods: Data collected from four sources (hospital surveillance of bacterial meningitis, the laboratory information system, the mandatory infectious diseases notifications, and the hospital information system) were combined into a single archive. Results: 944 cases were reported, of which 89% were classified as community acquired. S. pneumoniae was the most frequent aetiological agent in Lazio, followed by N. meningitidis. Incidence of H. influenzae decreased during the period. 17% of the cases had an unknown aetiology and 13% unspecified bacteria. The overall incidence was 3.7/100,000. Children under 1 year were most affected (50.3/100,000), followed by 1–4 year olds (12.5/100,000). The percentage of meningitis due to aetiological agents included in the vaccine targets, not considering age, is 31%. Streptococcus spp. was the primary cause of meningitis in the first three months of life. The capture-recapture model estimated underreporting at 17.2% of the overall incidence. Conclusion: Vaccine policies should be planned and monitored based on these results. The integrated surveillance system allowed us to observe a drop in H. influenzae b meningitis incidence consequent to the implementation of a mass vaccination of newborns.
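To make the idea of a capture-recapture underreporting estimate concrete, here is a minimal Python sketch using Chapman's two-list estimator. It is a deliberate simplification: the study above combined four sources and does not describe its model here, whereas this sketch uses only two lists and assumes they are independent. The counts and function names are hypothetical and are not taken from the study.

```python
def chapman_estimate(n1, n2, m):
    """Chapman's two-list estimate of the total number of cases.

    n1, n2: cases found by list 1 and list 2 (hypothetical inputs)
    m:      cases found by both lists
    Assumes the two lists capture cases independently.
    """
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def underreporting_fraction(n1, n2, m):
    """Share of the estimated total that appears on neither list."""
    observed = n1 + n2 - m  # distinct cases seen on at least one list
    estimated = chapman_estimate(n1, n2, m)
    return (estimated - observed) / estimated


# Hypothetical counts, for illustration only.
total = chapman_estimate(99, 49, 24)
missed = underreporting_fraction(99, 49, 24)
print(f"estimated total: {total:.0f}, underreporting: {missed:.1%}")
```

When lists are dependent or capture probabilities vary between individuals, the two-list estimate can be biased, which is why multi-source studies typically fit models that allow for such dependence.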
Summary. Recent work (Seaman et al.; Mealli & Rubin) attempts to clarify the not always well‐understood difference between realised and everywhere definitions of missing at random (MAR) and missing completely at random. Another branch of the literature (Mohan et al.; Pearl & Mohan) exploits always‐observed covariates to give variable‐based definitions of MAR and missing completely at random. In this paper, we develop a unified taxonomy encompassing all approaches. In this taxonomy, the new concept of ‘complementary MAR’ is introduced, and its relationship with the concept of data observed at random is discussed. All relationships among these definitions are analysed and represented graphically. Conditional independence, both at the random variable and at the event level, is the formal language we adopt to connect all these definitions. Our paper covers both the univariate and the multivariate case, where attention is paid to monotone missingness and to the concept of sequential MAR. Specifically, for monotone missingness, we propose a sequential MAR definition that might be more appropriate than both everywhere and variable‐based MAR to model dropout in certain contexts.