Directed acyclic graphs (DAGs), which offer systematic representations of causal relationships, have become an established framework for causal inference in epidemiology, and are often used to determine covariate adjustment sets for minimizing confounding bias. DAGitty is a popular web application for drawing and analysing DAGs. Here we introduce the R package ‘dagitty’, which provides access to all of the capabilities of the DAGitty web application within the R platform for statistical computing, and also offers several new functions. We describe how the R package ‘dagitty’ can be used to: evaluate whether a DAG is consistent with the dataset it is intended to represent; enumerate ‘statistically equivalent’ but causally different DAGs; and identify exposure-outcome adjustment sets that are valid for causally different but statistically equivalent DAGs. This functionality enables epidemiologists to detect causal misspecifications in DAGs and make robust inferences that remain valid for a range of different DAGs.
Availability
The R package ‘dagitty’ is available through the Comprehensive R Archive Network (CRAN) at https://cran.r-project.org/web/packages/dagitty/. The source code is available on GitHub at https://github.com/jtextor/dagitty. The web application ‘DAGitty’ is free software, licensed under the GNU General Public License (GPL) version 2, and is available at http://dagitty.net/.
Background
Directed acyclic graphs (DAGs) are an increasingly popular approach for identifying confounding variables that require conditioning when estimating causal effects. This review examined the use of DAGs in applied health research to inform recommendations for improving their transparency and utility in future research.
Methods
Original health research articles published during 1999–2017 mentioning ‘directed acyclic graphs’ (or similar) or citing DAGitty were identified from Scopus, Web of Science, Medline and Embase. Data were extracted on the reporting of: estimands, DAGs and adjustment sets, alongside the characteristics of each article’s largest DAG.
Results
A total of 234 articles were identified that reported using DAGs. A fifth (n = 48, 21%) reported their target estimand(s) and half (n = 115, 48%) reported the adjustment set(s) implied by their DAG(s).
Two-thirds of the articles (n = 144, 62%) made at least one DAG available. DAGs varied in size but averaged 12 nodes [interquartile range (IQR): 9–16, range: 3–28] and 29 arcs (IQR: 19–42, range: 3–99). The median saturation (i.e. percentage of total possible arcs) was 46% (IQR: 31–67, range: 12–100). 37% (n = 53) of the DAGs included unobserved variables, 17% (n = 25) included ‘super-nodes’ (i.e. nodes containing more than one variable) and 34% (n = 49) were visually arranged so that the constituent arcs flowed in the same direction (e.g. top-to-bottom).
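The saturation figure reported above can be reproduced for any single DAG with a few lines of Python. This is a minimal sketch, assuming that "total possible arcs" means n(n - 1)/2 for a DAG with n nodes (the largest number of arcs that still allows an acyclic ordering); the function names are hypothetical and not taken from the review.

```python
def max_arcs(n_nodes: int) -> int:
    """Largest number of arcs a DAG on n_nodes can hold (assumed: n(n-1)/2)."""
    return n_nodes * (n_nodes - 1) // 2


def saturation(n_nodes: int, n_arcs: int) -> float:
    """Percentage of the possible arcs that are actually drawn."""
    possible = max_arcs(n_nodes)
    if possible == 0:
        raise ValueError("saturation is undefined for fewer than two nodes")
    if not 0 <= n_arcs <= possible:
        raise ValueError("n_arcs must lie between 0 and the maximum for a DAG")
    return 100.0 * n_arcs / possible


if __name__ == "__main__":
    # A DAG with the reported average size: 12 nodes and 29 arcs.
    print(round(saturation(12, 29), 1))
```

A DAG of the average reported size (12 nodes, 29 arcs) has 66 possible arcs and so a saturation of about 44%. This need not equal the reported median of 46%, because the median of per-DAG ratios is not the ratio of the averages.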
Conclusion
There is substantial variation in the use and reporting of DAGs in applied health research. Although this partly reflects their flexibility, it also highlights some potential areas for improvement. This review hence offers several recommendations to improve the reporting and use of DAGs in future research.
This article discusses three statistical paradoxes that pervade epidemiological research: Simpson's paradox, Lord's paradox, and suppression. These paradoxes have important implications for the interpretation of evidence from observational studies. This article uses hypothetical scenarios to illustrate how the three paradoxes are different manifestations of one phenomenon (the reversal paradox), depending on whether the outcome and explanatory variables are categorical, continuous or a combination of both; the issues and remedies for any one are therefore similar for all three. Although the three statistical paradoxes occur in different types of variables, they share the same characteristic: the association between two variables can be reversed, diminished, or enhanced when another variable is statistically controlled for. Understanding the concepts and theory behind these paradoxes provides insights into some controversial or contradictory research findings. These paradoxes show that prior knowledge and underlying causal theory play an important role in the statistical modelling of epidemiological data, where incorrect use of statistical models might produce consistent, replicable, yet erroneous results.
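The categorical form of the reversal (Simpson's paradox) can be checked by hand with a small table. The sketch below uses illustrative counts that are not taken from the article: two hypothetical treatment arms, "A" and "B", each split into two strata. It compares success rates within each stratum and after pooling.

```python
from fractions import Fraction

# Illustrative (successes, total) counts per arm and stratum; not from the article.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def stratum_rate(arm: str, stratum: str) -> Fraction:
    """Success rate for one arm within one stratum."""
    successes, total = counts[arm][stratum]
    return Fraction(successes, total)


def pooled_rate(arm: str) -> Fraction:
    """Success rate for one arm after collapsing over the strata."""
    successes = sum(s for s, _ in counts[arm].values())
    total = sum(t for _, t in counts[arm].values())
    return Fraction(successes, total)


if __name__ == "__main__":
    for stratum in ("small", "large"):
        print(stratum, float(stratum_rate("A", stratum)), float(stratum_rate("B", stratum)))
    print("pooled", float(pooled_rate("A")), float(pooled_rate("B")))
```

With these counts, arm A has the higher success rate in both strata, yet arm B has the higher rate once the strata are pooled, because the two arms are distributed very differently across the strata.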
SUMMARY
The relation between initial disease status and subsequent change following treatment has attracted great interest in clinical research. However, statisticians have repeatedly warned against correlating/regressing change with baseline due to two methodological concerns known as mathematical coupling and regression to the mean. Oldham's method and Blomqvist's formula are the two most often adopted methods to rectify these problems. The aims of this article are to review briefly the proposed solutions in the statistical and psychological literature, and to clarify the popular misconception that Blomqvist's formula is superior to Oldham's method. We argue that this misconception is due to a failure to recognize that the heterogeneity of individual responses to treatment is a source of regression to the mean in the analysis of the relation between change and initial value. Furthermore, we demonstrate how each method actually answers different research questions, and how confusion arises when this is not always understood.
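Mathematical coupling can be made concrete with closed-form correlations. The sketch below assumes that change is defined as baseline minus follow-up, and that Oldham's method means correlating change with the mean of baseline and follow-up rather than with baseline alone, which is how the method is commonly described; the abstract itself does not spell this out. The function names and input values are hypothetical.

```python
import math


def _var_change(sd1: float, sd2: float, r: float) -> float:
    """Variance of (baseline - follow-up) given the SDs and their correlation r."""
    return sd1**2 + sd2**2 - 2 * r * sd1 * sd2


def corr_change_baseline(sd1: float, sd2: float, r: float) -> float:
    """Correlation between change (x1 - x2) and baseline x1."""
    var_change = _var_change(sd1, sd2, r)
    if var_change <= 0:
        raise ValueError("change has zero variance for these inputs")
    return (sd1**2 - r * sd1 * sd2) / (sd1 * math.sqrt(var_change))


def corr_change_mean(sd1: float, sd2: float, r: float) -> float:
    """Correlation between change (x1 - x2) and the mean (x1 + x2) / 2."""
    var_change = _var_change(sd1, sd2, r)
    var_mean = (sd1**2 + sd2**2 + 2 * r * sd1 * sd2) / 4
    if var_change <= 0 or var_mean <= 0:
        raise ValueError("degenerate variance for these inputs")
    return ((sd1**2 - sd2**2) / 2) / math.sqrt(var_change * var_mean)


if __name__ == "__main__":
    # Equal variances before and after, correlation 0.5 between the two measurements.
    print(corr_change_baseline(1.0, 1.0, 0.5))
    print(corr_change_mean(1.0, 1.0, 0.5))
```

When the two measurements have equal variance, the correlation between change and baseline equals sqrt((1 - r) / 2), which is nonzero whenever r < 1, whereas the correlation between change and the mean is exactly zero. Under this formulation a nonzero correlation between change and the mean arises only when the variance itself differs between baseline and follow-up.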
Some researchers have recently questioned the validity of associations between birth weight and health in later life. They argue that these associations might be due in part to inappropriate statistical adjustment for variables on the causal pathway (such as current body size), which creates an artifactual statistical effect known as the "reversal paradox." Computer simulations were conducted for three hypothetical relations between birth weight and adult blood pressure. The authors examined the effect of statistically adjusting for different correlations between current weight and birth weight and between current weight and adult blood pressure to assess their impact on associations between birth weight and blood pressure. When there was no genuine relation between birth weight and blood pressure, adjustment for current weight created an inverse association whose size depended on the magnitude of the positive correlations between current weight and birth weight and between current weight and blood pressure. When there was a genuine inverse relation between birth weight and blood pressure, the association was exaggerated following adjustment for current weight, whereas a positive relation between birth weight and blood pressure could be reversed after adjusting for current weight. Thus, researchers must consider the reversal paradox when adjusting for variables that lie within causal pathways.
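The three scenarios can be illustrated without simulation by using the standard closed-form expression for a standardized regression coefficient with two predictors. This is a swapped-in technique, not the authors' computer simulation. The correlation values below are hypothetical, and the names `r_bw_bp`, `r_bw_cw` and `r_cw_bp` stand for the correlations between birth weight and blood pressure, birth weight and current weight, and current weight and blood pressure.

```python
def adjusted_beta(r_bw_bp: float, r_bw_cw: float, r_cw_bp: float) -> float:
    """Standardized coefficient for birth weight when blood pressure is
    regressed on birth weight and current weight together."""
    # The three correlations must form a valid (positive definite) matrix.
    det = (1 + 2 * r_bw_bp * r_bw_cw * r_cw_bp
           - r_bw_bp**2 - r_bw_cw**2 - r_cw_bp**2)
    if det <= 0:
        raise ValueError("correlations are not jointly admissible")
    return (r_bw_bp - r_bw_cw * r_cw_bp) / (1 - r_bw_cw**2)


if __name__ == "__main__":
    # Hypothetical correlations of current weight with the other two variables.
    r_bw_cw, r_cw_bp = 0.3, 0.4
    for r_bw_bp in (0.0, -0.1, 0.1):
        print(r_bw_bp, round(adjusted_beta(r_bw_bp, r_bw_cw, r_cw_bp), 3))
```

With these hypothetical inputs, a null unadjusted association becomes inverse after adjustment, an inverse one becomes more strongly inverse, and a weak positive one changes sign, mirroring the three patterns described above. The size of the shift is governed by the product of the two correlations involving current weight.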
Although some evidence was found to support the relation between tooth loss and CVD mortality, causal mechanisms underlying this association remain uncertain.