We show that propensity score matching (PSM), an enormously popular method of preprocessing data for causal inference, often accomplishes the opposite of its intended goal—thus increasing imbalance, inefficiency, model dependence, and bias. The weakness of PSM comes from its attempts to approximate a completely randomized experiment, rather than, as with other matching methods, a more efficient fully blocked randomized experiment. PSM is thus uniquely blind to the often large portion of imbalance that can be eliminated by approximating full blocking with other matching methods. Moreover, in data balanced enough to approximate complete randomization, either to begin with or after pruning some observations, PSM approximates random matching which, we show, increases imbalance even relative to the original data. Although these results suggest researchers replace PSM with one of the other available matching methods, propensity scores have other productive uses.
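To make the object of the critique concrete, here is a minimal sketch of the kind of procedure the abstract refers to: greedy 1:1 nearest-neighbor matching on a propensity score, without replacement. The scores are assumed to be already estimated (for example, by logistic regression of treatment on covariates); `nearest_neighbor_match` and its caliper option are illustrative helpers, not the authors' code.

```python
import numpy as np

def nearest_neighbor_match(scores, treated, caliper=None):
    """Greedy 1:1 nearest-neighbor matching on an (already estimated)
    propensity score, without replacement.

    Returns a list of (treated_index, control_index) pairs."""
    t_idx = np.where(treated)[0]
    controls = list(np.where(~treated)[0])
    pairs = []
    # A common heuristic: match treated units in descending score order.
    for i in t_idx[np.argsort(-scores[t_idx])]:
        if not controls:
            break
        dists = np.abs(scores[controls] - scores[i])
        j = int(np.argmin(dists))
        # An optional caliper discards matches whose scores are too far apart.
        if caliper is None or dists[j] <= caliper:
            pairs.append((int(i), int(controls.pop(j))))
    return pairs

# Toy data: one treated unit (score 0.9) and three controls.
pairs = nearest_neighbor_match(
    np.array([0.9, 0.8, 0.85, 0.1]),
    np.array([True, False, False, False]),
)
```

Because the match is made on the scalar score alone, two units can be paired while remaining very different on the underlying covariates, which is the source of the imbalance the abstract describes.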
Recent advances in research tools for the systematic analysis of textual data are enabling exciting new research throughout the social sciences. For scholars of comparative politics, who are often interested in non-English and possibly multilingual textual datasets, these advances may be difficult to access. This article discusses practical issues that arise in the processing, management, translation, and analysis of textual data, with a particular focus on how procedures differ across languages. These procedures are combined in two applied examples of automated text analysis using the recently introduced Structural Topic Model. We also show how the model can be used to analyze data that have been translated into a single language via machine translation tools. All the methods we describe here are implemented in open-source software packages available from the authors.
In this study we resolve part of the confusion over how foreign aid affects armed conflict. We argue that aid shocks (severe decreases in aid revenues) inadvertently shift the domestic balance of power and potentially induce violence. During aid shocks, potential rebels gain bargaining strength vis-à-vis the government. To appease the rebels, the government must promise future resource transfers, but the government has no incentive to continue its promised transfers if the aid shock proves to be temporary. With the government unable to credibly commit to future resource transfers, violence breaks out. Using AidData's comprehensive dataset of bilateral and multilateral aid from 1981 to 2005, we evaluate the effects of foreign aid on violent armed conflict. In addition to rare-event logit analysis, we employ matching methods to account for the possibility that aid donors anticipate conflict. The results show that negative aid shocks significantly increase the probability of armed conflict onset.
This article provides theoretical and empirical solutions to two connected puzzles in the study of foreign aid and human rights: Do foreign aid donors use aid sanctions to punish repressive states, and if so, why? I show that donors impose aid sanctions selectively. Aid sanctions typically occur when repressive states do not have close political ties to aid donors, when violations have negative consequences for donors, and when violations are widely publicized. Using a data set of bilateral foreign aid to 118 developing countries between 1981 and 2004, I find that variation in these factors largely accounts for the differing aid sanctions that result from objectively similar rights violations by the governments of developing countries.
We propose a simplified approach to matching for causal inference that simultaneously optimizes balance (similarity between the treated and control groups) and matched sample size. Existing approaches either fix the matched sample size and maximize balance or fix balance and maximize sample size, leaving analysts to settle for suboptimal solutions or attempt manual optimization by iteratively tweaking their matching method and rechecking balance. To jointly maximize balance and sample size, we introduce the matching frontier, the set of matching solutions with maximum possible balance for each sample size. Rather than iterating, researchers can choose matching solutions from the frontier for analysis in one step. We derive fast algorithms that calculate the matching frontier for several commonly used balance metrics. We demonstrate this approach with analyses of the effect of sex on judging and job training programs that show how the methods we introduce can extract new knowledge from existing data sets. Software to implement the methods proposed in this article can be found at http://projects.iq.harvard.edu/frontier.
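The frontier idea can be illustrated with a deliberately simple sketch: greedily prune, one observation at a time, whichever unit's removal most reduces the absolute difference in covariate means between groups, and record one (sample size, imbalance) point per step. This single-covariate greedy heuristic is only an approximation of the concept; the article derives exact, fast algorithms for several balance metrics, and `greedy_frontier` is a hypothetical helper, not the authors' software.

```python
import numpy as np

def greedy_frontier(x, treated, min_n=2):
    """Trace an approximate balance/sample-size frontier for one covariate x:
    at each step, drop the observation whose removal most reduces the
    absolute difference in group means. Returns (n_remaining, imbalance)
    points, from the full sample down to min_n observations."""
    keep = np.ones(len(x), dtype=bool)

    def imbalance(mask):
        t, c = mask & treated, mask & ~treated
        if t.sum() == 0 or c.sum() == 0:
            return np.inf  # never empty a group entirely
        return abs(x[t].mean() - x[c].mean())

    frontier = [(int(keep.sum()), imbalance(keep))]
    while keep.sum() > min_n:
        best_i, best_val = None, np.inf
        for i in np.where(keep)[0]:
            keep[i] = False          # tentatively drop unit i
            val = imbalance(keep)
            keep[i] = True           # restore before trying the next unit
            if val < best_val:
                best_i, best_val = i, val
        keep[best_i] = False         # commit the best single-unit pruning
        frontier.append((int(keep.sum()), best_val))
    return frontier

# Toy data: the extreme control (x = 5.0) drives most of the imbalance.
frontier = greedy_frontier(np.array([0.0, 1.0, 1.0, 5.0]),
                           np.array([True, True, False, False]))
```

Plotting these points lets the analyst see exactly what balance is purchased at each sample size and pick a point on the frontier in one step, rather than iterating between matching and balance checks.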
We identify situations in which conditioning on text can address confounding in observational studies. We argue that a matching approach is particularly well-suited to this task, but existing matching methods are ill-equipped to handle high-dimensional text data. Our proposed solution is to estimate a low-dimensional summary of the text and condition on this summary via matching. We propose a method of text matching, topical inverse regression matching, that allows the analyst to match both on the topical content of confounding documents and the probability that each of these documents is treated. We validate our approach and illustrate the importance of conditioning on text to address confounding with two applications: the effect of perceptions of author gender on citation counts in the international relations literature and the effects of censorship on Chinese social media users. Verification Materials: The materials required to verify the computational reproducibility of the results, procedures, and analyses in this article are available on the American Journal of Political Science Dataverse within the Harvard Dataverse Network, at: https://doi.org/10.7910/DVN/HTMX3K.
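The core move, matching on a low-dimensional summary of the text rather than on raw high-dimensional term counts, can be sketched as follows. This is a drastically simplified stand-in for topical inverse regression matching: it assumes topic proportions and treatment probabilities have already been estimated elsewhere, and simply does nearest-neighbor matching on their concatenation. `text_match` is a hypothetical helper, not the authors' method.

```python
import numpy as np

def text_match(topics, pscore, treated):
    """Match each treated document to the nearest control document on a
    low-dimensional summary: estimated topic proportions stacked with the
    estimated probability of treatment. Returns {treated_idx: control_idx}."""
    Z = np.column_stack([topics, pscore])   # summary space used for matching
    t = np.where(treated)[0]
    c = np.where(~treated)[0]
    matches = {}
    for i in t:
        dists = np.linalg.norm(Z[c] - Z[i], axis=1)
        matches[int(i)] = int(c[np.argmin(dists)])
    return matches

# Toy data: document 0 (treated) is topically close to control document 1.
matches = text_match(
    np.array([[0.9, 0.1], [0.85, 0.15], [0.1, 0.9]]),  # topic proportions
    np.array([0.7, 0.65, 0.2]),                        # P(treated | text)
    np.array([True, False, False]),
)
```

Matching on both components is the point: topical similarity alone can pair documents with very different treatment probabilities, and vice versa.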
This article shows how statistical matching methods can be used to select “most similar” cases for qualitative analysis. I first offer a methodological justification for research designs based on selecting most similar cases. I then discuss the applicability of existing matching methods to the task of selecting most similar cases and propose adaptations to meet the unique requirements of qualitative analysis. Through several applications, I show that matching methods have advantages over traditional selection in “most similar” case designs: They ensure that most similar cases are in fact most similar; they make scope conditions, assumptions, and measurement explicit; and they make case selection transparent and replicable.
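A minimal sketch of the selection logic, under assumptions not spelled out in the abstract: among all pairs of cases that differ on the explanatory variable of interest, pick the pair closest in Mahalanobis distance on the background covariates. The abstract proposes adaptations of existing matching methods; `most_similar_pair` is only an illustrative baseline, not the article's procedure.

```python
import numpy as np
from itertools import combinations

def most_similar_pair(X, exposure):
    """Among all pairs of cases that differ on the exposure of interest,
    return the pair (i, j) that is closest in squared Mahalanobis distance
    on the background covariates X (rows = cases), plus that distance."""
    VI = np.linalg.inv(np.cov(X, rowvar=False))  # inverse covariance of X
    best, best_d = None, np.inf
    for i, j in combinations(range(len(X)), 2):
        if exposure[i] == exposure[j]:
            continue  # a "most similar" design needs cases that differ here
        d = X[i] - X[j]
        dist = float(d @ VI @ d)
        if dist < best_d:
            best, best_d = (i, j), dist
    return best, best_d

# Toy data: four cases, two covariates; cases 0 and 1 are near-twins
# that differ on the exposure.
best, dist = most_similar_pair(
    np.array([[0.0, 0.0], [0.1, 0.1], [5.0, 5.0], [3.0, 1.0]]),
    np.array([1, 0, 1, 0]),
)
```

Because the covariates and the distance metric are written down explicitly, the resulting case selection is transparent and replicable, which is the advantage the abstract claims over informal "most similar" selection.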
We highlight common problems in the application of random treatment assignment in large-scale program evaluation. Random assignment is the defining feature of modern experimental design, yet errors in design, implementation, and analysis often result in real-world applications not benefiting from its advantages. The errors discussed here cover the control of variability, levels of randomization, size of treatment arms, and power to detect causal effects, as well as the many problems that commonly lead to post-treatment bias. We illustrate these issues by identifying numerous serious errors in the Medicare Health Support evaluation and offering recommendations to improve the design and analysis of this and other large-scale randomized experiments.
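One of the design quantities listed above, power to detect causal effects, is easy to check before a trial is fielded. Below is a minimal sketch using the standard normal approximation for a two-arm trial with equal allocation and a known outcome standard deviation; it is a generic textbook calculation, not an analysis from the evaluation discussed in the abstract.

```python
from statistics import NormalDist

def power_two_sample(effect, sd, n_per_arm, alpha=0.05):
    """Approximate power of a two-arm randomized trial to detect a true
    mean difference `effect`, via the normal approximation to the
    two-sided two-sample z-test (far rejection tail ignored)."""
    z = NormalDist()
    se = sd * (2.0 / n_per_arm) ** 0.5      # SE of the difference in means
    z_crit = z.inv_cdf(1 - alpha / 2)       # two-sided critical value
    return z.cdf(abs(effect) / se - z_crit)

# Classic benchmark: a standardized effect of 0.5 with 64 units per arm
# yields roughly 80% power at the 5% level.
power = power_two_sample(effect=0.5, sd=1.0, n_per_arm=64)
```

Undersized treatment arms of the kind the abstract warns about show up immediately in such a calculation as power well below conventional thresholds.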