Gary King scite author profile

Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference

Although published works rarely include causal estimates from more than a few model specifications, authors usually choose the presented estimates from numerous trial runs readers never see. Given the often large variation in estimates across choices of control variables, functional forms, and other modeling assumptions, how can researchers ensure that the few estimates presented are accurate or representative? How do readers know that publications are not merely demonstrations that it is possible to find a specification that fits the author's favorite hypothesis? And how do we evaluate or even define statistical properties like unbiasedness or mean squared error when no unique model or estimator even exists? Matching methods, which offer the promise of causal inference with fewer assumptions, constitute one possible way forward, but crucial results in this fast-growing methodological literature are often grossly misinterpreted. We explain how to avoid these


MatchIt: Nonparametric Preprocessing for Parametric Causal Inference

Ho, Imai, et al. (2011)


MatchIt implements the suggestions of Ho, Imai, King, and Stuart (2007) for improving parametric statistical models by preprocessing data with nonparametric matching methods. MatchIt implements a wide range of sophisticated matching methods, making it possible to greatly reduce the dependence of causal inferences on hard-to-justify, but commonly made, statistical modeling assumptions. The software also easily fits into existing research practices since, after preprocessing data with MatchIt, researchers can use whatever parametric model they would have used without MatchIt, but produce inferences with substantially more robustness and less sensitivity to modeling assumptions. MatchIt is an R program, and also works seamlessly with Zelig.
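The preprocess-then-model workflow described above can be illustrated with a minimal Python sketch. This is not MatchIt's API (MatchIt is an R package); the function names, the single-covariate setup, and the toy data are all hypothetical. It assumes greedy 1:1 nearest-neighbor matching without replacement, followed by the simplest possible analysis step, a difference in means, where a researcher would normally fit whatever parametric model they had planned.

```python
from statistics import mean

def nearest_neighbor_match(treated, controls):
    """Greedy 1:1 nearest-neighbor matching without replacement on one covariate.

    Each unit is a (covariate, outcome) tuple. Returns a list of
    (treated_unit, control_unit) pairs.
    """
    available = list(controls)
    pairs = []
    for t in treated:
        if not available:
            break  # no controls left to match
        # Pick the remaining control closest to this treated unit.
        best = min(available, key=lambda c: abs(c[0] - t[0]))
        available.remove(best)
        pairs.append((t, best))
    return pairs

def matched_difference_in_means(pairs):
    """Stand-in analysis step: mean outcome gap across matched pairs."""
    return mean(t[1] - c[1] for t, c in pairs)

# Hypothetical toy data: (covariate x, outcome y).
treated = [(1.0, 5.0), (2.0, 7.0), (3.0, 9.0)]
controls = [(0.9, 4.0), (2.1, 6.0), (3.2, 8.0), (10.0, 30.0), (12.0, 35.0)]

pairs = nearest_neighbor_match(treated, controls)
effect = matched_difference_in_means(pairs)
```

In this toy data the two controls with covariate values far from any treated unit are never selected, so the later analysis does not have to extrapolate to them.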


Causal Inference without Balance Checking: Coarsened Exact Matching

Iacus, King, Porro (2012). Polit. Anal.

2,491

1,869


We discuss a method for improving causal inferences called "Coarsened Exact Matching" (CEM), and the new "Monotonic Imbalance Bounding" (MIB) class of matching methods from which CEM is derived. We summarize what is known about CEM and MIB, derive and illustrate several new desirable statistical properties of CEM, and then propose a variety of useful extensions. We show that CEM possesses a wide range of statistical properties not available in most other matching methods but is at the same time exceptionally easy to comprehend and use. We focus on the connection between theoretical properties and practical applications. We also make available easy-to-use open source software for R, Stata, and SPSS that implements all our suggestions.
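As a rough illustration of the basic idea behind CEM, the Python sketch below coarsens each covariate into bins, matches exactly on the bin signature, and keeps only strata that contain both treated and control units. It is a simplified sketch and not the authors' software: the function names, cutpoints, and data are hypothetical, and the stratum weights used in a full CEM analysis are omitted.

```python
from collections import defaultdict

def coarsen(value, cutpoints):
    """Return the index of the bin that `value` falls into."""
    return sum(value >= c for c in cutpoints)

def cem_strata(units, cutpoints):
    """Group units by coarsened covariates; keep strata with both groups.

    `units` is a list of dicts with keys "treated" (bool) and "x" (tuple of
    covariates); `cutpoints` holds one list of bin edges per covariate.
    """
    strata = defaultdict(list)
    for u in units:
        key = tuple(coarsen(v, cuts) for v, cuts in zip(u["x"], cutpoints))
        strata[key].append(u)
    # Prune strata that lack either a treated or a control unit.
    return {
        key: members
        for key, members in strata.items()
        if any(m["treated"] for m in members)
        and any(not m["treated"] for m in members)
    }

# Hypothetical units with covariates (age, years of education).
units = [
    {"id": "a", "treated": True,  "x": (23, 11)},
    {"id": "b", "treated": False, "x": (27, 10)},
    {"id": "c", "treated": True,  "x": (45, 16)},
    {"id": "d", "treated": False, "x": (62, 12)},
    {"id": "e", "treated": False, "x": (48, 17)},
]
cutpoints = [[30, 50], [12]]  # assumed bin edges for age and education
matched = cem_strata(units, cutpoints)
matched_ids = sorted(m["id"] for members in matched.values() for m in members)
```

Here unit "d" is dropped because no treated unit shares its coarsened stratum; choosing the cutpoints is the user's main decision in this sketch.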


Logistic Regression in Rare Events Data

Tomz, King, Zeng (2003). J. Stat. Soft.

1,220

1,587


Computational Social Science

Lazer, Pentland, Adamic, et al. (2009). Science

2,685

1,525


We live life in the network. When we wake up in the morning, we check our e-mail, make a quick phone call, walk outside (our movements captured by a high definition video camera), get on the bus (swiping our RFID mass transit cards) or drive (using a transponder to zip through the tolls). We arrive at the airport, making sure to purchase a sandwich with a credit card before boarding the plane, and check our BlackBerries shortly before takeoff. Or we visit the doctor or the car mechanic, generating digital records of what our medical or automotive problems are. We post blog entries confiding to the world our thoughts and feelings, or maintain personal


The Parable of Google Flu: Traps in Big Data Analysis

Lazer, Kennedy, King, et al. (2014). Science

2,021

1,505


Amelia II: A Program for Missing Data

Honaker, King, Blackwell (2011). J. Stat. Soft.

1,940

1,536


Amelia II is a complete R package for multiple imputation of missing data. The package implements a new expectation-maximization with bootstrapping algorithm that works faster, with larger numbers of variables, and is far easier to use, than various Markov chain Monte Carlo approaches, but gives essentially the same answers. The program also improves imputation models by allowing researchers to put Bayesian priors on individual cell values, thereby including a great deal of potentially valuable and extensive information. It also includes features to accurately impute cross-sectional datasets, individual time series, or sets of time series for different cross-sections. A full set of graphical diagnostics is also available. The program is easy to use, and the simplicity of the algorithm makes it far more robust; both a simple command line and extensive graphical user interface are included.


Making the Most of Statistical Analyses: Improving Interpretation and Presentation

King, Tomz, Wittenberg (2000). American Journal of Political Science

2,501

1,004


We show that social scientists often do not take full advantage of the information available in their statistical results and thus miss opportunities to present quantities that could shed the greatest light on their research questions. In this article we suggest an approach, built on the technique of statistical simulation, to extract the currently overlooked information and present it in a reader-friendly manner. More specifically, we show how to convert the raw results of any statistical procedure into expressions that (1) convey numerically precise estimates of the quantities of greatest substantive interest, (2) include reasonable measures of uncertainty about those estimates, and (3) require little specialized knowledge to understand. The following simple statement satisfies our criteria: "Other things being equal, an additional year of education would increase your annual income by $1,500 on average, plus or minus about $500." Any smart high school student would understand that sentence, no matter how sophisticated the statistical model and powerful the computers used to produce it. The sentence is substantively informative because it conveys a key quantity of interest in terms the reader wants to know. At the same time, the sentence indicates how uncertain the researcher is about the estimated quantity of interest. Inferences are never certain, so any honest presentation of statistical results must include some qualifier, such as "plus or minus $500" in the present example. Our computer program, "CLARIFY: Software for Interpreting and Presenting Statistical Results," designed to implement the methods described in this article, is available at http://GKing.Harvard.Edu, and won the Okidata Best Research Software Award for 1999. We thank Bruce Bueno de Mesquita, Jorge Domínguez, Geoff Garrett, Jay
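The simulation idea in this abstract can be sketched in a few lines of Python. This is a simplified illustration and not CLARIFY itself: it assumes a single coefficient whose sampling distribution is approximately normal, and the estimate (1,500), standard error (250), and function names are hypothetical values chosen only to echo the education example. A fuller treatment would draw all model parameters jointly and then compute the quantity of interest from each draw.

```python
import random
from statistics import mean

def simulate_effect(estimate, std_error, n_draws=10000, seed=0):
    """Draw the coefficient from its assumed normal sampling distribution."""
    rng = random.Random(seed)
    return [rng.gauss(estimate, std_error) for _ in range(n_draws)]

def summarize(draws, level=0.95):
    """Return the mean and a central percentile interval of the draws."""
    ordered = sorted(draws)
    tail = (1 - level) / 2
    lo = ordered[int(tail * len(ordered))]
    hi = ordered[int((1 - tail) * len(ordered)) - 1]
    return mean(ordered), lo, hi

# Hypothetical inputs: dollars of annual income per extra year of schooling.
draws = simulate_effect(1500.0, 250.0)
point, lower, upper = summarize(draws)
print(f"about ${point:,.0f}, 95% interval ${lower:,.0f} to ${upper:,.0f}")
```

The printed sentence reports a point estimate together with its uncertainty in the units the reader cares about, which is the style of presentation the abstract argues for.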

