This paper proposes a new approach to sparsity called the horseshoe estimator. The horseshoe is a close cousin of other widely used Bayes rules arising from, for example, double-exponential and Cauchy priors, in that it is a member of the same family of multivariate scale mixtures of normals. But the horseshoe enjoys a number of advantages over existing approaches, including its robustness, its adaptivity to different sparsity patterns, and its analytical tractability. We prove two theorems that formally characterize both the horseshoe's adeptness at large outlying signals, and its super-efficient rate of convergence to the correct estimate of the sampling density in sparse situations. Finally, using a combination of real and simulated data, we show that the horseshoe estimator corresponds quite closely to the answers one would get by pursuing a full Bayesian model-averaging approach using a discrete mixture prior to model signals and noise.
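As a rough illustration of the scale-mixture representation mentioned above, the sketch below assumes the standard horseshoe hierarchy with the global scale and noise variance both fixed at 1, that is, β_i | λ_i ~ N(0, λ_i^2) with λ_i ~ half-Cauchy(0, 1). The abstract does not spell this out, so treat it as an assumption. Under it, the shrinkage weight κ_i = 1 / (1 + λ_i^2) follows a Beta(1/2, 1/2) distribution, whose U shape piles mass near 0 (signals left alone) and near 1 (noise shrunk to zero). The function names are hypothetical.

```python
import math
import random


def sample_shrinkage_weights(n, seed=0):
    """Draw kappa = 1 / (1 + lambda^2) with lambda ~ half-Cauchy(0, 1)."""
    rng = random.Random(seed)
    weights = []
    for _ in range(n):
        # |tan(pi * (U - 1/2))| is a half-Cauchy(0, 1) draw for U ~ Uniform(0, 1).
        lam = abs(math.tan(math.pi * (rng.random() - 0.5)))
        weights.append(1.0 / (1.0 + lam * lam))
    return weights


def beta_half_half_cdf(x):
    """CDF of Beta(1/2, 1/2), also known as the arcsine distribution."""
    return (2.0 / math.pi) * math.asin(math.sqrt(x))


kappa = sample_shrinkage_weights(200_000)
near_zero = sum(k < 0.1 for k in kappa) / len(kappa)  # almost no shrinkage
near_one = sum(k > 0.9 for k in kappa) / len(kappa)   # almost total shrinkage
mean_kappa = sum(kappa) / len(kappa)

# Compare the Monte Carlo tail mass with the Beta(1/2, 1/2) value.
print(near_zero, near_one, beta_half_half_cdf(0.1))
```

The two tail fractions should agree with each other and with the arcsine CDF up to Monte Carlo error, which is what distinguishes this prior from, for example, a double-exponential prior whose implied shrinkage weights do not concentrate near zero.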
We propose a new data-augmentation strategy for fully Bayesian inference in models with binomial likelihoods. The approach appeals to a new class of Pólya-Gamma distributions, which are constructed in detail. A variety of examples are presented to show the versatility of the method, including logistic regression, negative binomial regression, nonlinear mixed-effects models, and spatial models for count data. In each case, our data-augmentation strategy leads to simple, effective methods for posterior inference that: (1) circumvent the need for analytic approximations, numerical integration, or Metropolis-Hastings; and (2) outperform other known data-augmentation strategies, both in ease of use and in computational efficiency. All methods, including an efficient sampler for the Pólya-Gamma distribution, are implemented in the R package BayesLogit. In the technical supplement appended to the end of the paper, we provide further details regarding the generation of Pólya-Gamma random variables; the empirical benchmarks reported in the main manuscript; and the extension of the basic data-augmentation framework to contingency tables and multinomial outcomes.
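The abstract does not state the identity behind the augmentation, so the following is a sketch under an assumption: that the scheme rests on the integral identity (e^ψ)^a / (1 + e^ψ)^b = 2^(-b) e^(κψ) E[exp(-ω ψ^2 / 2)], with κ = a - b/2 and ω ~ PG(b, 0). Rather than sampling ω, the check below integrates it out analytically using the PG(b, 0) Laplace transform E[exp(-ω t)] = cosh(sqrt(t/2))^(-b). The function names are hypothetical and this is not the BayesLogit sampler.

```python
import math


def binomial_kernel(psi, a, b):
    """Left-hand side: (e^psi)^a / (1 + e^psi)^b."""
    return math.exp(a * psi) / (1.0 + math.exp(psi)) ** b


def augmented_kernel(psi, a, b):
    """Right-hand side with omega ~ PG(b, 0) integrated out analytically.

    Assumes the Laplace transform E[exp(-omega * t)] = cosh(sqrt(t / 2)) ** (-b),
    evaluated at t = psi^2 / 2, which gives cosh(psi / 2) ** (-b).
    """
    kappa = a - b / 2.0
    expected_gaussian_factor = math.cosh(psi / 2.0) ** (-b)
    return 2.0 ** (-b) * math.exp(kappa * psi) * expected_gaussian_factor


# Compare both sides on a small grid of linear predictors and (a, b) pairs.
for psi in (-3.0, -0.5, 0.0, 1.2, 4.0):
    for a, b in ((0, 1), (1, 1), (3, 7)):
        print(psi, a, b, binomial_kernel(psi, a, b), augmented_kernel(psi, a, b))
```

The point of the identity is that, conditional on ω, the right-hand side is Gaussian in ψ, so a normal prior on regression coefficients stays conjugate within each Gibbs step.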
This paper examines continuous-time stochastic volatility models incorporating jumps in returns and volatility. We develop a likelihood-based estimation strategy and provide estimates of parameters, spot volatility, jump times, and jump sizes using S&P 500 and Nasdaq 100 index returns. Estimates of jump times, jump sizes, and volatility are particularly useful for identifying the effects of these factors during periods of market stress, such as those in 1987, 1997, and 1998. Using formal and informal diagnostics, we find strong evidence for jumps in volatility and jumps in returns. Finally, we study how these factors and estimation risk impact option pricing.
We study the classic problem of choosing a prior distribution for a location parameter β = (β_1, ..., β_p) as p grows large. First, we study the standard "global-local shrinkage" approach, based on scale mixtures of normals. Two theorems are presented which characterize certain desirable properties of shrinkage priors for sparse problems. Next, we review some recent results showing how Lévy processes can be used to generate infinite-dimensional versions of standard normal scale-mixture priors, along with new priors that have yet to be seriously studied in the literature. This approach provides an intuitive framework both for generating new regularization penalties and shrinkage rules, and for performing asymptotic analysis on existing models.
We develop a deep learning model to predict traffic flows. The main contribution is the development of an architecture that combines a linear model fitted using ℓ1 regularization with a sequence of tanh layers. The challenge in predicting traffic flows lies in the sharp nonlinearities due to transitions between free flow, breakdown, recovery and congestion. We show that deep learning architectures can capture these nonlinear spatio-temporal effects. The first layer identifies spatio-temporal relations among predictors, and the other layers model nonlinear relations. We illustrate our methodology on road sensor data from Interstate I-55 and predict traffic flows during two special events: a Chicago Bears football game and an extreme snowstorm event. Both cases have sharp traffic flow regime changes, occurring very suddenly, and we show how deep learning provides precise short-term traffic flow predictions.
This paper argues that the half-Cauchy distribution should replace the inverse-Gamma distribution as a default prior for a top-level scale parameter in Bayesian hierarchical models, at least for cases where a proper prior is necessary. Our arguments involve a blend of Bayesian and frequentist reasoning, and are intended to complement the original case made by Gelman (2006) in support of the folded-t family of priors. First, we generalize the half-Cauchy prior to the wider class of hypergeometric inverted-beta priors. We derive expressions for posterior moments and marginal densities when these priors are used for a top-level normal variance in a Bayesian hierarchical model. We go on to prove a proposition that, together with the results for moments and marginals, allows us to characterize the frequentist risk of the Bayes estimators under all global-shrinkage priors in the class. These theoretical results, in turn, allow us to study the frequentist properties of the half-Cauchy prior versus a wide class of alternatives. The half-Cauchy occupies a sensible "middle ground" within this class: it performs very well near the origin, but does not lead to drastic compromises in other parts of the parameter space. This provides an alternative, classical justification for the repeated, routine use of this prior. We also consider situations where the underlying mean vector is sparse, where we argue that the usual conjugate choice of an inverse-gamma prior is particularly inappropriate, and can lead to highly distorted posterior inferences. Finally, we briefly summarize some open issues in the specification of default priors for scale terms in hierarchical models.