Adaptive Empirical Bayesian Smoothing Splines

Serra, Paulo; Krivobokova, Tatyana

doi:10.1214/16-ba997

Cited by 19 publications

(25 citation statements)

References 36 publications


“…We study frequentist behavior of the posterior distributions and the resulting credible sets for f and its mixed partial derivatives, in terms of norm using this weak notion of BvM theorem is considered in [4]. Adaptive L2-credible regions with adequate frequentist coverage are constructed using the empirical Bayes approach in [34] for the Gaussian white noise model and in [27] for the nonparametric regression model using smoothing splines. In the setting of the Gaussian white noise model, Ray [23] constructed adaptive L2-credible sets using a weak BvM theorem, and also adaptive L∞-credible band using a spike and slab prior.…”

Section: Introduction Consider the Nonparametric Regression Model (mentioning)

confidence: 99%

Supremum norm posterior contraction and credible sets for nonparametric multivariate regression

Yoo, Ghosal

2016

Ann. Statist.


In the setting of nonparametric multivariate regression with unknown error variance σ², we study asymptotic properties of a Bayesian method for estimating a regression function f and its mixed partial derivatives. We use a random series of tensor product of B-splines with normal basis coefficients as a prior for f, and σ is either estimated using the empirical Bayes approach or is endowed with a suitable prior in a hierarchical Bayes approach. We establish pointwise, L2 and L∞-posterior contraction rates for f and its mixed partial derivatives, and show that they coincide with the minimax rates. Our results cover even the anisotropic situation, where the true regression function may have different smoothness in different directions. Using the convergence bounds, we show that pointwise, L2 and L∞-credible sets for f and its mixed partial derivatives have guaranteed frequentist coverage with optimal size. New results on tensor products of B-splines are also obtained in the course.

MSC 2010 subject classifications: Primary 62G08; secondary 62G05, 62G15, 62G20.

arXiv:1411.6716v3 [math.ST] 24 Sep 2015

... pointwise, L2 and L∞ (supremum) distances. We assume that the true regression function f₀ belongs to an anisotropic Hölder space (see Definition 2.1 below), and the errors under the true distribution are sub-Gaussian.

Posterior contraction rates for regression functions in the L2-norm are well studied, but results for the stronger L∞-norm are limited. Giné and Nickl [14] studied contraction rates in Lr-metric, 1 ≤ r ≤ ∞, and obtained optimal rate using conjugacy for the Gaussian white noise model and a rate for density estimation based on a random wavelet series and Dirichlet process mixture using a testing approach. In the same context, Castillo [2] introduced techniques based on semiparametric Bernstein-von Mises (BvM) theorems to obtain optimal L∞-contraction rates. Hoffmann et al. [17] derived adaptive optimal L∞-contraction rate for the white noise model and also for density estimation. Scricciolo [25] applied the techniques of [14] to obtain L∞-rates using Gaussian kernel mixtures prior for analytic true densities.

De Jonge and van Zanten [9] used finite random series based on tensor products of B-splines to construct a prior for nonparametric regression and derived adaptive L2-contraction rate for the regression function in the isotropic case. A BvM theorem for the posterior of σ is treated in [10]. Shen and Ghosal [28,29] used tensor products of B-splines respectively for Bayesian multivariate density estimation and high dimensional density regression in the anisotropic case.

Nonparametric confidence bands for an unknown function were considered by [30, 1] and more recently by [6,13,5]. A Bayesian approaches the problem by constructing a credible set with a prescribed posterior probability. It is then natural to ask if the credible set has adequate frequentist coverage for large sample sizes. F...
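The question of whether a credible set has adequate frequentist coverage can be illustrated with a small simulation. The sketch below is not the nonparametric regression setting of the cited paper: it assumes a toy conjugate model (observations X_i ~ N(θ, 1) with prior θ ~ N(0, τ²)), and the function names `credible_interval` and `coverage` are hypothetical, chosen only for illustration. It estimates how often a 95% credible interval contains a fixed true θ₀ over repeated samples.

```python
import math
import random


def credible_interval(xs, tau2=1.0, z=1.959964):
    """95% credible interval for theta in the toy model
    X_i ~ N(theta, 1), theta ~ N(0, tau2) (an assumed model)."""
    n = len(xs)
    post_prec = n + 1.0 / tau2        # posterior precision
    post_mean = sum(xs) / post_prec   # posterior mean (prior mean is 0)
    half = z / math.sqrt(post_prec)   # z times posterior standard deviation
    return post_mean - half, post_mean + half


def coverage(theta0, n=50, reps=2000, tau2=1.0, seed=0):
    """Monte Carlo estimate of the frequentist coverage of the
    credible interval when the data are generated under theta0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xs = [rng.gauss(theta0, 1.0) for _ in range(n)]
        lo, hi = credible_interval(xs, tau2)
        hits += lo <= theta0 <= hi
    return hits / reps


if __name__ == "__main__":
    # Prior compatible with the truth: coverage should be near nominal.
    print(coverage(0.5, tau2=1.0))
    # Tight prior far from the truth: the interval is pulled toward 0
    # and coverage collapses.
    print(coverage(3.0, tau2=0.01))
```

In this toy setting the credible interval behaves like a confidence interval when the prior is not in conflict with the truth, and fails badly when a concentrated prior sits far from θ₀. This is only a one-dimensional analogue of the phenomenon that the adaptive credible-set literature addresses.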


“…We emphasize that the scope of the DDM P(·|X) in delivering the minimax rates extends further than just these four scales. Theorem 4 implies the minimax results of type (2) for all scales for which (3) holds; for example, in view of (21), for all ellipsoids E(a) and hyperrectangles H(a) defined by (20). Other smoothness scales can also be treated.…”

Section: Discussion (mentioning)

confidence: 82%

“…Recall the definitions (20) of ellipsoid E(a) and hyperrectangle H(a). First consider the hyperrectangles H(a).…”

Section: A11 Proof Of (21) (mentioning)

confidence: 99%

“…A way to achieve adaptivity is to remove the so-called deceptive parameters (in [21] they are called inconvenient truths) from Θ, i.e., consider a strictly smaller set Θ′_cov ⊂ Θ. Examples are: Θ′_cov = Θ_ss, the so-called self-similar parameters (related to Sobolev/Besov scales) introduced in [18] and later studied in [5], [6], [21], [17], [20]; and Θ′_cov = Θ_pt, a more general class of polished tail parameters introduced in [21]. More literature on adaptive minimax confidence sets: [16], [3], [18], [13], [10], [11], [14], [5], [6], [17], [21,22].…”

Section: Introduction (mentioning)

confidence: 99%


On coverage and local radial rates of credible sets

Belitser

2017

Ann. Statist.


For a general statistical model, we introduce the notion of data dependent measure (DDM) on the model parameter. Typical examples of DDM are the posterior distributions. Like for posteriors, the quality of a DDM is characterized by the contraction rate which we allow to be local, i.e., depending on the parameter. We construct confidence sets as DDM-credible sets and address the issue of optimality of such sets, via a trade-off between its "size" (the local radial rate) and its coverage probability. In the mildly ill-posed inverse signal-in-white-noise model, we construct a DDM as empirical Bayes posterior with respect to a certain prior, and define its (default) credible set. Then we introduce excessive bias restriction (EBR), more general than self-similarity and polished tail condition recently studied in the literature. Under EBR, we establish the confidence optimality of our credible set with some local (oracle) radial rate. We also derive the oracle estimation inequality and the oracle DDM-contraction rate, non-asymptotically and uniformly in ℓ2. The obtained local results are more powerful than global: adaptive minimax results for a number of smoothness scales follow as consequence, in particular, the ones considered by [21].

MSC2010 subject classification: primary 62G15, 62C05; secondary 62G99.


“…The authors of [12] have followed up their work with investigating adaptive pointwise credible sets using rescaled (integrated) Brownian motion as a prior in the nonparametric regression model. Random smoothing spline priors with Gaussian weights on the spline coefficients are shown in [11] to give honest credible sets in the nonparametric regression problem under the self-similarity condition.…”

Section: Choice Of Basis (mentioning)

confidence: 99%