In the setting of nonparametric multivariate regression with unknown error variance σ 2 , we study asymptotic properties of a Bayesian method for estimating a regression function f and its mixed partial derivatives. We use a random series of tensor product of B-splines with normal basis coefficients as a prior for f , and σ is either estimated using the empirical Bayes approach or is endowed with a suitable prior in a hierarchical Bayes approach. We establish pointwise, L2 and L∞-posterior contraction rates for f and its mixed partial derivatives, and show that they coincide with the minimax rates. Our results cover even the anisotropic situation, where the true regression function may have different smoothness in different directions. Using the convergence bounds, we show that pointwise, L2 and L∞-credible sets for f and its mixed partial derivatives have guaranteed frequentist coverage with optimal size. New results on tensor products of B-splines are also obtained in the course. MSC 2010 subject classifications: Primary 62G08; secondary 62G05, 62G15, 62G20 1 imsart-aos ver. 2014/10/16 file: supcredible_rev.tex date: October 9, 2018 arXiv:1411.6716v3 [math.ST] 24 Sep 2015 2 W. W. YOO AND S. GHOSALpointwise, L 2 and L ∞ (supremum) distances. We assume that the true regression function f 0 belongs to an anisotropic Hölder space (see Definition 2.1 below), and the errors under the true distribution are sub-Gaussian.Posterior contraction rates for regression functions in the L 2 -norm are well studied, but results for the stronger L ∞ -norm are limited. Giné and Nickl [14] studied contraction rates in L r -metric, 1 ≤ r ≤ ∞, and obtained optimal rate using conjugacy for the Gaussian white noise model and a rate for density estimation based on a random wavelet series and Dirichlet process mixture using a testing approach. In the same context, Castillo [2] introduced techniques based on semiparametric Bernstein-von Misses (BvM) theorems to obtain optimal L ∞ -contraction rates. Hoffman et al. [17] derived adaptive optimal L ∞ -contraction rate for the white noise model and also for density estimation. Scricciolo [25] applied the techniques of [14] to obtain L ∞ -rates using Gaussian kernel mixtures prior for analytic true densities.De Jonge and van Zanten [9] used finite random series based on tensor products of B-splines to construct a prior for nonparametric regression and derived adaptive L 2 -contraction rate for the regression function in the isotropic case. A BvM theorem for the posterior of σ is treated in [10]. Shen and Ghosal [28,29] used tensor products of B-splines respectively for Bayesian multivariate density estimation and high dimensional density regression in the anisotropic case.Nonparametric confidence bands for an unknown function were considered by [30, 1] and more recently by [6,13,5]. A Bayesian approaches the problem by constructing a credible set with a prescribed posterior probability. It is then natural to ask if the credible set has adequate frequentist coverage for large sample sizes. F...