We consider the problem of recovering a high-dimensional vector $\mu$ observed
in white noise, where the unknown vector $\mu$ is assumed to be sparse. The
objective of the paper is to develop a Bayesian formalism which gives rise to a
family of $l_0$-type penalties. The penalties are associated with various
choices of the prior distributions $\pi_n(\cdot)$ on the number of nonzero
entries of $\mu$ and, hence, are easy to interpret. The resulting Bayesian
estimators lead to a general thresholding rule which accommodates many of the
known thresholding and model selection procedures as particular cases
corresponding to specific choices of $\pi_n(\cdot)$. Furthermore, they achieve
optimality in a rather general setting under very mild conditions on the prior.
We also specify the class of priors $\pi_n(\cdot)$ for which the resulting
estimator is adaptively optimal (in the minimax sense) for a wide range of
sparse sequences and consider several examples of such priors.
Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics; DOI: http://dx.doi.org/10.1214/009053607000000226.
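To make the general thresholding rule concrete, here is a minimal sketch assuming a generic complexity penalty pen(k) on the number of nonzero entries: keep the $\hat{k}$ largest observations in absolute value, where $\hat{k}$ minimizes the residual sum of squares plus pen(k). The specific $2\sigma^2 k\log(n/k)$-type penalty below and all function names are illustrative assumptions; in the paper the penalty is derived from the prior $\pi_n(\cdot)$.

```python
import numpy as np

def l0_penalized_threshold(y, pen):
    """Keep the k-hat largest |y_i|, where k-hat minimizes RSS(k) + pen(k)."""
    n = len(y)
    order = np.argsort(-np.abs(y))            # indices sorted by |y_i|, descending
    sq = y[order] ** 2
    # RSS of keeping the top-k coefficients = sum of the discarded squares
    rss = np.concatenate(([sq.sum()], sq.sum() - np.cumsum(sq)))
    k_hat = int(np.argmin(rss + np.array([pen(k) for k in range(n + 1)])))
    mu_hat = np.zeros(n)
    mu_hat[order[:k_hat]] = y[order[:k_hat]]  # hard-threshold the rest to zero
    return mu_hat

# Illustrative 2*sigma^2*k*log(n/k)-type penalty (an assumed choice, not the
# paper's exact expression, which is derived from the prior pi_n(.)):
sigma, n = 1.0, 200
pen = lambda k: 2 * sigma**2 * k * np.log(n / max(k, 1))
rng = np.random.default_rng(0)
mu = np.zeros(n); mu[:5] = 5.0
y = mu + sigma * rng.standard_normal(n)
mu_hat = l0_penalized_threshold(y, pen)
```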
We consider a Bayesian approach to model selection in Gaussian linear regression, where the number of predictors may be much larger than the number of observations. From a frequentist view, the proposed procedure results in penalized least squares estimation with a complexity penalty associated with a prior on the model size. We investigate the optimality properties of the resulting model selector. We establish an oracle inequality and specify conditions on the prior that imply its asymptotic minimaxity within a wide range of sparse and dense settings for "nearly orthogonal" and "multicollinear" designs.
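From the frequentist angle, a minimal brute-force sketch of this criterion, assuming a generic complexity penalty pen(k) on the model size; exhaustive search is exponential in the number of predictors, so this is only a conceptual illustration, and the $k\log(p/k)$-type penalty in the comment is an assumed form rather than the paper's exact one.

```python
import numpy as np
from itertools import combinations

def penalized_ls_select(X, y, pen):
    """Exhaustive model selection: minimize RSS(M) + pen(|M|) over subsets M."""
    n, p = X.shape
    best_crit, best_M = np.sum(y ** 2) + pen(0), ()   # null model
    for k in range(1, p + 1):
        for M in combinations(range(p), k):
            XM = X[:, list(M)]
            coef, *_ = np.linalg.lstsq(XM, y, rcond=None)
            crit = np.sum((y - XM @ coef) ** 2) + pen(k)
            if crit < best_crit:
                best_crit, best_M = crit, M
    return best_M

# Assumed illustrative penalty: pen = lambda k: c * k * np.log(p / max(k, 1))
```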
We consider high-dimensional binary classification by sparse logistic regression. We propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive non-asymptotic bounds for its misclassification excess risk. To assess their tightness we establish the corresponding minimax lower bounds. The bounds can be reduced under an additional low-noise condition. The proposed complexity penalty is remarkably related to the VC-dimension of a set of sparse linear classifiers. Implementation of any complexity-penalty-based criterion, however, requires a combinatorial search over all possible models. To obtain a model selection procedure that is computationally feasible for high-dimensional data, we extend the Slope estimator to logistic regression and show that, under an additional weighted restricted eigenvalue condition, it is rate-optimal in the minimax sense.
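To make the Slope extension concrete, here is a minimal proximal gradient sketch for sorted-$\ell_1$ (Slope) penalized logistic regression. The prox of the sorted-$\ell_1$ penalty is computed by a pool-adjacent-violators pass; the weight sequence $\lambda_j \propto \sqrt{\log(2p/j)}$ and all names are illustrative assumptions rather than the paper's exact prescription.

```python
import numpy as np

def prox_sorted_l1(v, lam):
    """Prox of the sorted-l1 penalty: argmin_x 0.5*||x - v||^2 + sum_j lam_j |x|_(j)."""
    sign, u = np.sign(v), np.abs(v)
    order = np.argsort(-u)
    w = u[order] - lam                        # lam must be non-increasing
    # Non-increasing isotonic regression of w via pool-adjacent-violators
    blocks = []                               # stack of (block average, block length)
    for wi in w:
        val, cnt = wi, 1
        while blocks and blocks[-1][0] <= val:
            pv, pc = blocks.pop()
            val, cnt = (val * cnt + pv * pc) / (cnt + pc), cnt + pc
        blocks.append((val, cnt))
    x, pos = np.zeros_like(w), 0
    for val, cnt in blocks:
        x[pos:pos + cnt] = max(val, 0.0)      # clip at zero
        pos += cnt
    out = np.zeros_like(w)
    out[order] = x
    return sign * out

def slope_logistic(X, y, lam, n_iter=500):
    """Proximal gradient for Slope-penalized logistic regression; y in {-1, +1}."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / (4 * n)   # Lipschitz constant of the gradient
    beta = np.zeros(p)
    for _ in range(n_iter):
        m = np.clip(y * (X @ beta), -30, 30)  # clip to avoid overflow in exp
        grad = -(X.T @ (y / (1 + np.exp(m)))) / n
        beta = prox_sorted_l1(beta - grad / L, lam / L)
    return beta

# Assumed weights lam_j = 0.5*sqrt(log(2p/j)), j = 1..p (illustrative scaling):
# lam = 0.5 * np.sqrt(np.log(2 * p / np.arange(1, p + 1)))
```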
We consider model selection in generalized linear models (GLM) for high-dimensional data and propose a wide class of model selection criteria based on penalized maximum likelihood with a complexity penalty on the model size. We derive a general nonasymptotic upper bound for the Kullback-Leibler risk of the resulting estimators and establish the corresponding minimax lower bounds for sparse GLM. For a properly chosen (nonlinear) penalty, the resulting penalized maximum likelihood estimator is shown to be asymptotically minimax and adaptive to the unknown sparsity. We also discuss possible extensions of the proposed approach to model selection in GLM under additional structural constraints and aggregation.
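As one concrete instance, a brute-force sketch for a Poisson GLM with log link, fitted by a few undamped Newton steps; the exhaustive subset search stands in for a practical strategy, and the choice of Poisson, the Newton fitter, and the $k\log(p/k)$-type penalty are all assumptions made for illustration.

```python
import numpy as np
from itertools import combinations

def poisson_nll(XM, y, n_newton=25):
    """Fit a Poisson GLM (log link) by Newton's method; return the minimized NLL."""
    beta = np.zeros(XM.shape[1])
    for _ in range(n_newton):
        mu = np.exp(XM @ beta)
        # Newton step: Hessian X'diag(mu)X, gradient X'(mu - y)
        beta -= np.linalg.solve(XM.T @ (mu[:, None] * XM), XM.T @ (mu - y))
    mu = np.exp(XM @ beta)
    return np.sum(mu - y * np.log(mu))

def glm_select(X, y, pen):
    """Select the subset M minimizing NLL(M) + pen(|M|) (exhaustive sketch)."""
    n, p = X.shape
    best_crit, best_M = n + pen(0), ()        # null model: beta = 0, mu = 1, NLL = n
    for k in range(1, p + 1):
        for M in combinations(range(p), k):
            crit = poisson_nll(X[:, list(M)], y) + pen(k)
            if crit < best_crit:
                best_crit, best_M = crit, M
    return best_M

# Assumed illustrative penalty: pen = lambda k: c * k * np.log(p / max(k, 1))
```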