In this paper, we study the statistical behaviour of the Exponentially Weighted Aggregate (EWA) in the problem of high-dimensional regression with fixed design. Under the assumption that the underlying regression vector is sparse, it is reasonable to use the Laplace distribution as a prior. The resulting estimator and, specifically, a particular instance of it referred to as the Bayesian lasso, has already been used in the statistical literature because of its computational convenience, even though no thorough mathematical analysis of its statistical properties had been carried out. The present work fills this gap by establishing sharp oracle inequalities for the EWA with the Laplace prior. These inequalities show that if the temperature parameter is small, the EWA with the Laplace prior satisfies the same type of oracle inequality as the lasso estimator does, as long as the quality of estimation is measured by the prediction loss. Extensions of the proposed methodology to the problem of prediction with low-rank matrices are considered.
MSC 2010 subject classifications: Primary 62J05; secondary 62H12.
In this paper we revisit the risk bounds of the lasso estimator in the context of transductive and semi-supervised learning. In other words, the setting under consideration is that of regression with random design under partial labeling. The main goal is to obtain user-friendly bounds on the off-sample prediction risk. To this end, the simple setting of a bounded response variable and bounded (high-dimensional) covariates is considered. We propose some new adaptations of the lasso to these settings and establish oracle inequalities both in expectation and in deviation. These results provide non-asymptotic upper bounds on the risk that highlight the interplay between the bias due to the mis-specification of the linear model, the bias due to the approximate sparsity, and the variance. They also demonstrate that the presence of a large number of unlabeled features can have a significant positive impact in situations where the restricted eigenvalue of the design matrix vanishes or is very small.
MSC 2010 subject classifications: Primary 62H30; secondary 62G08.
A high-throughput phenotyping tool for seed germination, the ScreenSeed technology, was developed with the aim of screening genotype responsiveness and chemical drugs. In this study, the technology was applied to Arabidopsis thaliana seeds to characterize the germination behavior of seed samples by incubating seeds in 96-well microplates under defined conditions and detecting radicle protrusion through the seed coat by automated image analysis. The study shows that this technology provides a fast procedure that can handle thousands of seeds without compromising the repeatability or accuracy of the germination measurements. Potential biases of the experimental protocol were assessed through statistical analyses of germination kinetics. Comparison of the ScreenSeed procedure with commonly used germination tests based on visual scoring showed very similar germination kinetics.