We consider the problem of selecting grouped variables (factors) for accurate prediction in regression. Such problems arise naturally in many practical settings, with the multifactor analysis-of-variance problem as the most important and well-known example. The lasso, the LARS algorithm and the non-negative garrotte are recently proposed regression methods that can be used to select individual variables. Instead of selecting factors by stepwise backward elimination, we focus on the accuracy of estimation and consider extensions of these three methods to factor selection. We propose efficient algorithms for the extensions and show that they give superior performance to traditional stepwise backward elimination in factor selection problems. We also study the similarities and differences between the methods. Simulations and real examples are used to illustrate the methods.
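As a concrete illustration of factor-wise selection, the sketch below minimizes the group-lasso criterion $\frac{1}{2}\lVert y - X\beta\rVert^2 + \lambda \sum_g \sqrt{p_g}\,\lVert\beta_g\rVert_2$ with a generic proximal-gradient (blockwise soft-thresholding) solver. The solver choice, variable names and toy data are illustrative assumptions; the paper develops its own efficient algorithms rather than this one.

```python
# Minimal group-lasso sketch via proximal gradient descent (ISTA).
# Illustrates how the group penalty zeroes out whole factors at once;
# this is NOT the paper's algorithm, just a generic solver for the
# same criterion. X, y, groups and lam are illustrative assumptions.
import numpy as np

def group_soft_threshold(z, t):
    """Shrink the whole block z toward zero; kill it if ||z|| <= t."""
    nrm = np.linalg.norm(z)
    return np.zeros_like(z) if nrm <= t else (1.0 - t / nrm) * z

def group_lasso(X, y, groups, lam, n_iter=500):
    """Minimize 0.5*||y - X b||^2 + lam * sum_g sqrt(p_g) * ||b_g||_2."""
    n, p = X.shape
    b = np.zeros(p)
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1/L, L = Lipschitz constant
    for _ in range(n_iter):
        z = b - step * (X.T @ (X @ b - y))   # gradient step
        for g in groups:                     # g: array of column indices
            b[g] = group_soft_threshold(z[g], step * lam * np.sqrt(len(g)))
    return b

# Toy example with two 2-variable factors, only the first one active.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 4))
y = X[:, :2] @ np.array([2.0, -1.5]) + 0.1 * rng.standard_normal(100)
b_hat = group_lasso(X, y, groups=[np.arange(0, 2), np.arange(2, 4)], lam=5.0)
print(b_hat)   # the second block should be exactly zero
```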
We propose penalized likelihood methods for estimating the concentration matrix in the Gaussian graphical model. The methods lead to a sparse shrinkage estimator of the concentration matrix that is positive definite, and thus perform model selection and estimation simultaneously. Implementation is nontrivial because of the positive-definiteness constraint on the concentration matrix, but we show that the computation can be done effectively by exploiting the efficient maxdet algorithm developed in convex optimization. We propose a BIC-type criterion for selecting the tuning parameter in the penalized likelihood methods. The connection between our methods and existing methods is illustrated. Simulations and real examples demonstrate the competitive performance of the new methods.
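The sketch below illustrates the workflow described above: fit an $\ell_1$-penalized likelihood estimate of the concentration matrix over a grid of penalties and select the penalty with a BIC-type score. scikit-learn's coordinate-descent graphical_lasso is used here as a stand-in for the paper's maxdet solver, and the exact BIC formula below is an assumption in the spirit of the criterion the abstract describes.

```python
# Penalized-likelihood precision-matrix estimation with a BIC-type
# tuning criterion. The solver (sklearn's coordinate descent) and the
# precise BIC form are stand-in assumptions, not the paper's method.
import numpy as np
from sklearn.covariance import graphical_lasso

def bic_score(S, Omega, n):
    """BIC-type criterion: -log det(Omega) + tr(S Omega)
    + (log n / n) * #{nonzero upper-triangular entries}."""
    _, logdet = np.linalg.slogdet(Omega)
    df = np.count_nonzero(np.triu(Omega))
    return -logdet + np.trace(S @ Omega) + (np.log(n) / n) * df

rng = np.random.default_rng(1)
n, p = 200, 5
X = rng.standard_normal((n, p))
S = np.cov(X, rowvar=False, bias=True)        # empirical covariance

# graphical_lasso returns (covariance, precision); score each precision.
best = min(
    (bic_score(S, graphical_lasso(S, alpha=a)[1], n), a)
    for a in [0.01, 0.05, 0.1, 0.2, 0.5]
)
print("selected penalty:", best[1])
```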
We propose a new method for model selection and model fitting in multivariate nonparametric regression, in the framework of smoothing spline ANOVA. The COSSO is a regularization method whose penalty functional is the sum of component norms, instead of the squared norm employed in the traditional smoothing spline method. The COSSO provides a unified framework for several recent proposals for model selection in linear models and smoothing spline ANOVA models. Theoretical properties, such as the existence and the rate of convergence of the COSSO estimator, are studied. In the special case of a tensor product design with periodic functions, a detailed analysis reveals that the COSSO performs model selection by applying a novel soft-thresholding-type operation to the function components. We give an equivalent formulation of the COSSO estimator that leads naturally to an iterative algorithm. We compare the COSSO with MARS, a popular method for building functional ANOVA models, in simulations and real examples. The COSSO can be extended to classification problems, and we compare its performance with that of a number of machine learning algorithms on real data sets. The COSSO gives very competitive performance in these studies.

In Section 3 we give an equivalent formulation of the COSSO that is more suitable for computation. In Section 4 we consider the special case of a tensor product design with periodic functions. A detailed analysis in this special case sheds light on the mechanism of the COSSO in terms of component selection in SS-ANOVA; in particular, we show that in this case the COSSO performs model selection by applying a novel soft-thresholding-type operation to the function components. In Section 5 we present a COSSO algorithm that iterates between the smoothing spline method and the non-negative garrote [1]. In Section 6 we consider the choice of the tuning parameter. Simulations are given in Section 7, where we compare the COSSO with the MARS procedure developed by Friedman [8], a popular algorithm that builds functional ANOVA models. The COSSO extends naturally to classification tasks, and we also compare its performance with that of many machine learning methods on some benchmark data sets. These real examples are given in Section 8, and Section 9 contains a discussion. The proofs are given in the Appendix.

2. The COSSO in smoothing spline ANOVA.

2.1. The smoothing spline ANOVA. In the commonly used smoothing spline ANOVA model over $\mathcal{X} = [0,1]^d$, it is assumed that $f \in \mathcal{F}$, where $\mathcal{F}$ is a reproducing kernel Hilbert space (RKHS) corresponding to the decomposition (1). Let $\bar{\mathcal{H}}_j$ be a space of functions of $x^{(j)}$ over $[0,1]$, and let $\mathcal{H}_j = \{1\} \oplus \bar{\mathcal{H}}_j$. Then the tensor product space of the $\mathcal{H}_j$'s is $\bigotimes_{j=1}^{d} \mathcal{H}_j$.
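To make the sum-of-component-norms penalty concrete in this RKHS setup, the COSSO estimator can be written in the following form; the component projections $P^{\alpha}$, the number of components $p$ and the tuning parameter $\tau$ are standard SS-ANOVA notation, and this should be read as a sketch of the criterion the abstract describes rather than a verbatim quotation of the paper's equations:

$$
\hat{f} \;=\; \operatorname*{arg\,min}_{f \in \mathcal{F}} \;\frac{1}{n}\sum_{i=1}^{n}\bigl\{y_i - f(x_i)\bigr\}^2 \;+\; \tau^2 \sum_{\alpha=1}^{p} \bigl\lVert P^{\alpha} f \bigr\rVert,
$$

where $P^{\alpha} f$ is the orthogonal projection of $f$ onto the $\alpha$-th component subspace and $\lVert\cdot\rVert$ is the RKHS norm. Replacing each norm by its square recovers the traditional smoothing spline penalty; keeping the norms unsquared is what allows entire components to be shrunk exactly to zero.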