Motivated by an entropy inequality, this article proposes, for the first time, a penalized profile likelihood method that simultaneously selects significant variables and estimates unknown coefficients in multiple linear regression models. The new method is robust to outliers and to heavy-tailed errors, and works well even for errors with infinite variance. Our proposed approach outperforms the adaptive lasso in both theory and practice. The simulation studies show that (i) the new approach has a higher probability of selecting the exact model than the least absolute deviation lasso and the adaptively penalized composite quantile regression approach, and (ii) exact model selection via our proposed approach is robust regardless of the error distribution. An application to a real dataset is also provided.
Decision trees have attracted much attention during the past decades. Previous decision trees include axis-parallel and oblique decision trees; both try to find the best splits via exhaustive search or heuristic algorithms in each iteration. Oblique decision trees generally simplify the tree structure and achieve better performance, but they come with a higher computational cost and are typically initialized with the best axis-parallel splits. This work presents the Weighted Oblique Decision Tree (WODT), based on continuous optimization with random initialization. We assign each instance different weights for the child nodes at every internal node, and then obtain a split by optimizing a continuous and differentiable objective function, the weighted information entropy. Extensive experiments show the effectiveness of the proposed algorithm.
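The idea of a differentiable, weighted-entropy split can be illustrated with a minimal sketch. It is one plausible reading of the abstract, not the authors' implementation: the sigmoid soft assignment, the function names, and the exact form of the objective (child mass times entropy of the weighted class shares, summed over both children) are all assumptions. The optimizer and the random initialization are left out; only the objective is evaluated.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def entropy(class_mass):
    # Shannon entropy (base 2) of the class shares implied by the masses.
    total = sum(class_mass)
    if total <= 0.0:
        return 0.0
    h = 0.0
    for m in class_mass:
        p = m / total
        if p > 0.0:
            h -= p * math.log2(p)
    return h


def weighted_entropy_objective(theta, bias, X, y, n_classes):
    # ASSUMPTION: each instance goes to the left child with weight
    # sigmoid(theta . x + bias) and to the right child with the remainder.
    left = [0.0] * n_classes
    right = [0.0] * n_classes
    for features, label in zip(X, y):
        z = sum(t * v for t, v in zip(theta, features)) + bias
        w = sigmoid(z)
        left[label] += w
        right[label] += 1.0 - w
    # Child mass times child entropy, summed over the two children.
    return sum(left) * entropy(left) + sum(right) * entropy(right)
```

Under these assumptions, a hyperplane that assigns every instance half-and-half leaves the objective at its uninformative value, while a steep hyperplane that separates the classes drives it toward zero. Because the objective is smooth in `theta` and `bias`, any gradient-based optimizer could be started from a random point.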
This document is the author's final accepted version of the journal article. There may be differences between this version and the published version. Abstract: For discrete panel data, the dynamic relationship between successive observations is often of interest. We consider a dynamic probit model for short panel data. A problem with estimating the dynamic parameter of interest is that the model contains a large number of nuisance parameters, one for each individual. Heckman proposed to use maximum likelihood estimation of the dynamic parameter, which, however, does not perform well if the individual effects are large. We suggest new estimators for the dynamic parameter, based on the assumption that the individual parameters are random and possibly large. Theoretical properties of our estimators are derived, and a simulation study shows they have some advantages compared to Heckman's estimator and the modified profile likelihood (MPL) estimator for fixed effects. subject to (1), where I(·) denotes the indicator function, {x_it} are k × 1 covariate vectors, τ_i is an unknown intercept representing the i-th individual effect, and the autoregressive coefficient γ and the regressive coefficient β are unknown parameters which are assumed to be the same for all individuals. In (1), only the d_it and x_it are observable.
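To make the roles of the symbols concrete, here is a minimal simulation sketch. Equation (1) itself is not reproduced in the text above, so the form used here is an assumption: the textbook dynamic probit model, d_it = I(τ_i + γ d_{i,t-1} + x_it'β + ε_it > 0) with standard normal ε_it. The function name, the initial values `d0`, and the seed are hypothetical.

```python
import random


def simulate_dynamic_probit(gamma, beta, tau, x, d0, seed=0):
    """Simulate binary panel data under an ASSUMED dynamic probit form.

    tau[i]  : individual effect for individual i (nuisance parameter)
    x[i][t] : covariate vector (length k) for individual i at time t
    d0[i]   : initial state d_{i,0} (hypothetical; not specified in the text)
    """
    rng = random.Random(seed)
    panel = []
    for i in range(len(tau)):
        prev = d0[i]
        row = []
        for covariates in x[i]:
            # Linear index: individual effect + state dependence + covariates.
            index = tau[i] + gamma * prev
            index += sum(b * v for b, v in zip(beta, covariates))
            # ASSUMPTION: standard normal errors, as in a probit model.
            prev = 1 if index + rng.gauss(0.0, 1.0) > 0 else 0
            row.append(prev)
        panel.append(row)
    return panel
```

With n individuals and T periods, the sketch makes the incidental parameter problem visible: there are n values `tau[i]` but only T observations per individual, so each individual effect is informed by a short series.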
The goal is often to estimate γ and β while the τ_i are treated as nuisance parameters. As with most panel data, the number of individuals n is large while the length of the observed time period T is small. Therefore, the asymptotic approximations are often derived with n → ∞ and T fixed. Model (1) ... Chamberlain (1980, 1985), Honore and Kyriazidou (2000), and Lancaster (2002) considered models with logistically distributed ε_it. They proposed a consistent estimator of γ and derived its convergence rate. Bartolucci and Farcomeni (2009) and Bartolucci and Nigro (2010) considered some extended versions of dynamic logit models with heterogeneity beyond that reflected by the covariates in the models. A standard method for dealing with incidental parameter problems is to use a conditional likelihood, which eliminates the incidental parameters by conditioning on sufficient statistics for those parameters; see, e.g., Chamberlain (1980), Bartolucci and Nigro (2010), and also Lancaster (2000). An attractive alternative is to treat individual effects τ_i as random e...