Kernel regression is widely used in biology and economics because it adapts to complex relationships better than linear regression while remaining more interpretable than many deep learning methods. In high-dimensional settings, l1-norm penalization is a common tool for variable selection, a popularity largely owed to the strong performance of the lasso algorithm. Although it seems natural to generalize from consistency in variable selection to consistency in kernel selection, several details require care, e.g. the lower eigenvalue condition. We give the consistency condition for kernel selection in l1-norm regularized linear kernel regression, including a bound on the prediction error. In a simulation study, consistency is carefully checked across different levels of λn and different feature dimensions. Finally, the kernel selection method is applied to high-risk area exploration for Covid-19, using a dataset provided by the US Centers for Disease Control and Prevention (CDC). Declarations of interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
With the prolonged worldwide outbreak of Covid-19, identifying high-risk areas has become an active research topic. In this paper, we propose a novel regression method, Regular Linear Kernel Regression (RLKR), for exploring Covid-19 high-risk areas. We explain in detail how the canonical linear kernel regression method is linked to the identification of high-risk areas for Covid-19. Furthermore, the consistency condition for kernel selection, which is closely related to the identification of high-risk areas, is derived under two mild assumptions. Finally, the RLKR method is verified in simulation experiments and applied to Covid-19 high-risk area exploration.
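The abstracts above describe kernel selection via l1-norm penalized linear kernel regression, where a lasso-type penalty zeroes out the coefficients of irrelevant kernels. The exact RLKR formulation is not given here, so the following is only an illustrative sketch under assumed details: candidate "kernels" are Gaussian bumps at hypothetical landmark points, and the l1-penalized problem is solved with a generic proximal-gradient (ISTA) loop rather than the authors' method. Kernels with nonzero estimated coefficients are the ones "selected".

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the l1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=2000):
    """Minimize (1/2n)||y - X b||^2 + lam * ||b||_1 by proximal gradient (ISTA)."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n  # Lipschitz constant of the smooth part
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = soft_threshold(beta - grad / L, lam / L)
    return beta

rng = np.random.default_rng(0)
n, M = 300, 8
x = rng.uniform(-3, 3, n)
landmarks = np.linspace(-3, 3, M)  # hypothetical landmark points

# Candidate kernel features: one Gaussian bump per landmark.
Phi = np.exp(-(x[:, None] - landmarks[None, :]) ** 2)

# Ground truth uses only kernels 2 and 5; the rest are irrelevant.
beta_true = np.zeros(M)
beta_true[2], beta_true[5] = 1.5, -2.0
y = Phi @ beta_true + 0.05 * rng.standard_normal(n)

beta_hat = lasso_ista(Phi, y, lam=0.01)
selected = np.flatnonzero(np.abs(beta_hat) > 0.3)  # kernels kept by the l1 penalty
print(selected)
```

Under these simulated settings the support of `beta_hat` should concentrate on the two truly active kernels; how reliably this happens as n, M, and λn vary is exactly the consistency question the paper studies (including the lower eigenvalue condition on the kernel-feature design).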
In classification problems, abnormal observations occur frequently, and obtaining a stable model in the presence of outliers is a long-standing concern. In this article, we draw on the ideas of the AdaBoost algorithm and propose an asymptotically linear loss function, which makes the output function more stable on contaminated samples; two boosting algorithms, based on two different update schemes, are designed to handle outliers. In addition, a technique for overcoming the instability of Newton's method under weak convexity is introduced. Several examples with artificially added outliers show that the Discrete L-AdaBoost and Real L-AdaBoost algorithms recover the boundary of each class consistently when the data are contaminated. On real-world datasets, we demonstrate the effectiveness of the proposed algorithms by comparison with two other ensemble learning methods, especially on large datasets.
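The key idea in the abstract above is that AdaBoost's exponential loss blows up on badly misclassified points (outliers), while an asymptotically linear loss caps their influence. The paper's exact L-AdaBoost loss is not stated in the abstract, so the sketch below uses the logistic loss as a hypothetical stand-in: it is also asymptotically linear in the margin, which is the property being illustrated, not the authors' specific function.

```python
import numpy as np

def exp_loss(m):
    """AdaBoost's exponential loss: exp(-m) grows exponentially as margin m -> -inf,
    so a single outlier can dominate the sample weights."""
    return np.exp(-m)

def asymptotically_linear_loss(m):
    """Logistic loss log(1 + exp(-m)) as a stand-in: for m -> -inf it behaves
    like -m (linear), so outlier influence grows only linearly."""
    return np.log1p(np.exp(-m))

# Margins for a grossly misclassified outlier (-10) and ordinary points.
margins = np.array([-10.0, -1.0, 0.0, 1.0])
print(exp_loss(margins))                   # outlier contributes ~2.2e4
print(asymptotically_linear_loss(margins)) # outlier contributes only ~10
```

Because boosting reweights samples in proportion to the loss gradient, the bounded derivative of an asymptotically linear loss keeps outlier weights from exploding across rounds, which is the stability behavior the Discrete/Real L-AdaBoost experiments examine.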