2020
DOI: 10.3390/e22050543

Prediction and Variable Selection in High-Dimensional Misspecified Binary Classification

Abstract: In this paper, we consider prediction and variable selection in misspecified binary classification models under the high-dimensional scenario. We focus on two approaches to classification, which are computationally efficient, but lead to model misspecification. The first one is to apply penalized logistic regression to the classification data, which possibly do not follow the logistic model. The second method is even more radical: we just treat class labels of objects as if they were numbers and apply penaliz…
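
The two approaches named in the abstract can be illustrated with a small sketch. It is not the authors' implementation. The abstract is truncated, so the second method is assumed here to be l1-penalized least squares on centered 0/1 labels. The solver (proximal gradient descent), the probit-style data-generating mechanism, and the values of `lam`, `step`, `n` and `p` are all illustrative assumptions.

```python
import math
import random


def soft_threshold(z, t):
    """Proximal operator of t * |z| (the l1 shrinkage step)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_l1(X, y, loss, lam, step=0.25, n_iter=200):
    """Proximal gradient descent for (1/n) * sum(loss) + lam * ||beta||_1.

    loss="logistic": logistic loss with labels in {0, 1}, no intercept.
    loss="squared":  0.5 * (y - x.beta)^2, labels treated as plain numbers.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        for xi, yi in zip(X, y):
            score = sum(b * v for b, v in zip(beta, xi))
            if loss == "logistic":
                resid = 1.0 / (1.0 + math.exp(-score)) - yi  # sigmoid(score) - y
            else:
                resid = score - yi
            for j in range(p):
                grad[j] += resid * xi[j] / n
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return beta


# Assumed data: labels follow a probit-type rule, so the logistic model is
# misspecified. Only features 0 and 1 are relevant.
rng = random.Random(0)
n, p = 200, 20
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [1.0 if x[0] - x[1] + rng.gauss(0.0, 1.0) > 0 else 0.0 for x in X]

# Approach 1: l1-penalized logistic regression.
beta_logit = fit_l1(X, y, "logistic", lam=0.1)

# Approach 2 (assumed form): l1-penalized least squares on centered labels.
y_mean = sum(y) / n
beta_ls = fit_l1(X, [yi - y_mean for yi in y], "squared", lam=0.1)

support_logit = [j for j, b in enumerate(beta_logit) if b != 0.0]
support_ls = [j for j, b in enumerate(beta_ls) if b != 0.0]
print("selected by penalized logistic regression:", support_logit)
print("selected by penalized least squares:", support_ls)
```

In both fits the selected variables are the nonzero coordinates of the estimate, and a new observation would be classified by the sign of its linear score. Neither working model matches the assumed data-generating rule, which is the kind of misspecification the abstract refers to.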

Cited by 5 publications (3 citation statements)
References: 35 publications
“…However, the recent availability of more informative databases obtained from EHR opens up new research opportunities. These current databases containing records of hundreds of thousands of appointments allow the use of modern predictive techniques such as deep neural networks or novel binary classification algorithms for high-dimensional settings, such as [65, 66]. A second research line consists of developing and incorporating strategies that reduce the negative effects of class imbalance.…”
Section: Discussion
mentioning
confidence: 99%
“…Ref. [4], similarly to [2], deals with the classification problem of a binary variable under misspecification. It focuses on establishing a general upper bound of excess risk, i.e., the difference between the risk of the linear classifier βᵀx, obtained as a minimizer of the penalized empirical risk pertaining to convex function φ, and the Bayes risk in such a case (Theorem 1).…”
mentioning
confidence: 99%
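
The quantity described in the statement above can be written out as follows. This is a sketch under stated assumptions, not the notation of the cited works: labels are assumed to be coded as y ∈ {−1, 1}, φ is assumed to act on the margin y βᵀx, the penalty is assumed to be an l1 norm with tuning parameter λ, and the risk is assumed to be the misclassification probability.

```latex
% Penalized empirical risk minimizer for a convex loss \varphi
% (assumed: margin-based loss, l1 penalty, labels y_i \in \{-1, 1\})
\hat{\beta} \in \arg\min_{\beta \in \mathbb{R}^p}
  \frac{1}{n} \sum_{i=1}^{n} \varphi\left(y_i \beta^T x_i\right)
  + \lambda \lVert \beta \rVert_1

% Risk of the linear classifier x \mapsto \operatorname{sign}(\beta^T x)
% (assumed: misclassification probability)
R(\beta) = P\left(Y \beta^T X < 0\right)

% Bayes risk: the smallest risk over all measurable classifiers f
R^{*} = \inf_{f} P\left(Y f(X) < 0\right)

% Excess risk of the estimated classifier
R(\hat{\beta}) - R^{*}
```

Under these assumptions, an upper bound on the excess risk limits how much worse the estimated linear classifier can be than the best possible classifier, even when the working model behind φ is wrong.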
“…
• Rendall et al [26] - extensive comparison of large scale data driven prediction methods based on VS and machine learning;
• Marcjasz et al [27] - to electricity price forecasting;
• Santi et al [28] - to predict mathematics scores of students;
• Karim et al [29] - to predict post-operative outcomes of cardiac surgery patients;
• Kim and Kang [30] - to faulty wafer detection in semiconductor manufacturing;
• Furmańczyk and Rejchel [31] - to high-dimensional binary classification problems;
• Fouad and Loáiciga [5] - to predict percentile flows using inflow duration curve and regression models;
• Ata Tutkun and Kayhan Atilgan [32] - investigated VS models in Cox regression, a multivariate model;
• Mehmood et al [33] - compared several VS approaches in partial least-squares regression tasks;…”
mentioning
confidence: 99%