2021
DOI: 10.1080/00273171.2021.1891856

Large-Scale Survey Data Analysis with Penalized Regression: A Monte Carlo Simulation on Missing Categorical Predictors

Citations: cited by 7 publications (19 citation statements)
References: 71 publications
“…Beyond producing interpretable models, the Enet and Mnet models of this study were comparable to RF models in terms of prediction. Likewise, multiple studies across diverse disciplines reported that linear models are comparable to RF (e.g., [53]- [55]) or even better than RF (e.g., [19], [20], [56]). These studies with ours have in common that the variables were pre-selected based on previous research.…”
Section: A Regularization and Learning Analytics
Citation type: mentioning
confidence: 98%
“…Regularization produces biased estimates, and significance testing is performed on unbiased estimates. Special techniques such as post-selection inference (e.g., [43]) are required to perform statistical testing after regularization, but currently only available with LASSO [19], [20]. Instead of statistical testing, we iterated data splitting and prediction modeling, and obtained selection counts as the criterion for variable importance; variables selected more often bear more importance than variables selected less often [19], [20].…”
Section: Cross-validation (CV)
Citation type: mentioning
confidence: 99%
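
The selection-count idea in the statement above (repeat the data split, refit a penalized model, and count how often each variable is kept) can be illustrated with a minimal sketch. This is not the cited authors' code. It assumes a plain lasso fitted by coordinate descent, a fixed penalty `lam` instead of a cross-validated one, no intercept, and synthetic data; the function names `lasso_cd`, `selection_counts`, and `make_toy_data` are hypothetical, and the held-out part of each split is left unused because only the counting step is shown.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=50):
    """Lasso by cyclic coordinate descent, no intercept (assumption).

    Minimizes (1 / 2n) * ||y - X b||^2 + lam * ||b||_1.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b; equals y while b is all zeros
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (b_j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new_bj
    return beta


def selection_counts(X, y, lam, n_splits=20, train_frac=0.7, seed=0):
    """Count, over repeated random splits, how often each coefficient is nonzero."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    n_train = int(train_frac * n)
    counts = [0] * p
    for _ in range(n_splits):
        idx = rng.sample(range(n), n_train)  # training rows for this split
        beta = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
    return counts


def make_toy_data(n=60, p=5, seed=1):
    """Synthetic data (assumption): only the first two predictors carry signal."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [2.0 * r[0] - 1.5 * r[1] + rng.gauss(0, 0.3) for r in X]
    return X, y
```

Calling `selection_counts(*make_toy_data(), lam=0.5)` returns one count per predictor; predictors with higher counts are the ones the penalized model retains more consistently across splits.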
“…To this end, based on measurements collected by the sensor network of a photovoltaic production plant, the paper proposes Monte Carlo (MC) simulation as the pre-processing stage to deal with outliers before applying PCA [4,5]. In this respect, the proposed approach is shown to be a valid alternative to relying on the classical Interquartile Range (IQR) method in order to omit outliers when applying PCA for anomaly detection purposes.…”
Section: Introduction
Citation type: mentioning
confidence: 99%