2016
DOI: 10.1017/s0950268815003234

Variable selection and regression analysis for the prediction of mortality rates associated with foodborne diseases

Abstract: The purpose of this study was to apply a novel statistical method for variable selection and a model-based approach for filling data gaps in mortality rates associated with foodborne diseases using the WHO Vital Registration mortality dataset. Correlation analysis and elastic net regularization methods were applied to drop redundant variables and to select the most meaningful subset of predictors. Whenever predictor data were missing, multiple imputation was used to fill in plausible values. Cluster analysis w…
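
The two-stage selection described in the abstract (a correlation filter to drop redundant variables, then elastic net regularization to pick a predictor subset) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the 0.9 correlation cutoff and the `lam` and `alpha` values are arbitrary illustrative choices, the coordinate-descent solver is a generic textbook one, and the multiple imputation step is omitted because the synthetic data have no missing values.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 200                 # number of synthetic observations (assumption)

def standardize(col):
    """Center a column and scale it to unit (population) variance."""
    mean = sum(col) / len(col)
    sd = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5
    return [(v - mean) / sd for v in col]

def corr(a, b):
    """Pearson correlation of two already-standardized columns."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

def soft_threshold(z, g):
    """Shrink z toward zero by g; values within [-g, g] become exactly 0."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def elastic_net(cols, y, lam, alpha, n_iter=100):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (alpha*||b||_1 + (1 - alpha)/2 * ||b||_2^2),
    assuming standardized columns and a centered response."""
    m = len(y)
    beta = [0.0] * len(cols)
    resid = list(y)
    for _ in range(n_iter):
        for j, xj in enumerate(cols):
            # Correlation of x_j with the partial residual (x_j has unit variance).
            rho = sum(a * r for a, r in zip(xj, resid)) / m + beta[j]
            new = soft_threshold(rho, lam * alpha) / (1.0 + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * a for r, a in zip(resid, xj)]
                beta[j] = new
    return beta

# Hypothetical predictors: x0 and x1 drive the response, x2 nearly duplicates
# x0 (redundant), and x3 and x4 are pure noise.
x0 = [rng.gauss(0, 1) for _ in range(n)]
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [v + 0.05 * rng.gauss(0, 1) for v in x0]
x3 = [rng.gauss(0, 1) for _ in range(n)]
x4 = [rng.gauss(0, 1) for _ in range(n)]
y_raw = [3 * a - 2 * b + 0.5 * rng.gauss(0, 1) for a, b in zip(x0, x1)]
y_mean = sum(y_raw) / n
y = [v - y_mean for v in y_raw]
cols = [standardize(c) for c in (x0, x1, x2, x3, x4)]

# Stage 1: keep a column only if it is not highly correlated with one already kept.
kept = []
for j in range(len(cols)):
    if all(abs(corr(cols[j], cols[k])) <= 0.9 for k in kept):
        kept.append(j)

# Stage 2: elastic net on the surviving columns; nonzero coefficients are "selected".
beta = elastic_net([cols[j] for j in kept], y, lam=0.3, alpha=0.9)
selected = [j for j, b in zip(kept, beta) if b != 0.0]

print("kept after correlation filter:", kept)
print("selected by elastic net:", selected)
```

In practice the penalty strength and mixing parameter would normally be chosen by cross-validation rather than fixed by hand, and an established library solver would replace the hand-written loop.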

Cited by 8 publications (7 citation statements)
References: 34 publications

“…An important consideration when utilizing large gene matrices in predictive modeling is identifying and selecting the smallest possible set of relevant genes that can help achieve good predictive performance, without model overfitting or including features that are irrelevant or redundant to the prediction process (Guyon et al, 2003). In the presence of such data, it is important to employ a statistical approach to select meaningful subsets of predictors for samples with complete data, similar to the approach used by Amene et al (2016) to predict mortality rates associated with foodborne diseases. Elastic Net has previously proven to be effective (with an accuracy of >90%) in identifying genetic features of interest related to lung cancer (Hughey and Butte, 2015).…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…The use of variable selection techniques can lead to more accurate predictions, reduce the computational cost of creating the model, and improve the parsimony of the model by eliminating redundant and irrelevant variables. For example, variable selection techniques have been used to build models pertaining to identifying exposure-outcome associations [1] as well as predicting mortality rates [2,3], psychological strain in teachers [4], and nomophobia [5].…”
Section: A Tutorial On Supervised Machine Learning Variable Selection...; citation type: mentioning
confidence: 99%