Variable selection and regression analysis for the prediction of mortality rates associated with foodborne diseases

Amene, Ermias; Hanson, L. A.; Zahn, Elizabeth A; Wild, S. R.; Döpfer, Dörte

doi:10.1017/s0950268815003234

Cited by 8 publications

(7 citation statements)

References 34 publications

“…An important consideration when utilizing large gene matrices in predictive modeling is identifying and selecting the smallest possible set of relevant genes that can help achieve good predictive performance, without model overfitting or including features that are irrelevant or redundant to the prediction process (Guyon et al, 2003). In the presence of such data, it is important to employ a statistical approach to select meaningful subsets of predictors for samples with complete data, similar to the approach used by Amene et al (2016) to predict mortality rates associated with foodborne diseases. Elastic Net has previously proven to be effective (with an accuracy of >90%) in identifying genetic features of interest related to lung cancer (Hughey and Butte, 2015).…”

Section: Discussion (mentioning)

confidence: 99%

Machine learning to predict foodborne salmonellosis outbreaks based on genome characteristics and meteorological trends

Karanth, Patel, Shirmohammadi, et al. 2023

Current Research in Food Science

“…The use of variable selection techniques can lead to more accurate predictions, reduce the computational cost of creating the model, and improve the parsimony of the model by eliminating redundant and irrelevant variables. For example, variable selection techniques have been used to build models pertaining to identifying exposure-outcome associations [1] as well as predicting mortality rates [2,3], psychological strain in teachers [4], and nomophobia [5].…”

Section: A Tutorial On Supervised Machine Learning Variable Selection... (mentioning)

confidence: 99%

A Tutorial on Supervised Machine Learning Variable Selection Methods for the Social and Health Sciences in R

Bain, Shi, Loeffelman, et al. 2024

Preprint

With recent increases in the size of datasets currently available in the psychological sciences, the need for efficient and effective variable selection techniques has increased. A plethora of techniques exist, yet only a few are used within the psychological sciences (e.g., stepwise regression, which is most common, LASSO, and Elastic Net). The purpose of this tutorial is to increase awareness of the various variable selection methods available in the popular statistical software R, and guide researchers through how each method can be used to select variables in the context of classification using a recent survey-based assessment of misophonia. Specifically, readers will learn about how to implement and interpret results from the LASSO, Elastic Net, a penalized SVM classifier, an implementation of random forest, and the genetic algorithm. The associated code and data implemented in this tutorial are available on OSF to allow for a more interactive experience. This paper is written with the assumption that individuals have at least a basic understanding of R.

Section: A Tutorial On Supervised Machine Learning Variable Selection... (mentioning)

confidence: 99%

A Tutorial on Supervised Machine Learning Variable Selection Methods for the Social and Health Sciences in R

Bain, Shi, Ethridge, et al. 2024

Preprint

With recent increases in the size of datasets currently available in the behavioral and health sciences, the need for efficient and effective variable selection techniques has increased. A plethora of techniques exist, yet only a few are used within the psychological sciences (e.g., stepwise regression, which is most common, the LASSO, and Elastic Net). The purpose of this tutorial is to increase awareness of the various variable selection methods available in the popular statistical software R, and guide researchers through how each method can be used to select variables in the context of classification using a recent survey-based assessment of misophonia. Specifically, readers will learn about how to implement and interpret results from the LASSO, Elastic Net, a penalized SVM classifier, an implementation of random forest, and the genetic algorithm. The associated code and data implemented in this tutorial are available on OSF to allow for a more interactive experience. This paper is written with the assumption that individuals have at least a basic understanding of R.
