A rare variant in APOC3 is associated with plasma triglyceride and VLDL levels in Europeans

Recent work has suggested that the performance of prediction models for complex traits may depend on the architecture of the target traits. Here we compared several prediction models with respect to their ability to predict phenotypes under various statistical architectures of gene action: (1) purely additive, (2) additive and dominance, (3) additive, dominance, and two-locus epistasis, and (4) purely epistatic settings. Simulated data and a real chicken dataset were used. Fourteen prediction models were compared: BayesA, BayesB, BayesC, Bayesian LASSO, Bayesian ridge regression, elastic net, genomic best linear unbiased prediction, a Gaussian process, LASSO, random forests, reproducing kernel Hilbert spaces regression, ridge regression (best linear unbiased prediction), relevance vector machines, and support vector machines. When the trait was under additive gene action, the parametric prediction models outperformed the non-parametric ones. Conversely, when the trait was under epistatic gene action, the non-parametric prediction models provided more accurate predictions. Thus, prediction models must be selected according to the most probable underlying architecture of the trait. In the chicken dataset examined, most models had similar prediction performance. Our results corroborate the view that there is no universally best prediction model, and that the development of robust prediction models is an important research objective.

Random noise and perturbation of copulas

Mesiar

Sheikhi

Komorníková

2019

Kybernetika

Nonlinear Random Forest Classification, a Copula-Based Approach

Mesiar

Sheikhi

2021

Applied Sciences

In this work, we use a copula-based approach to select the most important features for random forest classification. We carry out this feature selection based on the copulas associated with these features. We then feed the selected features into a random forest algorithm to classify a label-valued outcome. Our algorithm enables us to select the most relevant features even when the features are not necessarily connected by a linear function; we can also stop the classification once the desired level of accuracy is reached. We apply this method to a simulation study as well as to a real COVID-19 dataset and a diabetes dataset.

On a generalization of the test of endogeneity in a two stage least squares estimation

Sheikhi

Bahador

Arashi

2020

Journal of Applied Statistics

A comprehensive family of copulas to model bivariate random noise and perturbation

Sheikhi

Amirzadeh

Mesiar

2021

Fuzzy Sets and Systems

Ayyub Sheikhi

Predictive ability of genome-assisted statistical models under various forms of gene action
