New variable selection strategy for analysis of high-dimensional DNA methylation data

Choi, Jae Weon; Kim, Kipoong; Sun, Hokeun

doi:10.1142/s0219720018500105

Cited by 10 publications

(9 citation statements)

References 23 publications


“…Therefore, computation of selection probability does not require to choose the optimal tuning parameter. For this reason, selection probability has been widely used to identify top ranked genetic variants or genes that are associated with a phenotype outcome for analysis of high-dimensional genomic data (Alexander and Lange, 2011; Sun et al, 2017; Choi et al, 2018; Kim and Sun, 2019). However, selection probability has never been applied to multivariate regularization methods.…”

Section: Methods (mentioning)

confidence: 99%
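
The statement above says that selection probability sidesteps the choice of a single optimal tuning parameter. A minimal sketch of that idea follows, assuming a stability-selection-style definition (the selection frequency of each variable over random half-samples, maximized over a grid of tuning parameters). The function names, the grid values, the simulated data, and the small coordinate-descent elastic-net solver are all illustrative assumptions, not the cited authors' implementation.

```python
import numpy as np

def soft_threshold(z, t):
    """Scalar soft-thresholding operator."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def elastic_net_cd(X, y, lam, alpha, n_iter=50):
    """Hypothetical helper: coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (alpha*||b||_1 + (1-alpha)/2 * ||b||_2^2)."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.copy()                      # residual y - X @ beta
    col_sq = (X ** 2).sum(axis=0) / n     # (1/n) * x_j' x_j
    for _ in range(n_iter):
        for j in range(p):
            denom = col_sq[j] + lam * (1.0 - alpha)
            if denom == 0.0:              # constant column under pure lasso
                continue
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / denom
            if new != beta[j]:
                resid -= X[:, j] * (new - beta[j])
                beta[j] = new
    return beta

def selection_probability(X, y, lams, alphas, n_subsamples=30, seed=0):
    """Selection frequency over half-samples, maximized over the (lam, alpha) grid."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros((len(lams), len(alphas), p))
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)
        Xs = X[idx] - X[idx].mean(axis=0)  # center within the subsample
        ys = y[idx] - y[idx].mean()
        for a, lam in enumerate(lams):
            for b, alpha in enumerate(alphas):
                counts[a, b] += elastic_net_cd(Xs, ys, lam, alpha) != 0
    return (counts / n_subsamples).max(axis=(0, 1))

# Simulated data (an assumption for illustration): 2 causal variables out of 20.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 20))
y = 3.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(size=100)
sp = selection_probability(X, y, lams=[0.3, 0.6, 1.0], alphas=[0.5, 1.0])
ranking = np.argsort(-sp)                 # variables ranked by selection probability
```

Variables are then ranked by `sp` rather than selected at one tuned penalty level. Because the maximum is taken over the grid, the smallest penalty in the grid tends to dominate, so the grid itself remains a modeling choice.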

“…Note that the q-dimensional vector β_j(I_m; λ, α) has either all zero values or all nonzero values because of a group lasso penalty in (2.2). Similarly, selection probability of the individual elastic-net method in (2.1) can be computed and details are described by (Sun et al, 2017; Choi et al, 2018; Kim and Sun, 2019). However, the selection probability of the jth variant of the elastic-net can have up to q different values since the coefficient vector is estimated from each of q phenotype outcomes.…”

Section: Methods (mentioning)

confidence: 99%
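
The all-zero-or-all-nonzero behavior mentioned in the statement above can be illustrated with the proximal operator of a group lasso penalty, which shrinks each variant's q-dimensional coefficient row as a unit. This is a sketch under assumptions: the function name, the threshold `t`, and the toy matrix `Z` are hypothetical, and the cited paper's estimator in (2.2) is not reproduced here.

```python
import numpy as np

def group_soft_threshold(Z, t):
    """Row-wise block soft-thresholding: each row is scaled by
    max(0, 1 - t / ||row||_2), so a row is zeroed entirely or kept entirely."""
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return scale * Z

# Toy input (assumed values): 4 variants (rows) by q = 3 phenotypes (columns).
Z = np.array([
    [2.0, -1.0, 0.5],    # row norm above the threshold: kept, all entries nonzero
    [0.1, 0.2, -0.1],    # row norm below the threshold: zeroed as a group
    [-1.5, 1.0, 2.0],
    [0.3, -0.2, 0.1],
])
B = group_soft_threshold(Z, t=1.0)
nonzero_per_row = (B != 0).sum(axis=1)   # each count is either 0 or q
```

A surviving row is entirely nonzero here only because its input entries are all nonzero; the operator rescales a row but never zeroes individual entries within it.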

“…They are individual elastic-net, unified elastic-net, and multi-response elastic-net, respectively. To obtain single nucleotide polymorphism (SNP) data, we applied the same process used in Choi et al (2018). Specifically, we first generated two p-dimensional vectors (a i1 , .…”

Section: Simulation Studies (mentioning)

confidence: 99%

“…Alternatively, regularization methods have been widely applied to analysis of high-dimensional genomic data (Wu et al, 2009; Zhou et al, 2010; Wang, 2012, 2013; Sun et al, 2017; Choi et al, 2018). They have a great advantage over individual tests since genetic effects of all variants in analysis can be computed in one regression framework.…”

Section: Introduction (mentioning)

confidence: 99%


Selection probability of multivariate regularization to identify pleiotropic variants in genetic association studies

Kim, Sun. 2020. CSAM. Self-cite.

In genetic association studies, pleiotropy is a phenomenon where a variant or a genetic region affects multiple traits or diseases. There have been many studies identifying cross-phenotype genetic associations. However, most statistical approaches for detecting pleiotropy are based on individual tests, where the association of a single variant with multiple traits is tested one variant at a time. These approaches fail to account for relations among correlated variants. Recently, multivariate regularization methods have been proposed to detect pleiotropy in analysis of high-dimensional genomic data. However, they suffer from the problem of tuning parameter selection, which often results in either too many false positives or too few true positives. In this article, we applied selection probability to multivariate regularization methods in order to identify pleiotropic variants associated with multiple phenotypes. Selection probability was applied to individual elastic-net, unified elastic-net and multi-response elastic-net regularization methods. In simulation studies, the selection performance of the three multivariate regularization methods was evaluated while varying the total number of phenotypes, the number of phenotypes associated with a variant, and the correlations among phenotypes. We also applied the regularization methods to a wild bean dataset consisting of 169,028 variants and 17 phenotypes.


“…Feature selection methods and statistical learning methods such as sparse Group LASSO and network regularization have identified important CpGs in highly complex data. [29-33] More recent work has called for a greater understanding of the implications of DNAm-DNAm interactions through the incorporation of Gaussian Graphical Models, Canonical Correlation Analysis, and module discovery through weighted gene co-methylation networks. [34-50] There is growing support for the use of novel deep learning methods to aggregate, group, and select CpGs by their local context (e.g., genes) in an effort to connect and interpret the data with clinical outcomes.…”

Section: Introduction (mentioning)

confidence: 99%

MethylSPWNet and MethylCapsNet: Biologically Motivated Organization of DNAm Neural Network, Inspired by Capsule Networks

Levy, Chen, Azizgolshani, et al. 2020. Preprint.

DNA methylation (DNAm) alterations are implicated in aging and diseases by regulating gene expression. DNAm deep-learning approaches can capture features associated with aging, cell type, and disease progression, but lack incorporation of prior biological knowledge. We present deep-learning software, MethylCapsNet and MethylSPWNet, that group CpGs into user-specified or predefined biologically relevant groupings (e.g., gene promoter or CpG island context) related to diagnostic and prognostic outcomes. We train our models on a cohort (n=3,897) to classify central nervous system tumors and compare to existing approaches. Our methodology presents opportunities to increase interpretability of disease mechanisms through utilization of biologically relevant annotations.


Group-shrinkage feature selection with a spatial network for mining DNA methylation data

Tang, Chang, et al. 2023. Computers in Biology and Medicine.
