2012
DOI: 10.1002/gepi.21661

Exploring Data From Genetic Association Studies Using Bayesian Variable Selection and the Dirichlet Process: Application to Searching for Gene × Gene Patterns

Abstract: We construct data exploration tools for recognizing important covariate patterns associated with a phenotype, with particular focus on searching for association with gene-gene patterns. To this end, we propose a new variable selection procedure that employs latent selection weights and compare it to an alternative formulation. The selection procedures are implemented in tandem with a Dirichlet process mixture model for the flexible clustering of genetic and epidemiological profiles. We illustrate our approach …

Cited by 37 publications (43 citation statements)
References 44 publications
“…Among the other papers, DPM models with Gaussian kernels are used to cluster microarray gene expression data [28,29,30]. Our approach differs from the previous papers since SNP genotypes take only three possible values and thus we consider a multinomial mixture model [31,32]. It is worth mentioning that our goal is very similar to the one in [31], although with a different approach.…”
Section: Introductionmentioning
confidence: 99%
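The three-valued SNP coding that motivates the multinomial mixture in the statement above can be sketched with a simple EM fit over independent categorical distributions per SNP. The function name, the EM routine, and all parameter choices below are illustrative assumptions, not the cited authors' implementation.

```python
import numpy as np

def fit_categorical_mixture(X, n_clusters, n_iter=50, seed=0):
    """EM for a mixture of independent categorical (multinomial) distributions.

    X: (n_samples, n_snps) integer array with genotype codes in {0, 1, 2}
       (copies of the minor allele).
    Returns mixing weights pi (K,) and category probs theta (K, n_snps, 3).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    K, C = n_clusters, 3
    pi = np.full(K, 1.0 / K)
    theta = rng.dirichlet(np.ones(C), size=(K, p))           # (K, p, 3)
    onehot = np.eye(C)[X]                                     # (n, p, 3)

    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] ∝ pi_k * prod_j theta[k, j, x_ij]
        log_lik = np.einsum('njc,kjc->nk', onehot, np.log(theta))
        log_r = np.log(pi) + log_lik
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: update mixing weights and per-SNP category probabilities
        pi = r.mean(axis=0)
        counts = np.einsum('nk,njc->kjc', r, onehot) + 1e-6   # light smoothing
        theta = counts / counts.sum(axis=2, keepdims=True)
    return pi, theta
```

A Dirichlet process mixture, as used in the paper, would additionally let the number of clusters K be inferred rather than fixed; the finite mixture above only illustrates the multinomial kernel.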
“…Our approach differs from the previous papers since SNP genotypes take only three possible values and thus we consider a multinomial mixture model [31,32]. It is worth mentioning that our goal is very similar to the one in [31], although with a different approach. They clustered individuals into groups (e.g., high risk, average risk, and low risk for a certain disease) and then identified the covariates that were influential in the DPM clustering.…”
Section: Introductionmentioning
confidence: 99%
“…The variable selection options, which comprise either a binary [37] or a continuous [57] selection weighting method, allow the model to exclude an exposure from influencing the clustering procedure if that exposure exhibits a very low probability of being involved in the clustering patterns, further emphasizing a data-driven (non-parametric) approach to clustering. Specifically, we implemented variable selection with the “Continuous” option, which uses a latent variable taking values in (0,1) to govern the contribution of the variable in question to the mixture distribution [53,57]. Using a Bayesian framework for variable selection has been shown to be particularly helpful in the context of a large number of correlated covariates because it appropriately handles model uncertainty [58,59].…”
Section: Methodsmentioning
confidence: 99%
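The "continuous" selection weight described above can be sketched as a blend between a cluster-specific likelihood and a shared baseline: when the latent weight is near zero, a covariate contributes the same term to every cluster and so cannot drive allocation. All names and the categorical likelihood below are illustrative assumptions, not the cited software's code.

```python
import numpy as np

def weighted_covariate_loglik(x_j, theta_kj, theta_0j, zeta_j):
    """Log-likelihood of one categorical covariate under cluster k.

    x_j: observed category index; theta_kj: cluster-specific category probs;
    theta_0j: baseline (cluster-independent) probs; zeta_j in (0, 1) is the
    latent selection weight. The effective probability is a zeta-weighted
    blend of the cluster-specific and baseline probabilities.
    """
    p_eff = zeta_j * theta_kj[x_j] + (1.0 - zeta_j) * theta_0j[x_j]
    return np.log(p_eff)

# With zeta_j = 0 the cluster-specific probabilities are ignored entirely,
# so this covariate is effectively switched off for clustering.
theta_k = np.array([0.8, 0.1, 0.1])    # cluster-specific probs
theta_0 = np.array([1/3, 1/3, 1/3])    # baseline probs
off = weighted_covariate_loglik(0, theta_k, theta_0, zeta_j=0.0)
on = weighted_covariate_loglik(0, theta_k, theta_0, zeta_j=1.0)
```

In a full model, zeta_j would itself be given a prior on (0,1) and sampled, so that its posterior summarizes how strongly covariate j supports the mixture structure.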
“…A natural extension of such approaches would be to incorporate additional outcome data, i.e., to use a joint model of features and response in a semi-supervised manner, rather than proceed sequentially by clustering first and then linking clusters to survival outcomes, as presented in the METABRIC paper. In the genetic epidemiology context, Papathomas et al. (2012) used a joint clustering of genes and lung cancer outcomes to explore the potential for gene-gene interactions. They adopt a non-parametric Bayesian approach referred to as profile regression (Molitor et al. 2010), which also allows the selection of the important features that drive the clustering.…”
Section: Vertical Data Integrationmentioning
confidence: 99%
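Profile regression, as the statement describes it, clusters covariate profiles and the response jointly rather than sequentially: each cluster carries both covariate parameters and an outcome parameter, so the response helps shape the clusters. A minimal sketch of the per-cluster joint log-likelihood (independent categorical covariates plus a binary outcome; the function name and the specific likelihoods are illustrative assumptions):

```python
import numpy as np

def joint_cluster_loglik(x, y, theta_k, phi_k):
    """Joint log-likelihood of one individual's profile under cluster k.

    x: (p,) integer covariate profile (category indices);
    y: binary outcome (0 or 1);
    theta_k: (p, C) per-covariate category probabilities for cluster k;
    phi_k: cluster-specific outcome probability.
    """
    cov_term = np.sum(np.log(theta_k[np.arange(len(x)), x]))
    out_term = y * np.log(phi_k) + (1 - y) * np.log(1.0 - phi_k)
    return cov_term + out_term

# Example: a profile of 4 covariates, each with 3 categories, outcome y = 1.
theta_k = np.full((4, 3), 1/3)
ll = joint_cluster_loglik(np.zeros(4, dtype=int), 1, theta_k, 0.5)
```

In the full semi-supervised setting, these per-cluster joint terms enter the mixture allocation probabilities, so individuals with similar covariate profiles and similar outcomes tend to be grouped together.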