Many human traits are highly correlated. This correlation can be leveraged to improve the power of genetic association tests to identify markers associated with one or more of the traits. Principal component analysis (PCA) is a useful tool that has been widely used for the multivariate analysis of correlated variables. PCA is usually applied as a dimension reduction method: the few top principal components (PCs) explaining most of total trait variance are tested for association with a predictor of interest, and the remaining components are not analyzed. In this study we review the theoretical basis of PCA and describe the behavior of PCA when testing for association between a SNP and correlated traits. We then use simulation to compare the power of various PCA-based strategies when analyzing up to 100 correlated traits. We show that contrary to widespread practice, testing only the top PCs often has low power, whereas combining signal across all PCs can have greater power. This power gain is primarily due to increased power to detect genetic variants with opposite effects on positively correlated traits and variants that are exclusively associated with a single trait. Relative to other methods, the combined-PC approach has close to optimal power in all scenarios considered while offering more flexibility and more robustness to potential confounders. Finally, we apply the proposed PCA strategy to the genome-wide association study of five correlated coagulation traits where we identify two candidate SNPs that were not found by the standard approach.
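The contrast between testing only the top PCs and combining signal across all PCs can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `combined_pc_test`, the use of n·r² as an approximate 1-df chi-square statistic per PC, and the sum of those statistics referred to a chi-square distribution with k degrees of freedom are all assumptions chosen for simplicity. It assumes at least two traits that are not perfectly collinear.

```python
import numpy as np
from scipy import stats


def combined_pc_test(traits, snp):
    """Test one SNP against all PCs of k correlated traits (illustrative sketch).

    traits: (n, k) array of trait values; snp: (n,) array of genotype dosages.
    Returns (eigenvalues, per-PC chi-square statistics, combined statistic, p-value).
    """
    n, k = traits.shape
    # Standardize each trait so the PCA is based on the correlation matrix.
    z = (traits - traits.mean(axis=0)) / traits.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    # Sort components from largest to smallest explained variance.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    pcs = z @ eigvecs
    # Scale PCs and the SNP to unit variance, then take Pearson correlations.
    pcs_std = pcs / pcs.std(axis=0, ddof=1)
    g = (snp - snp.mean()) / snp.std(ddof=1)
    r = pcs_std.T @ g / (n - 1)
    # Assumption: n * r^2 is approximately chi-square (1 df) under the null.
    chi2_per_pc = n * r ** 2
    # PCs are mutually uncorrelated, so the sum is approximately chi-square (k df).
    combined = chi2_per_pc.sum()
    p_combined = stats.chi2.sf(combined, df=k)
    return eigvals, chi2_per_pc, combined, p_combined


if __name__ == "__main__":
    # Simulated data: five positively correlated traits, SNP affects only the first.
    rng = np.random.default_rng(0)
    n, k = 2000, 5
    snp = rng.binomial(2, 0.3, size=n).astype(float)
    shared = rng.normal(size=n)
    traits = 0.7 * shared[:, None] + 0.7 * rng.normal(size=(n, k))
    traits[:, 0] += 0.3 * snp
    eigvals, chi2_per_pc, combined, p = combined_pc_test(traits, snp)
    print("variance explained per PC:", eigvals / k)
    print("per-PC chi-square:", chi2_per_pc)
    print("combined p-value:", p)
```

In this simulated scenario the SNP acts on a single trait, so much of its signal may sit in low-variance PCs that a top-PC-only analysis would discard. Comparing the per-PC statistics with the variance explained makes that visible.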
Background: Venous thrombosis (VT) is a common multifactorial disease associated with a major public health burden. Genetic factors are known to contribute to susceptibility to the disease, but how many genes are involved and how much they contribute to VT risk remain obscure. We aimed to identify genetic variants associated with VT risk. Methodology/Principal Findings: We conducted a genome-wide association study (GWAS) based on 551,141 SNPs genotyped in 1,542 cases and 1,110 controls. Twelve SNPs reached the genome-wide significance level of 2.0 × 10⁻⁸ and encompassed four known VT-associated loci: ABO, F5, F11 and FGG. By means of haplotype analyses, we also provided novel arguments in favor of a role for HIVEP1, PROCR and STAB2, three loci recently hypothesized to participate in susceptibility to VT. However, no novel VT-associated loci emerged from our GWAS. Using a recently proposed statistical methodology, we also showed that common variants could explain about 35% of the genetic variance underlying VT susceptibility, of which 3% could be attributable to the main identified VT loci. This analysis additionally suggested that the common variants left to be identified are not uniformly distributed across the genome and that chromosome 20 alone could contribute ∼7% of the total genetic variance. Conclusions/Significance: This study might also provide a valuable source of information to expand our understanding of the biological mechanisms regulating quantitative biomarkers for VT.
Dickeya species are soft rot disease-causing bacterial plant pathogens and an emerging agricultural threat in Europe. Environmental modulation of gene expression is critical for Dickeya dadantii pathogenesis. While the bacterium uses various environmental cues to distinguish between its habitats, an intricate transcriptional control system coordinating the expression of virulence genes ensures efficient infection. Understanding of this behaviour requires a detailed knowledge of expression patterns under a wide range of environmental conditions, which is currently lacking. To obtain a comprehensive picture of this adaptive response, we devised a strategy to examine the D. dadantii transcriptome in a series of 32 infection-relevant conditions encountered in the hosts. We propose a temporal map of the bacterial response to various stress conditions and show that D. dadantii elicits complex genetic behaviour combining common stress-response genes with distinct sets of genes specifically induced under each particular stress. Comparison of our dataset with an in planta expression profile reveals the combined impact of stress factors and enables us to predict the major stress confronting D. dadantii at a particular stage of infection. We provide a comprehensive catalog of D. dadantii genomic responses to environmentally relevant stimuli, thus facilitating future studies of this important plant pathogen.
Background: Venous thrombosis (VT) is a common multifactorial disease with an estimated heritability between 35% and 60%. Genetic polymorphisms identified so far explain only ~5% of the genetic variance of the disease. This study aimed to investigate whether pairwise interactions between common single nucleotide polymorphisms (SNPs) exist and modulate the risk of VT. Methods: A genome-wide SNP × SNP interaction analysis of VT risk was conducted in a French case–control study, and the most significant findings were tested for replication in a second, independent French case–control sample. The results obtained in the two studies, totaling 1,953 cases and 2,338 healthy subjects, were combined in a meta-analysis. Results: The smallest observed p-value for interaction was p = 6.00 × 10⁻¹¹, but it did not pass the Bonferroni significance threshold of 1.69 × 10⁻¹², which corrects for the 2.96 × 10¹⁰ interactions investigated. Among the 37 suggestive pairwise interactions with a p-value below 10⁻⁸, one was further shown to involve two SNPs, rs9804128 (IGFS21 locus) and rs4784379 (IRX3 locus), that demonstrated significant interactive effects (p = 4.83 × 10⁻⁵) on the variability of plasma Factor VIII levels, a quantitative biomarker of VT risk, in a sample of 1,091 VT patients. Conclusion: This study, the first genome-wide SNP interaction analysis conducted so far on VT risk, suggests that common SNPs are unlikely to exert strong interactive effects on the risk of disease.
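The Bonferroni threshold quoted in this abstract can be reproduced with a line of arithmetic. The sketch below assumes a family-wise error rate of 0.05, which the abstract does not state explicitly but which is consistent with the reported figures. The helper name `bonferroni_threshold` is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Values taken from the abstract; alpha = 0.05 is an assumption.
n_interactions = 2.96e10
best_p = 6.00e-11

threshold = bonferroni_threshold(0.05, n_interactions)
print(f"{threshold:.2e}")    # 1.69e-12
print(best_p < threshold)    # False: the best interaction is not significant
```

The smallest observed p-value is roughly 35 times larger than the threshold, which is why no interaction is declared significant at the genome-wide level.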
Cowpea [Vigna unguiculata (L.) Walp] is one of the important climate-resilient legume crops for food and nutrition security in sub-Saharan Africa. Ethiopia is believed to harbor high cowpea genetic diversity, but this has not yet been efficiently characterized and exploited in breeding. The objective of this study was to evaluate the extent and pattern of genetic diversity in 357 cowpea accessions comprising landraces (87%), breeding lines (11%) and released varieties (2%), using single nucleotide polymorphism markers. The overall gene diversity and heterozygosity were 0.28 and 0.12, respectively. The genetic diversity indices indicated substantial diversity in Ethiopian cowpea landraces. Analysis of molecular variance showed that most of the variation was within the population (46%) and between individuals (44%), with only 10% of the variation being among populations. Model-based ancestry analysis, the phylogenetic tree, discriminant analysis of principal components and principal coordinate analysis classified the 357 genotypes into three well-differentiated genetic populations. Some genotypes from the same region grouped into different clusters, while others from different regions fell into the same cluster. This indicates that differences in regions of origin may not be the main driver of genetic diversity in cowpea in Ethiopia. Therefore, differences in sources of origin, as currently distributed in Ethiopia, should not necessarily be used as indices of genetic diversity. Choice of parental lines should rather be based on a systematic assessment of genetic diversity in a specific population. The study also suggested 94 accessions as a core collection that retained 100% of the genetic diversity of the entire collection. This core set represents 26% of the entire collection, indicating a wide distribution of the diversity within the Ethiopian landraces.
The outcome of this study provided new insights into the genetic diversity and population structure in Ethiopian cowpea genetic resources for designing effective collection and conservation strategies for efficient utilization in breeding.
We aimed to assess whether pri-miRNA SNPs (miSNPs) could influence monocyte gene expression, either through marginal association or by interacting with polymorphisms located in 3'UTR regions (3utrSNPs). We then conducted a genome-wide search for marginal miSNP effects and pairwise miSNP × 3utrSNP interactions in a sample of 1,467 individuals for whom genome-wide monocyte expression and genotype data were available. Statistical associations that survived multiple testing correction were tested for replication in an independent sample of 758 individuals with both monocyte gene expression and genotype data. In both studies, the hsa-mir-1279 rs1463335 was found to modulate in cis the expression of LYZ and in trans the expression of the CNTN6, CTRC, COPZ2, KRT9, LRRFIP1, NOD1, PCDHA6, ST5 and TRAF3IP2 genes, supporting the role of hsa-mir-1279 as a regulator of several genes in monocytes. In addition, we identified two robust miSNP × 3utrSNP interactions, one involving HLA-DPB1 rs1042448 and hsa-mir-219-1 rs107822, the other involving H1F0 rs1894644 and hsa-mir-659 rs5750504, modulating the expression of the associated genes. As some of the aforementioned genes have previously been reported to reside at disease-associated loci, our findings provide novel arguments supporting the hypothesis that the genetic variability of miRNAs could also contribute to susceptibility to human diseases.