Exact variance component tests for longitudinal microbiome studies

Zhai, Jing; Knox, Kenneth S.; Twigg, Homer L.; Zhou, Hua; Zhou, Jin

doi:10.1002/gepi.22185

Cited by 7 publications

(6 citation statements)

References 59 publications


“…Although under regularity conditions, and are consistent estimators for variance component parameters and under the null hypothesis , the classical score-type variance component test above treats them as fixed numbers and ignores the variability in their estimation, which could result in not well-calibrated p values in finite samples. This is a known issue for score-type variance component tests in microbiome association studies with small sample sizes [ 48 – 50 ]. Despite large sample sizes in biobank-scale cohorts, the local IBD matrix Ψ l for genetic locus l is often sparse, which could invalidate asymptotic inference on the quadratic form .…”

Section: Description of the Methods (mentioning)

confidence: 99%

FiMAP: A fast identity-by-descent mapping test for biobank-scale cohorts

Chen, Naseri, Zhi. 2023. PLoS Genet


Although genome-wide association studies (GWAS) have identified tens of thousands of genetic loci, the genetic architecture is still not fully understood for many complex traits. Most GWAS and sequencing association studies have focused on single nucleotide polymorphisms or copy number variations, including common and rare genetic variants. However, phased haplotype information is often ignored in GWAS or variant set tests for rare variants. Here we leverage the identity-by-descent (IBD) segments inferred from a random projection-based IBD detection algorithm in the mapping of genetic associations with complex traits, to develop a computationally efficient statistical test for IBD mapping in biobank-scale cohorts. We used sparse linear algebra and random matrix algorithms to speed up the computation, and a genome-wide IBD mapping scan of more than 400,000 samples finished within a few hours. Simulation studies showed that our new method had well-controlled type I error rates under the null hypothesis of no genetic association in large biobank-scale cohorts, and outperformed traditional GWAS single-variant tests when the causal variants were untyped and rare, or in the presence of haplotype effects. We also applied our method to IBD mapping of six anthropometric traits using the UK Biobank data and identified a total of 3,442 associations, 2,131 (62%) of which remained significant after conditioning on suggestive tag variants in the ± 3 centimorgan flanking regions from GWAS.


“…This is a known issue for score-type variance component tests in microbiome association studies with small sample sizes. [43][44][45] Despite large sample sizes in biobank-scale cohorts, the local IBD matrix Ψ_l for genetic locus l is often sparse, which could invalidate asymptotic inference on the quadratic form Q_l =…”

Section: Variance Component Models (mentioning)

confidence: 99%

FiMAP: A Fast Identity-by-Descent Mapping Test for Biobank-scale Cohorts

Chen, Naseri, Zhi. 2021. Preprint


Although genome-wide association studies (GWAS) have identified tens of thousands of genetic loci, the genetic architecture is still not fully understood for many complex traits. Most GWAS and sequencing association studies have focused on single nucleotide polymorphisms or copy number variations, including common and rare genetic variants. However, phased haplotype information is often ignored in GWAS or variant set tests for rare variants. Here we leverage the identity-by-descent (IBD) segments inferred from a random projection-based IBD detection algorithm in the mapping of genetic associations with complex traits, to develop a computationally efficient statistical test for IBD mapping in biobank-scale cohorts. We used sparse linear algebra and random matrix algorithms to speed up the computation, and a genome-wide IBD mapping scan of more than 400,000 samples finished within a few hours. Simulation studies showed that our new method had well-controlled type I error rates under the null hypothesis of no genetic association in large biobank-scale cohorts, and outperformed traditional GWAS approaches and variant set tests when the causal variants were untyped and rare, or in the presence of haplotype effects. We also applied our method to IBD mapping of six anthropometric traits using the UK Biobank data and identified a 4 cM region on chromosome 8 associated with multiple traits related to body fat distribution or weight.


“…However, inference on the variance components is less studied and often requires strong distributional assumptions on the random effects and the error terms. When the underlying distributions are assumed to be multivariate normal, classical inference methods, such as the likelihood ratio test, the restricted likelihood ratio test, and the score test (Self and Liang, 1987; Zhang and Lin, 2003; Koh et al., 2019; Zhai et al., 2019), can be applied. However, these parametric methods are often restrictive and not robust if the model assumptions are violated.…”

Section: Introduction (mentioning)

confidence: 99%

“…The linear structure holds when each component of D(θ*) is a linear function of θ* (Lin, 1997). This encompasses both nested, crossed and clustered designs (Michalski and Zmyślony, 1996; Zhai et al., 2019; Chen et al., 2019; Li et al., 2021). See Section 5.1 for a specific example of such a random-effect model for modeling the family data that includes additive genetic effect, common environment and unique subject-specific We first introduce some notation.…”

Section: Introduction (mentioning)

confidence: 99%

Empirical Likelihood Inference of Variance Components in Linear Mixed-Effects Models

Zhang, Guo, Carpenter et al. 2025. Statistica Sinica


Linear mixed-effects models are widely used in analyzing repeated measures data, including clustered and longitudinal data, where inferences of both fixed effects and variance components are of interest. Unlike inference on fixed effect, which has been well studied, inference on the variance components is more challenging due to null value on the boundary and the unknown fixed effects as nuisance parameters. Existing methods require strong distributional assumptions
