Evaluation of the power and type I error of recently proposed family-based tests of association for rare variants

Hainline, Allison E.; Álvarez, Carolina; Luedtke, Alexander R.; Greco, Brian; Beck, Andrew; Tintle, Nathan L.

doi:10.1186/1753-6561-8-s1-s36

Cited by 3 publications

(19 citation statements)

References 12 publications

(17 reference statements)


“…P_corr is a function of the estimated kinship matrix (see Hainline et al [7] for details) and is used to adjust the standard error of the test statistics for the additional correlation contained in the pedigree structure.…”

Section: Methods (mentioning)

confidence: 99%

“…Alternatively, Zhu and Xiong also consider a version that collapses rare variants below a threshold before applying the T² test (combined multivariate and collapsing [CMC]) or uses eigenvectors from the genotype matrix to reduce matrix dimensionality (functional principal component analysis [FPCA]; see Hainline et al [7] for details). In our implementation of CMC we used minor allele frequency cutoffs of 5% and 0.5%.…”

Section: Methods (mentioning)

confidence: 99%
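
The collapsing step mentioned in the statement above can be illustrated with a minimal sketch. It assumes genotypes are coded as minor-allele counts (0, 1, 2) and that minor allele frequency is estimated naively as the sample mean allele count, which ignores relatedness; the function name `cmc_collapse` and the data layout are hypothetical and not taken from the cited papers. Only the collapsing is shown, not the T² test that follows it.

```python
def cmc_collapse(genotypes, maf_cutoff=0.05):
    """Collapse rare variants into one carrier indicator per individual.

    genotypes: list of rows, one per individual, each a list of
               minor-allele counts (0, 1 or 2) per variant (assumed coding).
    maf_cutoff: variants with estimated MAF below this value are collapsed.
    Returns rows of the form [rare_carrier_indicator, common_1, common_2, ...].
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Naive MAF estimate: allele count over 2n chromosomes (assumption;
    # it does not account for relatedness within pedigrees).
    maf = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    rare = [j for j in range(m) if maf[j] < maf_cutoff]
    common = [j for j in range(m) if maf[j] >= maf_cutoff]

    collapsed = []
    for row in genotypes:
        # 1 if the individual carries at least one rare minor allele.
        # The indicator column is kept even when no variant is rare,
        # so the output layout stays fixed.
        carrier = 1 if any(row[j] > 0 for j in rare) else 0
        collapsed.append([carrier] + [row[j] for j in common])
    return collapsed
```

The default cutoff of 0.05 mirrors the 5% cutoff quoted above; passing `maf_cutoff=0.005` would correspond to the 0.5% cutoff.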


Application of family-based tests of association for rare variants to pathways

et al. 2014

Self Cite


Pathway analysis approaches for sequence data typically either operate in a single stage (all variants within all genes in the pathway are combined into a single, very large set of variants that can then be analyzed using standard "gene-based" test statistics) or in 2-stages (gene-based p values are computed for all genes in the pathway, and then the gene-based p values are combined into a single pathway p value). To date, little consideration has been given to the performance of gene-based tests (typically designed for a smaller number of single-nucleotide variants [SNVs]) when the number of SNVs in the gene or in the pathway is very large and the genotypes come from sequence data organized in large pedigrees. We consider recently proposed gene-based tests for rare variants from complex pedigrees that test for association between a large set of SNVs and a qualitative phenotype of interest (1-stage analyses) as well as 2-stage approaches. We find that many of these methods show inflated type I errors when the number of SNVs in the gene or the pathway is large (>200 SNVs) and when using standard approaches to estimate the genotype covariance matrix. Alternative methods are needed when testing very large sets of SNVs in 1-stage approaches.
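
As a rough illustration of the 2-stage idea described in this abstract, the sketch below combines gene-based p values into one pathway p value. The abstract does not say which combination rule was used; Fisher's method is an assumption made here for illustration only, and it treats the gene-based p values as independent, which may not hold for genes in the same pathway. The function name `fisher_combine` is hypothetical.

```python
import math

def fisher_combine(p_values):
    """Combine p values with Fisher's method (assumed rule, see lead-in).

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null if the k p values are
    independent. For even degrees of freedom the upper tail has the
    closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    half = -sum(math.log(p) for p in p_values)  # X / 2
    term = 1.0   # (X/2)^i / i!, starting at i = 0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Usage: stage 1 would produce one p value per gene (values are made up).
gene_p_values = [0.05, 0.05]
pathway_p = fisher_combine(gene_p_values)
```

A 1-stage analysis, by contrast, would pool all variants in the pathway and apply a single gene-based test to the pooled set.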


“…They showed that familial effects on these test statistics can be written as a correction factor:

P_{corr} = \frac{n}{n_{c}(n - n_{c})} \left(D_{r} - \frac{n_{c}}{n}\mathbf{1}\right)^{T} \Phi \left(D_{r} - \frac{n_{c}}{n}\mathbf{1}\right),

where n is the sample size, D_{r} is a vector of size n indicating the disease status, n_{c} = D_{r}^{T}\mathbf{1} is the total number of cases, \mathbf{1} is a vector of size n with all 1's, and Φ is the kinship matrix. Hainline et al [] compared performances of Zhu and Xiong's [] family‐based generalized T² test and the CMC test on the binary outcome HTN in real data. Originally, these tests did not allow adjusting for covariates.…”

Section: Methods (mentioning)

confidence: 99%
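
The correction factor quoted above can be computed directly from its definition. The following is a minimal sketch in plain Python, assuming disease status is coded 0/1 and that the kinship matrix is supplied as a nested list; the function name `p_corr` and the input layout are hypothetical, and estimating the kinship matrix itself is out of scope here.

```python
def p_corr(disease_status, kinship):
    """Correction factor from the quoted formula.

    disease_status: list of 0/1 values, the vector D_r (1 = case).
    kinship: n x n nested list, the matrix Phi (assumed given).
    """
    n = len(disease_status)
    n_c = sum(disease_status)  # n_c = D_r^T 1, the number of cases
    if n_c == 0 or n_c == n:
        raise ValueError("need at least one case and one control")
    # D_r - (n_c / n) * 1
    centered = [d - n_c / n for d in disease_status]
    # Quadratic form centered^T * Phi * centered
    quad = sum(
        centered[i] * kinship[i][j] * centered[j]
        for i in range(n)
        for j in range(n)
    )
    return n / (n_c * (n - n_c)) * quad
```

As a sanity check on the algebra, substituting an identity matrix for Φ makes the quadratic form equal n_c(n - n_c)/n, so the factor evaluates to 1; positive off-diagonal entries between individuals with the same disease status push it above that value.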

“…Hainline et al [] used methods that do not adjust for any covariates, but all other contributions adjusted for age and some also adjusted for sex, smoking status, or both. Malzahn et al [] also adjusted for the interaction between age and sex.…”

Section: Methods (mentioning)

confidence: 99%


Testing Genetic Association With Rare and Common Variants in Family Data

Chen, Malzahn, Balliu, et al. 2014

Genetic Epidemiology


With the advance of next-generation sequencing technologies in recent years, rare genetic variant data have now become available for genetic epidemiology studies. For family samples however, only a few statistical methods for association analysis of rare genetic variants have been developed. Rare variant approaches are of great interest particularly for family data because samples enriched for trait-relevant variants can be ascertained and rare variants are putatively enriched through segregation. To facilitate the evaluation of existing and new rare variant testing approaches for analyzing family data, Genetic Analysis Workshop 18 (GAW18) provided genotype and next-generation sequencing data and longitudinal blood pressure traits from extended pedigrees of Mexican-American families from the San Antonio Family Study. Our GAW18 group members analyzed real and simulated phenotype data from GAW18 by using generalized linear mixed-effects models or principal components to adjust for familial correlation or by testing binary traits using a correction factor for familial effects. With one exception, approaches dealt with the extended pedigrees in their original state using information based on the kinship matrix or alternative genetic similarity measures. For simulated data, our group demonstrated that the family-based kernel machine score test is superior in power to family-based single-marker or burden tests, except in a few specific scenarios. For real data, three contributions identified significant associations. They substantially reduced the number of tests before performing the association analysis. We conclude from our real data analyses that further development of strategies for targeted testing or more focused screening of genetic variants is strongly desirable.


Value of Mendelian Laws of Segregation in Families: Data Quality Control, Imputation, and Beyond

Blue, Sun, Tintle 2014

Genetic Epidemiology


When analyzing family data, we dream of perfectly informative data, even whole genome sequences (WGS) for all family members. Reality intervenes, and we find next-generation sequence (NGS) data have error, and are often too expensive or impossible to collect on everyone. Genetic Analysis Workshop 18 groups “Quality Control” and “Dropping WGS through families using GWAS framework” focused on finding, correcting, and using errors within the available sequence and family data, developing methods to infer and analyze missing sequence data among relatives, and testing for linkage and association with simulated blood pressure. We found that single nucleotide polymorphisms, NGS, and imputed data are generally concordant, but that errors are particularly likely at rare variants, homozygous genotypes, within regions with repeated sequences or structural variants, and within sequence data imputed from unrelateds. Admixture complicated identification of cryptic relatedness, but information from Mendelian transmission improved error detection and provided an estimate of the de novo mutation rate. Both genotype and pedigree errors had an adverse effect on subsequent analyses. Computationally fast rules-based imputation was accurate, but could not cover as many loci or subjects as more computationally demanding probability-based methods. Incorporating population-level data into pedigree-based imputation methods improved results. Observed data outperformed imputed data in association testing, but imputed data were also useful. We discuss the strengths and weaknesses of existing methods, and suggest possible future directions. Topics include improving communication between those performing data collection and analysis, establishing thresholds for and improving imputation quality, and incorporating error into imputation and analytical models.
