Motivation: Numerous approaches have been suggested for the analysis of rare variants in sequence data. Fixed and flexible threshold approaches collapse the rare variant information of a genomic region into a test statistic with reduced dimensionality. Alternatively, the rare variant information can be combined in statistical frameworks that are based on suitable regression models, machine learning, etc. Although the existing approaches provide powerful tests that can incorporate information on allele frequencies and prior biological knowledge, they cannot incorporate differences in the spatial clustering of rare variants between cases and controls. Based on the assumption that deleterious variants and protective variants cluster or occur in different parts of the genomic region of interest, we propose a testing strategy for rare variants that builds on spatial cluster methodology and that guides the identification of the biologically relevant segments of the region. Our approach does not require any assumption about the directions of the genetic effects.
Results: In simulation studies, we assess the power of the clustering approach and compare it with existing methodology. Our simulation results suggest that the clustering approach for rare variants is well powered, even in situations that are ideal for standard methods. The efficiency of our spatial clustering approach is not affected by the presence of rare variants that have opposite effect size directions. An application to a sequencing study for non-syndromic cleft lip with or without cleft palate (NSCL/P) demonstrates its practical relevance. The proposed testing strategy is applied to a genomic region on chromosome 15q13.3 that was implicated in NSCL/P etiology in a previous genome-wide association study, and its results are compared with standard approaches.
Availability: Source code and documentation for the implementation in R will be provided online. Currently, the R implementation only supports genotype data. We are currently working on an extension for VCF files.
Contact: heide.fier@googlemail.com
The determination of potential sibship is a common task in routine kinship analysis, but the putative parents are often no longer available for analysis. In that case, a sibling analysis has to be conducted using only the potential siblings, which reduces the power of the conclusion. To assess how meaningful the resulting biostatistical calculations are, 346 dizygotic twin pairs, 30 confirmed half siblings, and 112 unrelated people (yielding 6216 pair comparisons) were studied, all genetically typed using at least the Powerplex® 16 STRs. For every pair, the probabilities of full sibship (identical parents) and half sibship (different fathers) were calculated using a commercially available computer program. Additionally, we simulated marker data for one million pairs each of full sibs, half sibs, and unrelated persons. After analysis of 15 STRs, 95% of full sibling pairs showed a likelihood ratio (LR) > 9 (W-value > 90%) for full sibship, and less than 4% showed an LR < 3 (W-value < 75%). The results for half siblings are less clear-cut: only 57% achieved an LR > 9, and 23% had an LR < 3. Of the unrelated pairs, more than 90% had an LR < 1/9 and only 2% reached an LR > 9. Overall, our results show that 15 to 20 STRs have sufficient power for kinship analyses. Moreover, our data provide a statistical basis for determining the information content of an LR/W-value in a sibship case. When an identical number of full sibling pairs and unrelated pairs was investigated, 92% of pairs with an LR > 9 for full sibship were in fact full siblings. Thus, with a cutoff for full sibship at LR > 9, less than 10% of pairs will be wrongly assumed to be full siblings even though they are unrelated.
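The paired thresholds quoted above (LR > 9 with W-value > 90%, LR < 3 with W-value < 75%) are consistent with the usual conversion of a likelihood ratio into a posterior probability under equal prior odds, W = LR / (LR + 1). The abstract does not state this formula, so the following is only a minimal sketch under that assumption; the function name `w_value` and the `prior` parameter are hypothetical and are not taken from the commercial program used in the study.

```python
def w_value(lr: float, prior: float = 0.5) -> float:
    """Convert a likelihood ratio into a posterior probability (W-value).

    Assumption: the W-value is the posterior probability of the tested
    relationship. With the default prior of 0.5 this reduces to
    LR / (LR + 1).
    """
    if lr < 0:
        raise ValueError("a likelihood ratio cannot be negative")
    if not 0 < prior < 1:
        raise ValueError("prior must lie strictly between 0 and 1")
    prior_odds = prior / (1 - prior)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1 + posterior_odds)


# The three LR cutoffs mentioned in the abstract.
for lr in (9, 3, 1 / 9):
    print(f"LR = {lr:.3f} -> W = {w_value(lr):.1%}")
```

With a prior other than 0.5, the same LR maps to a different W-value, which is why an LR cutoff and a W-value cutoff are only interchangeable when the prior is fixed.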