Most high dimensional mediation epigenetic studies evaluate each mediator indirect effects independently. These approaches are underpowered to detect multiple mediators and the overall indirect effect remains poorly quantified. We developed HDMAX2, an efficient algorithm for high dimensional mediation analysis using max-squared tests, considering CpGs and aggregated mediator regions. HDMAX2 outperformed existing approaches in simulated and real data, detected true mediators with increased power, and assessed the overall indirect effect of several mediators from a high dimension matrix. We showed a polygenic architecture for placenta CpGs and regions mediating the effects of maternal smoking during pregnancy on babys birth weight.
Principal component analysis (PCA) is one of the most frequently-used approach to describe population structure from multilocus genotype data. Regarding geographic range expansions of modern humans, interpretations of PCA have, however, been questioned, as there is uncertainty about the wave-like patterns that have been observed in principal components. It has indeed been argued that wave-like patterns are mathematical artifacts that arise generally when PCA is applied to data in which genetic differentiation increases with geographic distance. Here, we present an alternative theory for the observation of wave-like patterns in PCA. We study a coalescent model – the umbrella model – for the diffusion of genetic variants. The model is based on a hierarchy of splits from an ancestral population without any particular geographical structure. In the umbrella model, splits occur almost continuously in time, giving birth to small daughter populations at a regular pace. Our results provide detailed mathematical descriptions of eigenvalues and eigenvectors for the PCA of sampled genomic sequences under the model. Removing variants uniquely represented in the sample, the PCA eigenvectors are defined as cosine functions of increasing periodicity, reproducing wave-like patterns observed in equilibrium isolation-by-distance models. Including rare variants in the analysis, the eigenvectors corresponding to the largest eigenvalues exhibit complex wave shapes. The accuracy of our predictions is further investigated with coalescent simulations. Our analysis supports the hypothesis that highly structured wave-like patterns could arise from genetic drift only, and may not always be artificial outcomes of spatially structured data. Genomic data related to the peopling of the Americas are reanalyzed in the light of our new theory.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.