Background: Traditional Chinese Medicine (TCM) is characterized by the wide use of herbal formulae, which are capable of systematically treating diseases determined by interactions among various herbs. However, the combination rule of TCM herbal formulae remains a mystery due to the lack of appropriate methods.
Methods: From a network perspective, we established a method called Distance-based Mutual Information Model (DMIM) to identify useful relationships among herbs in numerous herbal formulae. DMIM combines mutual information entropy and “between-herb-distance” to score herb interactions and construct a herb network. To evaluate the efficacy of the DMIM-extracted herb network, we conducted in vitro assays to measure the activities of strongly connected herbs and herb pairs. Moreover, using the networked Liu-wei-di-huang (LWDH) formula as an example, we proposed a novel concept of “co-module” across herb-biomolecule-disease multilayer networks to explore the potential combination mechanism of herbal formulae.
Results: DMIM, when used for retrieving herb pairs, achieves a good balance among the herb’s frequency, independence, and distance in herbal formulae. A herb network constructed by DMIM from 3865 Collaterals-related herbal formulae can not only recover traditionally defined herb pairs and formulae, but also generate novel anti-angiogenic herb ingredients (e.g. Vitexicarpin with IC50=3.2 μM, and Timosaponin A-III with IC50=3.4 μM) as well as herb pairs with synergistic or antagonistic effects. Based on gene and phenotype information associated with both LWDH herbs and LWDH-treated diseases, we found that LWDH-treated diseases show high phenotype similarity and identified certain “co-modules” enriched in cancer pathways and neuro-endocrine-immune pathways, which may be responsible for the action of treating different diseases by the same LWDH formula.
Conclusions: DMIM is a powerful method for identifying the combination rule of herbal formulae and can lead to new discoveries. We also provide the first evidence that the co-module across multilayer networks may underlie the combination mechanism of herbal formulae and demonstrate the potential of network biology approaches in the studies of TCM.
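The abstract above describes scoring herb pairs by combining a mutual-information term with a between-herb distance, but it does not give the formula. The following is a minimal sketch of one plausible combination, assumed here for illustration only: a pairwise mutual-information term divided by the mean positional distance of the two herbs in the formulae where they co-occur. The function name `dmim_like_scores`, the use of list position as "distance", and the toy formulae are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def dmim_like_scores(formulae):
    """Score herb pairs from ordered herb lists.

    Illustrative only: the score is (MI term) / (mean distance), which is an
    assumed combination, not necessarily the published DMIM definition.
    """
    n = len(formulae)
    herb_count = Counter()   # number of formulae containing each herb
    pair_count = Counter()   # number of formulae containing both herbs
    pair_dist = Counter()    # summed positional distance over co-occurrences

    for formula in formulae:
        position = {}
        for i, herb in enumerate(formula):
            position.setdefault(herb, i)  # keep the first occurrence only
        herb_count.update(position.keys())
        for a, b in combinations(sorted(position), 2):
            pair_count[(a, b)] += 1
            pair_dist[(a, b)] += abs(position[a] - position[b])

    scores = {}
    for (a, b), c in pair_count.items():
        p_ab = c / n
        p_a = herb_count[a] / n
        p_b = herb_count[b] / n
        # Pairwise mutual-information term: high when the herbs co-occur
        # more often than expected if they were independent.
        mi = p_ab * math.log(p_ab / (p_a * p_b))
        # Distinct herbs occupy distinct positions, so mean_dist >= 1.
        mean_dist = pair_dist[(a, b)] / c
        scores[(a, b)] = mi / mean_dist
    return scores


# Toy input (hypothetical): each formula is an ordered list of herb names.
toy_formulae = [["A", "B", "C"], ["A", "B"], ["C", "D"], ["A", "B", "D"]]
toy_scores = dmim_like_scores(toy_formulae)
```

In this toy input, herbs A and B always appear together and adjacent, so their pair receives a higher score than a pair such as A and C that co-occurs rarely and at a greater distance. Ranking pairs by such a score and keeping those above a threshold would give the edges of a herb network.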
The advent of large-scale microbiome studies affords newfound analytical opportunities to understand how these communities of microbes operate and relate to their environment. However, the analytical methodology needed to model microbiome data and integrate them with other data constructs remains nascent. This emergent analytical toolset frequently ports over techniques developed in other multi-omics investigations, especially the growing array of statistical and computational techniques for integrating and representing data through networks. While network analysis has emerged as a powerful approach to modeling microbiome data, oftentimes by integrating these data with other types of omics data to discern their functional linkages, it is not always evident if the statistical details of the approach being applied are consistent with the assumptions of microbiome data or how they impact data interpretation. In this review, we overview some of the most important network methods for integrative analysis, with an emphasis on methods that have been applied or have great potential to be applied to the analysis of multi-omics integration of microbiome data. We compare advantages and disadvantages of various statistical tools, assess their applicability to microbiome data, and discuss their biological interpretability. We also highlight on-going statistical challenges and opportunities for integrative network analysis of microbiome data.
In genetic association testing, failure to properly control for population structure can lead to severely inflated type 1 error and power loss. Meanwhile, adjustment for relevant covariates is often desirable and sometimes necessary to protect against spurious association and to improve power. Many recent methods to account for population structure and covariates are based on linear mixed models (LMMs), which are primarily designed for quantitative traits. For binary traits, however, LMM is a misspecified model and can lead to deteriorated performance. We propose CARAT, a binary-trait association testing approach based on a mixed-effects quasi-likelihood framework, which exploits the dichotomous nature of the trait and achieves computational efficiency through estimating equations. We show in simulation studies that CARAT consistently outperforms existing methods and maintains high power in a wide range of population structure settings and trait models. Furthermore, CARAT is based on a retrospective approach, which is robust to misspecification of the phenotype model. We apply our approach to a genome-wide analysis of Crohn disease, in which we replicate association with 17 previously identified regions. Moreover, our analysis on 5p13.1, an extensively reported region of association, shows evidence for the presence of multiple independent association signals in the region. This example shows how CARAT can leverage known disease risk factors to shed light on the genetic architecture of complex traits.
In flowering plants, gene expression in the haploid male gametophyte (pollen) is essential for sperm delivery and double fertilization. Pollen also undergoes dynamic epigenetic regulation of expression from transposable elements (TEs), but how this process interacts with gene expression is not clearly understood. To explore relationships among these processes, we quantified transcript levels in four male reproductive stages of maize (tassel primordia, microspores, mature pollen, and sperm cells) via RNA-seq. We found that, in contrast with vegetative cell-limited TE expression in Arabidopsis pollen, TE transcripts in maize accumulate as early as the microspore stage and are also present in sperm cells. Intriguingly, coordinate expression was observed between highly expressed protein-coding genes and their neighboring TEs, specifically in mature pollen and sperm cells. To investigate a potential relationship between elevated gene transcript level and pollen function, we measured the fitness cost (male-specific transmission defect) of GFP-tagged coding sequence insertion mutations in over 50 genes identified as highly expressed in the pollen vegetative cell, sperm cell, or seedling (as a sporophytic control). Insertions in seedling genes or sperm cell genes (with one exception) exhibited no difference from the expected 1:1 transmission ratio. In contrast, insertions in over 20% of vegetative cell genes were associated with significant reductions in fitness, showing a positive correlation of transcript level with non-Mendelian segregation when mutant. Insertions in maize gamete expressed2 (Zm gex2), the sole sperm cell gene with measured contributions to fitness, also triggered seed defects when crossed as a male, indicating a conserved role in double fertilization, given the similar phenotype previously demonstrated for the Arabidopsis ortholog GEX2.
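The comparison against an expected 1:1 transmission ratio described above can be illustrated with a chi-square goodness-of-fit test. This is a minimal sketch under stated assumptions: the abstract does not say which statistical test the authors used, and the function name `transmission_test` and the counts in the usage lines are hypothetical.

```python
import math


def transmission_test(mutant, wild_type):
    """Chi-square goodness-of-fit test against a 1:1 ratio (1 degree of freedom).

    Returns (observed mutant transmission rate, chi-square statistic, p-value).
    Assumed test for illustration; not confirmed as the authors' method.
    """
    total = mutant + wild_type
    expected = total / 2
    chi2 = ((mutant - expected) ** 2 + (wild_type - expected) ** 2) / expected
    # For 1 degree of freedom, the chi-square survival function equals
    # erfc(sqrt(chi2 / 2)), so no external statistics library is needed.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return mutant / total, chi2, p_value


# Hypothetical progeny counts from a cross with the mutant as the male parent.
rate, chi2, p = transmission_test(40, 60)
```

A transmission rate well below 0.5 with a small p-value would indicate a male-specific transmission defect of the kind reported for the vegetative cell genes, whereas counts close to 1:1 give a p-value near 1.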