Background
Altered microbiome composition and aberrant promoter hypermethylation of tumor suppressor genes (TSGs) are two important hallmarks of colorectal cancer (CRC). Here we performed concurrent 16S rRNA gene sequencing and methyl-CpG binding domain-based capture sequencing in 33 tissue biopsies (5 normal colonic mucosa tissues, 4 pairs of adenoma and adenoma-adjacent tissues, and 10 pairs of CRC and CRC-adjacent tissues) to identify significant associations between TSG promoter hypermethylation and CRC-associated bacteria, followed by functional validation of the methylation-associated bacteria.
Results
Fusobacterium nucleatum and Hungatella hathewayi were identified as the top two methylation-regulating bacteria. Targeted analysis on bona fide TSGs revealed that H. hathewayi and Streptococcus spp. significantly correlated with CDX2 and MLH1 promoter hypermethylation, respectively. Mechanistic validation with cell-line and animal models revealed that F. nucleatum and H. hathewayi upregulated DNA methyltransferase. H. hathewayi inoculation also promoted colonic epithelial cell proliferation in germ-free and conventional mice.
Conclusion
Our integrative analysis revealed previously unknown epigenetic regulation of TSGs in host cells through induction of DNA methyltransferase by F. nucleatum and H. hathewayi, and established the latter as CRC-promoting bacteria.
Background
With the increasing amount of high-throughput genomic sequencing data, there is a growing demand for a robust and flexible tool to perform interaction analysis. The identification of SNP-SNP, SNP-CpG, and higher order interactions helps explain the genetic etiology of human diseases, yet genome-wide analysis for interactions has been very challenging, due to the computational burden and a lack of statistical power in most datasets.
Results
The wtest R package performs association testing for main effects, pairwise and high order interactions in genome-wide association study data, and cis-regulation of SNP and CpG sites in genome-wide and epigenome-wide data. The software includes a number of post-test diagnostic and analysis functions and offers an integrated toolset for genetic epistasis testing.
Conclusions
The wtest is an efficient and powerful statistical tool for integrated genetic epistasis testing. The package is available in CRAN: https://CRAN.R-project.org/package=wtest.
An increasing number of studies are focused on the epigenetic regulation of DNA to affect gene expression without modifications to the DNA sequence. Methylation plays an important role in shaping disease traits; however, previous studies were mainly experiment-based, resulting in few reports that measured gene–methylation interaction effects via statistical means. In this study, we applied the data-set adaptive W-test to measure gene–methylation interactions. Performance was evaluated by the ability to detect a given set of causal markers in the data set obtained from the GAW20. Results from simulation data analyses showed that the W-test was able to detect most markers. The method was also applied to chromosome 11 of the experimental data set and identified clusters of genes with neuronal and retinal functions, including MPPED2I, GUCY2E, NAV2, and ZBTB16. Genes from the TRIM family were also identified; these genes are potentially related to the regulation of triglyceride levels. Our results suggest that the W-test could be an efficient and effective method to detect gene–methylation interactions. Furthermore, the identified genes suggest an interesting relationship between lipid levels and the etiology of neurological disorders.
Genetic data consists of a wide range of marker types, including common,
low frequency, and rare variants. Multiple genetic markers and their
interactions play central roles in the heritability of complex disease. In this
study, we propose an algorithm that uses a stratified variable selection design
by genetic architectures and interaction effects, achieved by a data-set
adaptive W-test. The polygenic sets in all strata were integrated to form a
classification rule. The algorithm was applied to the Critical Assessment of
Genome Interpretation 4 bipolar challenge sequencing data. The prediction
accuracy was 60% using genetic markers on an independent test set. We
found that epistasis among common genetic variants contributed most
substantially to prediction precision. However, the sample size was not large
enough to draw conclusions about the lack of predictability of low frequency
variants and their epistasis.