Extracting inherently valuable knowledge from omics big data remains a daunting problem in bioinformatics and computational biology. Deep learning, an emerging branch of machine learning, has exhibited unprecedented performance in quite a few applications in academia and industry. We highlight the differences and similarities among widely used deep learning models by discussing their basic structures and reviewing their diverse applications and disadvantages. We anticipate that this work can serve as a meaningful perspective for the further development of deep learning theory, algorithms and applications in bioinformatics and computational biology.
Despite numerous studies of the role of DNA methylation, limited work has focused on systems-level integration of the cell type-specific interplay between DNA methylation and gene transcription. Through a genome-wide analysis of DNA methylation across 19 cell types, with T-47D as the reference, we identified 106,252 cell type-specific differentially methylated CpGs, grouped into 7,537 differentially methylated regions (DMRs; 46.6% hyper- and 53.4% hypomethylated). We found that 44% of promoter regions and 75% of CpG islands showed T-47D cell type-specific methylation. Pyrosequencing experiments validated the cell type-specific methylation across three benchmark cell lines. Interestingly, these DMRs overlapped with 1,145 known tumor suppressor genes. We then developed a Bayesian Gaussian regression model to measure the relationship among DNA methylation, genomic segment distribution, differential gene expression and tumor suppressor gene status. The model showed that 3′UTR methylation has much less impact on transcriptional activity than methylation in other regions. Integrating DNA methylation with binding information for 82 transcription factors across the 19 cell types suggested diverse interplay patterns between the two regulators. Our integrative analysis reveals cell type-specific and genomic region-dependent regulatory patterns and provides a perspective for integrating hundreds of diverse omics-seq datasets.
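The abstract above does not specify the form of its Bayesian Gaussian regression model. As a rough illustration of the general idea only, the following is a minimal sketch of a textbook conjugate Bayesian linear regression, assuming a zero-mean Gaussian prior on the weights and a known Gaussian noise variance. The function name `bayes_linear_posterior`, its parameters, and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def bayes_linear_posterior(X, y, prior_var=10.0, noise_var=1.0):
    """Posterior over weights w for y ~ N(Xw, noise_var * I), w ~ N(0, prior_var * I).

    Returns the posterior mean vector and covariance matrix.
    This is a generic textbook model, not the authors' method.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    # Posterior precision = prior precision + data precision.
    precision = np.eye(n_features) / prior_var + X.T @ X / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ (X.T @ y) / noise_var
    return mean, cov


if __name__ == "__main__":
    # Synthetic stand-in data: columns could represent, for example,
    # methylation levels in different genomic segments (an assumption).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)
    mean, cov = bayes_linear_posterior(X, y, noise_var=0.01)
    print("posterior mean:", mean)
    print("posterior std:", np.sqrt(np.diag(cov)))
```

In a model of this kind, a posterior mean close to zero for one predictor, with a narrow posterior spread, would be read as that predictor having little influence on the response.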
Deregulation of the transforming growth factor-β (TGFβ) signaling pathway in epithelial ovarian cancer has been reported, but the precise mechanism underlying disrupted TGFβ signaling in the disease remains unclear. We performed chromatin immunoprecipitation followed by sequencing (ChIP-seq) to screen genome-wide for TGFβ-induced SMAD4 binding in epithelial ovarian cancer. Following TGFβ stimulation of the A2780 epithelial ovarian cancer cell line, we identified 2,362 SMAD4 binding loci and 318 differentially expressed SMAD4 target genes. Comprehensive examination of SMAD4-bound loci revealed four distinct binding patterns: 1) Basal; 2) Shift; 3) Stimulated Only; 4) Unstimulated Only. TGFβ-stimulated SMAD4-bound loci were primarily classified as either Stimulated Only (74%) or Shift (25%), indicating that TGFβ stimulation alters SMAD4 binding patterns in epithelial ovarian cancer cells. Furthermore, based on gene regulatory network analysis, we determined that the TGFβ-induced, SMAD4-dependent regulatory network was strikingly different in ovarian cancer compared to normal cells. Importantly, the TGFβ/SMAD4 target genes identified in the A2780 epithelial ovarian cancer cell line were predictive of patient survival, based on in silico mining of publicly available patient databases. In conclusion, our data highlight the utility of next-generation sequencing technology for identifying genome-wide SMAD4 target genes in epithelial ovarian cancer and link aberrant TGFβ/SMAD signaling to ovarian tumorigenesis. Furthermore, the identified SMAD4 binding loci, combined with gene expression profiling and in silico data mining of patient cohorts, may provide a powerful approach for determining potential gene signatures for biological and future translational research in ovarian and other cancers.
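The abstract names four binding patterns but does not define them. The sketch below shows one plausible way such a classification could be expressed, under assumptions that are ours and not the authors': a locus is "Basal" when a peak is present in both conditions at roughly the same summit position, "Shift" when both peaks exist but the summit moves by more than a tolerance, and "Stimulated Only" or "Unstimulated Only" when a peak exists in just one condition. The `Locus` type, `classify_locus` function, and the 100 bp tolerance are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Locus:
    """Peak summit positions (bp) at one locus; None means no peak was called."""
    unstim_summit: Optional[int]  # without TGF-beta stimulation
    stim_summit: Optional[int]    # with TGF-beta stimulation


def classify_locus(locus: Locus, shift_tolerance: int = 100) -> str:
    """Assign one of four pattern labels using the assumed definitions above."""
    if locus.unstim_summit is None and locus.stim_summit is None:
        raise ValueError("locus has no peak in either condition")
    if locus.unstim_summit is None:
        return "Stimulated Only"
    if locus.stim_summit is None:
        return "Unstimulated Only"
    # Peaks in both conditions: compare summit positions.
    if abs(locus.stim_summit - locus.unstim_summit) <= shift_tolerance:
        return "Basal"
    return "Shift"


if __name__ == "__main__":
    # Made-up example loci, not data from the study.
    examples = [
        Locus(unstim_summit=1000, stim_summit=1020),
        Locus(unstim_summit=1000, stim_summit=1400),
        Locus(unstim_summit=None, stim_summit=5000),
        Locus(unstim_summit=7000, stim_summit=None),
    ]
    for locus in examples:
        print(locus, "->", classify_locus(locus))
```

A real analysis would more likely compare peak intervals and signal intensities from a peak caller than single summit coordinates; the summit-distance rule here is only meant to make the four categories concrete.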
Recent genome-wide profiling has revealed highly complex regulatory networks among ERα and its targets. We integrated estrogen (E2)-stimulated time-series ERα ChIP-seq and gene expression data to identify ERα-centered transcription factor (TF) hubs and their target genes, and inferred the time-variant hierarchical network structures using a Bayesian multivariate modeling approach. Using the network's recurrent motif patterns, we identified three embedded regulatory modules in the ERα core transcriptional network. GO analyses revealed distinct biological functions associated with each of the three embedded modules. Survival analysis showed that the genes in each module correlated significantly with survival in breast cancer patient cohorts. In summary, our Bayesian statistical modeling and modularity analysis not only reveal the dynamic properties of the ERα-centered regulatory network and its associated distinct biological functions, but also provide a reliable and effective genomic analytical approach for analyzing the dynamic regulatory network of any given TF.