Advances in biological and medical technologies have been providing us explosive volumes of biological and physiological data, such as medical images, electroencephalography, genomic and protein sequences. Learning from these data facilitates the understanding of human health and disease. Developed from artificial neural networks, deep learning-based algorithms show great promise in extracting features and learning patterns from complex data. The aim of this paper is to provide an overview of deep learning techniques and some of the state-of-the-art applications in the biomedical field. We first introduce the development of artificial neural network and deep learning. We then describe two main components of deep learning, i.e., deep learning architectures and model optimization. Subsequently, some examples are demonstrated for deep learning applications, including medical image classification, genomic sequence analysis, as well as protein structure classification and prediction. Finally, we offer our perspectives for the future directions in the field of deep learning.
Chromatin insulators are DNA elements that regulate the level of gene expression either by preventing gene silencing through the maintenance of heterochromatin boundaries or by preventing gene activation by blocking interactions between enhancers and promoters. CCCTC-binding factor (CTCF), a ubiquitously expressed 11-zinc-finger DNA-binding protein, is the only protein implicated in the establishment of insulators in vertebrates. While CTCF has been implicated in diverse regulatory functions, CTCF has only been studied in a limited number of cell types across human genome. Thus, it is not clear whether the identified cell type-specific differences in CTCF-binding sites are functionally significant. Here, we identify and characterize cell type-specific and ubiquitous CTCF-binding sites in the human genome across 38 cell types designated by the Encyclopedia of DNA Elements (ENCODE) consortium. These cell type-specific and ubiquitous CTCF-binding sites show uniquely versatile transcriptional functions and characteristic chromatin features. In addition, we confirm the insulator barrier function of CTCF-binding and explore the novel function of CTCF in DNA replication. These results represent a critical step toward the comprehensive and systematic understanding of CTCF-dependent insulators and their versatile roles in the human genome.
Maternal obesity can impair embryo development and offspring health, yet the mechanisms responsible remain poorly understood. In a high-fat diet (HFD)-based female mouse model of obesity, we identified a marked reduction of Stella (also known as DPPA3 or PGC7) protein in oocytes. Starting with this clue, we found that the establishment of pronuclear epigenetic asymmetry in zygotes from obese mice was severely disrupted, inducing the accumulation of maternal 5-hydroxymethylcytosine modifications and DNA lesions. Furthermore, methylome-wide sequencing analysis detected global hypomethylation across the zygote genome in HFD-fed mice, with a specific enrichment in transposon elements and unique regions. Notably, overexpression of Stella in the oocytes of HFD-fed mice not only restored the epigenetic remodeling in zygotes but also partly ameliorated the maternal-obesity-associated developmental defects in early embryos and fetal growth. Thus, Stella insufficiency in oocytes may represent a critical mechanism that mediates the phenotypic effects of maternal obesity in embryos and offspring.
To understand the molecular mechanisms that underlie global transcriptional regulation, it is essential to first identify all the transcriptional regulatory elements in the human genome. The advent of next-generation sequencing has provided a powerful platform for genome-wide analysis of different species and specific cell types; when combined with traditional techniques to identify regions of open chromatin [DNaseI hypersensitivity (DHS)] or specific binding locations of transcription factors [chromatin immunoprecipitation (ChIP)], and expression data from microarrays, we become uniquely poised to uncover the mysteries of the genome and its regulation. To this end, we have performed global meta-analysis of the relationship among data from DNaseI-seq, ChIP-seq and expression arrays, and found that specific correlations exist among regulatory elements and gene expression across different cell types. These correlations revealed four distinct modes of chromatin domain structure reflecting different functions: repressive, active, primed and bivalent. Furthermore, CCCTC-binding factor (CTCF) binding sites were identified based on these integrative data. Our findings uncovered a complex regulatory process involving by DNaseI HS sites and histone modifications, and suggest that these dynamic elements may be responsible for maintaining chromatin structure and integrity of the human genome. Our integrative approach provides an example by which data from diverse technology platforms may be integrated to provide more meaningful insights into global transcriptional regulation.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.