In embryonic stem (ES) cells, developmental regulators have a characteristic bivalent chromatin signature marked by simultaneous presence of both activation (H3K4me3) and repression (H3K27me3) signals and are thought to be in a ‘poised’ state for subsequent activation or silencing during differentiation. We collected eleven pairs (H3K4me3 and H3K27me3) of ChIP sequencing datasets in human ES cells and eight pairs in murine ES cells, and predicted high-confidence (HC) bivalent promoters. Over 85% of H3K27me3 marked promoters were bivalent in human and mouse ES cells. We found that (i) HC bivalent promoters were enriched for developmental factors and were highly likely to be differentially expressed upon transcription factor perturbation; (ii) murine HC bivalent promoters were occupied by both polycomb repressive component classes (PRC1 and PRC2) and grouped into four distinct clusters with different biological functions; (iii) HC bivalent and active promoters were CpG rich while H3K27me3-only promoters lacked CpG islands. Binding enrichment of distinct sets of regulators distinguished bivalent from active promoters. Moreover, a ‘TCCCC’ sequence motif was specifically enriched in bivalent promoters. Finally, this analysis will serve as a resource for future studies to further understand transcriptional regulation during embryonic development.
BackgroundGene expression heterogeneity contributes to development as well as disease progression. Due to technological limitations, most studies to date have focused on differences in mean expression across experimental conditions, rather than differences in gene expression variance. The advent of single cell RNA sequencing has now made it feasible to study gene expression heterogeneity and to characterise genes based on their coefficient of variation.MethodsWe collected single cell gene expression profiles for 32 human and 39 mouse embryonic stem cells and studied correlation between diverse characteristics such as network connectivity and coefficient of variation (CV) across single cells. We further systematically characterised properties unique to High CV genes.ResultsHighly expressed genes tended to have a low CV and were enriched for cell cycle genes. In contrast, High CV genes were co-expressed with other High CV genes, were enriched for bivalent (H3K4me3 and H3K27me3) marked promoters and showed enrichment for response to DNA damage and DNA repair.ConclusionsTaken together, this analysis demonstrates the divergent characteristics of genes based on their CV. High CV genes tend to form co-expression clusters and they explain bivalency at least in part.
Single cell transcriptomics is becoming a common technique to unravel new biological phenomena whose functional significance can only be understood in the light of differences in gene expression between single cells. The technology is still in its early days and therefore suffers from many technical challenges. This review discusses the continuous effort to identify and systematically characterise various sources of technical variability in single cell expression data and the need to further develop experimental and computational tools and resources to help deal with it.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.