Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly ‘housekeeping’, whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.
This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development.
The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.
Using deep sequencing (deepCAGE), the FANTOM4 study measured the genome-wide dynamics of transcription-start-site usage in the human monocytic cell line THP-1 throughout a time course of growth arrest and differentiation. Modeling the expression dynamics in terms of predicted cis-regulatory sites, we identified the key transcription regulators, their time-dependent activities and target genes. Systematic siRNA knockdown of 52 transcription factors confirmed the roles of individual factors in the regulatory network. Our results indicate that cellular states are constrained by complex networks involving both positive and negative regulatory interactions among substantial numbers of transcription factors and that no single transcription factor is both necessary and sufficient to drive the differentiation process.
We have identified a novel protein-disulfide isomerase and named it endothelial protein-disulfide isomerase (EndoPDI) because of its high expression in endothelial cells. Isolation of the full-length cDNA showed EndoPDI to be a 48 kDa protein that has three APWCGHC thioredoxin motifs in contrast to the two present in archetypal PDI. Ribonuclease protection and Western analysis have shown that hypoxia induces EndoPDI mRNA and protein expression. In situ hybridization analysis showed that EndoPDI expression is rare in normal tissues, except for keratinocytes of the hair bulb and syncytiotrophoblasts of the placenta, but was present in the endothelium of tumors and in other hypoxic lesions such as atherosclerotic plaques. We have compared the function of EndoPDI to that of PDI in endothelial cells using specific siRNA. PDI was shown to have a protective effect on endothelial cells under both normoxia and hypoxia. In contrast, EndoPDI has a protective effect only in endothelial cells exposed to hypoxia. The loss of EndoPDI expression under hypoxia caused a significant decrease in the secretion of adrenomedullin, endothelin-1, and CD105, molecules that protect endothelial cells from hypoxia-initiated apoptosis. The identification of an endothelial PDI further extends this increasing multigene family and EndoPDI, unlike archetypal PDI, may be a molecule with which to target tumor endothelium.
Protein-disulfide isomerase (PDI) is a ubiquitously expressed multifunctional protein found in the endoplasmic reticulum (ER). It constitutes around 0.8% of total cellular protein and can reach near millimolar concentrations in the ER lumen of some tissues. PDI plays a role in protein folding because of its ability to catalyze the formation of native disulfide bonds and disulfide bond rearrangement (1). Proteins targeted for secretion by the cell are inserted into and translocated across the ER membrane and enter the ER lumen in an unfolded state.
PDI, together with a variety of other folding factors and molecular chaperones resident in the ER, correctly folds the proteins ready for secretion (2). The accumulation of misfolded proteins in the ER triggers the Unfolded Protein Response, which results in increased transcription of chaperones and folding catalysts. Proteins that fail to fold correctly are relocated to the cytosol for proteasomal degradation.
PDI is a modular protein consisting of a, b, b′, a′, and c domains (3). The a and a′ domains show sequence and structural homology to thioredoxin (Trx) and both contain the active site WCGHCK motif, constituting two independent catalytic sites for thiol-disulfide bond exchange reactions (4-7). A rate-limiting step in the folding of many newly synthesized proteins is the formation of disulfide bridges (1) and the presence of WCGHCK in PDI is essential for this process, as confirmed by the loss of PDI activity following mutation of the cysteine residues within these motifs (5, 8). The b and b′ domains also have the thioredoxin structural fold but lack the active site motif. Thus, PDI conta...
To examine the process by which duplicated genes diverge in function, we studied how the gene expression profiles of orthologous gene sets in human and mouse are affected by the presence of additional recent species-specific paralogs. Gene expression profiles were compared across 16 homologous tissues in human and mouse using microarray data from the Gene Expression Atlas for 1575 sets of orthologs including 250 with species-specific paralogs. We find that orthologs that have undergone recent duplication are less likely to have strongly correlated expression profiles than those that remain in a one-to-one relationship between human and mouse. There is a general trend for paralogous genes to become more specialized in their expression patterns, with decreased breadth and increased specificity of expression as gene family size increases. Despite this trend, detailed examination of some particular gene families where species-specific duplications have occurred indicated several examples of apparent neofunctionalization of duplicated genes, but only one case of subfunctionalization. Often, the expression of both copies of a duplicated gene appears to have changed relative to the ancestral state. Our results suggest that gene expression profiles are surprisingly labile and that expression in a particular tissue may be gained or lost repeatedly during the evolution of even small gene families. We conclude that gene duplication is a major driving force behind the emergence of divergent gene expression patterns.
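The comparison of ortholog expression profiles across matched tissues can be illustrated with a small sketch. The abstract does not say which correlation measure or expression threshold was used, so Pearson correlation and the `threshold` parameter below are assumptions, and the function names and numbers are hypothetical, chosen only for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two expression profiles of equal length.

    Assumption: the abstract does not name its correlation measure;
    Pearson is used here purely as an illustration.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        return float("nan")  # a flat profile has no defined correlation
    sxy = sum(a * b for a, b in zip(dx, dy))
    return sxy / sqrt(sxx * syy)

def breadth(profile, threshold):
    """Number of tissues in which expression reaches the threshold.

    Assumption: 'breadth of expression' is taken to mean a simple count
    of tissues at or above a hypothetical cutoff.
    """
    return sum(1 for value in profile if value >= threshold)

# Invented toy profiles over four matched tissues (not real data).
human = [10.0, 200.0, 15.0, 180.0]
mouse = [12.0, 190.0, 20.0, 150.0]
r = pearson(human, mouse)
tissues_expressed = breadth(human, threshold=100.0)
```

Under these assumptions, a one-to-one ortholog pair with conserved expression would give `r` close to 1, while a paralog that has specialized would tend to show a smaller `breadth` value.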
Genes that belong to the same functional pathways are often packaged into operons in prokaryotes. However, aside from examples in nematode genomes, this form of transcriptional regulation appears to be absent in eukaryotes. Nevertheless, a number of recent studies have shown that gene order in eukaryotic genomes is not completely random, and that genes with similar expression patterns tend to be clustered together. What remains unclear is whether co-expressed genes have been gathered together by natural selection to facilitate their regulation, or if the genes are co-expressed simply by virtue of their being close together in the genome. Here, we show that gene expression clusters tend to contain fewer chromosomal breakpoints between human and mouse than expected by chance, which indicates that they are being held together by natural selection. This conclusion applies to clusters defined on the basis of broad (housekeeping) expression, or on the basis of correlated transcription profiles across tissues. Contrary to previous reports, we find that genes with high expression are not clustered to a greater extent than expected by chance and are not conserved during evolution.
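The claim that clusters contain fewer breakpoints "than expected by chance" implies a comparison against a null distribution. The abstract does not describe the null model, so the following is only a sketch under stated assumptions: genes are indexed along one chromosome, a breakpoint at index `i` lies between gene `i` and gene `i + 1`, and chance is modeled by placing windows of the same length at random positions. All names and parameters here are hypothetical.

```python
import random

def breakpoints_in_window(breakpoints, start, length):
    """Count breakpoints between adjacent genes inside [start, start + length)."""
    return sum(1 for i in range(start, start + length - 1) if i in breakpoints)

def empirical_p(breakpoints, n_genes, clusters, n_perm=1000, seed=0):
    """One-sided empirical p-value for 'as few or fewer breakpoints than observed'.

    clusters is a list of (start, length) windows of co-expressed genes.
    Assumption: the null places each window uniformly at random on the
    same chromosome; the original study may have used a different null.
    """
    rng = random.Random(seed)
    observed = sum(breakpoints_in_window(breakpoints, s, l) for s, l in clusters)
    as_low = 0
    for _ in range(n_perm):
        total = 0
        for _, length in clusters:
            start = rng.randrange(0, n_genes - length + 1)
            total += breakpoints_in_window(breakpoints, start, length)
        if total <= observed:
            as_low += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (as_low + 1) / (n_perm + 1)
```

A small p-value from such a test would be consistent with clusters being depleted of breakpoints, although a realistic analysis would also need to handle multiple chromosomes and variable gene density.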