The human transcriptome across tissues and individuals

Melé, Marta; Ferreira, Pedro G.; Reverter, Ferrán; DeLuca, David S.; Monlong, Jean; Sammeth, Michael; Young, Taylor; Goldmann, Jakob M.; Pervouchine, Dmitri D.; Sullivan, Timothy J.; Johnson, Rory; Segrè, Ayellet V.; Djebali, Sarah; Niarchou, Anastasia; Wright, Fred A.; Lappalainen, Tuuli; Calvo, Miquel; Getz, Gad; Dermitzakis, Emmanouil T.; Ardlie, Kristin; Guigó, Roderic

doi:10.1126/science.aaa0355

Cited by 1,159 publications

(1,263 citation statements)

References 64 publications

(32 reference statements)

Supporting

Mentioning

1,166

Contrasting

Unclassified

“…While Melé et al (2015) were able to reasonably differentiate the first eight tissue types using hierarchical clustering, nerve tissue was not easily distinguishable from other tissue types. To investigate this disparity, we performed GO enrichment of the top 100 RF decision split genes for nerve tissue (Table S1) and identified 31 genes involved in nervous system development at a BC P-value of 9.2 × 10⁻⁵.…”

Section: Results (mentioning)

confidence: 99%

Detecting Sources of Transcriptional Heterogeneity in Large-Scale RNA-Seq Data Sets

Searle, Gittelman, Manor et al. 2016

Genetics

Gene expression levels are dynamic molecular phenotypes that respond to biological, environmental, and technical perturbations. Here we use a novel replicate-classifier approach for discovering transcriptional signatures and apply it to the Genotype-Tissue Expression data set. We identified many factors contributing to expression heterogeneity, such as collection center and ischemia time, and our approach of scoring replicate classifiers allows us to statistically stratify these factors by effect strength. Strikingly, from transcriptional expression in blood alone we detect markers that help predict heart disease and stroke in some patients. Our results illustrate the challenges and opportunities of interpreting patterns of transcriptional variation in large-scale data sets.

KEYWORDS GTEx Consortium; gene expression normalization; Random Forest classification; transcriptional heterogeneity

Unlike previous large-scale tissue- (FANTOM Consortium et al. 2015) or cell type- (ENCODE Project Consortium 2012) specific expression data sets, the Genotype-Tissue Expression (GTEx) project (GTEx Consortium 2015) is unique in the breadth of tissue types sampled from the same individuals. The GTEx Consortium has previously demonstrated that tissue-specific gene expression signatures are preserved in postmortem samples using hierarchical clustering (Melé et al. 2015), which groups samples by gene expression using a data-driven approach to identify hidden structure in the data. While hierarchical clustering is effective at identifying the greatest global source of variation, it does not capture more subtle sources of variation. For example, in the context of the GTEx project, hierarchical clustering largely captures gene expression variation due to tissue type, but less effectively captures the influence of confounding factors like age or sex.

Using the GTEx pilot data freeze version 4, we attempted to recapitulate the results of hierarchical clustering using supervised Random Forest (RF) classification (Breiman 2001). Unlike hierarchical clustering, RF uses sample type annotations in a training data set to create decision trees, where the nodes correspond to genes whose expression levels distinguish between tissue types. Although RF classification typically considers a single classifier per classification task, we randomly generated replicate classifiers to statistically assess how well two groups can be distinguished. This approach is markedly distinct from hierarchical clustering or principal component analysis and enables statistical uncertainty to be rigorously quantified. These analyses reveal strong transcriptional signatures that contribute to patterns of expression heterogeneity in the GTEx data. More broadly, our results highlight that a deeper understanding of the determinants of transcriptional variation enables insights into the biological factors that govern variation in gene expression among tissues and individuals.

Materials and Methods

Normalization and data curating

We first removed samples of non-Europea...

Section: Results (mentioning)

confidence: 99%

“…Previous analyses of the GTEx data showed that tissue type could be accurately predicted from gene expression data for many, but not all, cell types (Melé et al 2015). We first attempted to demonstrate the strength of tissue-type gene expression signatures in the GTEx data using a novel classification algorithm.…”

Section: Results (mentioning)

confidence: 99%

“…Action of these hormones extends at the epigenetic level to DNA methylation and chromatin conformation 16,17. Analysis of gene expression in different non-reproductive tissues indicates the existence of a gender-specific influence on transcription 18,19, which is paralleled at the level of chromatin organization by specific differences between the sexes 20. As discussed here below, sex hormone signaling pathways are likely to affect cancer susceptibility through multiple mechanisms, impacting on intrinsic self-renewal mechanisms, tumor microenvironment, immune system and metabolism.…”

Section: Introduction (Epidemiology) (mentioning)

confidence: 99%

Sexual dimorphism in cancer

et al. 2016

The incidence of many cancer types is significantly higher in the male than female populations, with associated differences in survival. Occupational and/or behavioral factors are well known underlying determinants. However, cellular/molecular differences between the two sexes are also likely to be important. We are focusing here on the complex interplay that sexual hormones and sex chromosomes can have in intrinsic control of cancer-initiating cell populations, tumor microenvironment and systemic determinants of cancer development like the immune system and metabolism. A better appreciation of these differences between the two sexes could be of substantial value for cancer prevention as well as treatment.

“…The GTEx database contains RNA-Seq data from multiple tissues derived from multiple donors (Mele et al 2015). This allows the expression and splicing patterns of all annotated human genes to be tracked across different tissue types.…”

Section: Tissue-specific Expression Patterns of lincRNAs (mentioning)

confidence: 99%

LINC00507 Is Specifically Expressed in the Primate Cortex and Has Age-Dependent Expression Patterns

et al. 2016

Abstract: Over the past decade, there has been an increase in the appreciation of the role of non-coding RNA in the development of organism phenotype. It is possible to divide the non-coding elements of the transcriptome into three categories: short non-coding RNAs, circular RNAs and long non-coding RNAs. Long non-coding RNAs are those transcripts that are greater than 200 nts in length and lack any significant open reading frames that produce proteins greater than 100 amino acids. Long intervening non-coding RNAs (lincRNAs) are a subclass of long non-coding RNAs. In contrast to protein-coding RNAs, lincRNAs are expressed in a more tissue- and species-specific manner. In particular, many lincRNAs are only conserved amongst higher primates. This, coupled with the propensity of many lincRNAs to be expressed in the brain, suggests that they are in fact one of the major drivers of organism complexity. We analysed 39 lincRNAs that are expressed in the frontal cortex and identified LINC00507 as being expressed in a cortex-specific manner in non-human primates and humans. The expression patterns of LINC00507 appear to be age-dependent, suggesting it may be involved in brain development of higher primates. Moreover, the analysis of the potential of LINC00507 to bind ribosomes revealed that this previously identified non-coding transcript may harbour a micropeptide.
