We report the results obtained in 2012-2013 by the Russian Consortium for the Chromosome-centric Human Proteome Project (C-HPP). The main scope of this work was the transcriptome profiling of genes on human chromosome 18 (Chr 18), as well as their encoded proteome, from three types of biomaterials: liver tissue, the hepatocellular carcinoma-derived cell line HepG2, and blood plasma. The transcriptome profiling for liver tissue was independently performed using two RNaseq platforms (SOLiD and Illumina) and also by droplet digital PCR (ddPCR) and quantitative RT-PCR. The proteome profiling of Chr 18 was accomplished by quantitatively measuring protein copy numbers in the three types of biomaterial (the lowest protein concentration measured was 10⁻¹³ M) using selected reaction monitoring (SRM). In total, protein copy numbers were estimated for 228 master proteins, including quantitative data on 164 proteins in plasma, 171 in the HepG2 cell line, and 186 in liver tissue. Most proteins were present in plasma at 10⁸ copies/μL, while the median abundance was 10⁴ and 10⁵ protein copies per cell in HepG2 cells and liver tissue, respectively. In summary, for liver tissue and HepG2 cells a "transcriptoproteome" was produced that reflects the relationship between transcript and protein copy numbers of the genes on Chr 18. The quantitative data acquired by RNaseq, PCR, and SRM were uploaded into the "Update_2013" data set of our knowledgebase (www.kb18.ru) and investigated for linear correlations.
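The abstract above mixes molar concentrations with copy numbers per microliter. As a rough illustration of how the two units relate, the sketch below converts between them using Avogadro's number. The function names are hypothetical and are not taken from the original work; only the unit arithmetic is shown.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def molar_to_copies_per_microliter(molar):
    """Convert a molar concentration (mol/L) to molecule copies per microliter."""
    copies_per_liter = molar * AVOGADRO
    return copies_per_liter / 1e6  # 1 L = 1e6 microliters


def copies_per_microliter_to_molar(copies_per_ul):
    """Convert molecule copies per microliter back to mol/L."""
    return copies_per_ul * 1e6 / AVOGADRO


if __name__ == "__main__":
    # Lowest concentration quoted in the abstract: 1e-13 M
    print(molar_to_copies_per_microliter(1e-13))
    # Typical plasma abundance quoted in the abstract: 1e8 copies per microliter
    print(copies_per_microliter_to_molar(1e8))
```

Under this conversion, 10⁻¹³ M corresponds to roughly 6 × 10⁴ copies/μL, and 10⁸ copies/μL corresponds to roughly 1.7 × 10⁻¹⁰ M.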
Long-read direct RNA sequencing developed by Oxford Nanopore Technologies (ONT) is quickly gaining popularity for transcriptome studies, while its fast turnaround time and low cost make it an attractive instrument for clinical applications. There is growing interest in using transcriptome data to unravel the activated biological processes responsible for disease progression and response to therapies. This trend is of particular interest for precision medicine, which aims at single-patient analysis. Here we evaluated whether gene abundances measured by MinION direct RNA sequencing are suited to produce robust estimates of pathway activation for single-sample scoring methods. We performed multiple RNA-seq analyses of a single sample that originated from the HepG2 cell line, namely five ONT replicates and three Illumina NovaSeq replicates. Two pathway scoring methods were employed: ssGSEA and singscore. We estimated the ONT performance in terms of detected protein-coding genes and the average pairwise correlation between pathway activation scores, using an exhaustive computational scheme over all combinations of replicates. In brief, we found that at least two ONT replicates are required to obtain reproducible pathway scores for both algorithms. We hope that our findings may be of interest to researchers planning their ONT direct RNA-seq experiments.
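The abstract does not spell out the exhaustive scheme, so the following is only a minimal sketch of one plausible reading: for a given subset size k, pool every k-replicate combination, score each pathway, and average the pairwise Pearson correlation between the resulting score vectors. The scorer used here is a toy mean-normalized-rank stand-in, not ssGSEA or singscore, and all function names and the pooling-by-summation step are assumptions.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def toy_pathway_score(expression, gene_set):
    """Stand-in scorer (NOT ssGSEA or singscore): mean normalized rank of the gene set."""
    ranked = sorted(expression, key=expression.get)  # lowest expression first
    rank = {gene: i + 1 for i, gene in enumerate(ranked)}
    n = len(ranked)
    return mean(rank[g] / n for g in gene_set)


def pool(replicates):
    """Assumed pooling step: sum counts per gene across the chosen replicates."""
    genes = replicates[0].keys()
    return {g: sum(r[g] for r in replicates) for g in genes}


def mean_pairwise_correlation(replicates, pathways, k):
    """Average correlation of pathway-score vectors over all k-replicate subsets.

    Requires k < len(replicates) so that at least two subsets exist.
    """
    score_vectors = []
    for subset in combinations(replicates, k):
        pooled = pool(subset)
        score_vectors.append([toy_pathway_score(pooled, gs) for gs in pathways.values()])
    return mean(pearson(a, b) for a, b in combinations(score_vectors, 2))
```

Calling `mean_pairwise_correlation(replicates, pathways, k)` for k = 1, 2, ... would then show how reproducibility changes with the number of pooled replicates; a real analysis would swap in an actual ssGSEA or singscore implementation for the toy scorer.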
Using reverse transcription in conjunction with quantitative real-time PCR or droplet digital PCR, the transcriptome profiling of human chromosome 18 has been carried out in liver hepatocytes and hepatoblastoma cells (HepG2 cell line) in terms of the absolute number of each transcript per cell. The transcript abundance varies within the range of 0.006 to 9635 and 0.011 to 4819 copies per cell for the HepG2 cell line and hepatocytes, respectively. The expression profiles for genes of chromosome 18 in hepatocytes and HepG2 cells were found to correlate significantly, with a Spearman's correlation coefficient of 0.81. The frequency distribution of transcript abundance was bimodal for HepG2 cells and unimodal for liver hepatocytes. Bioinformatic analysis of the differential gene expression has revealed that genes of chromosome 18 that are overexpressed in HepG2 cells compared to hepatocytes are associated with cell division and cell adhesion processes. It is assumed that the enhanced expression of those genes in HepG2 cells is related to the proliferation activity of cultured cells. The differences in transcriptome profiles have to be taken into account when modelling liver hepatocytes with cultured HepG2 cells.
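For readers unfamiliar with the statistic quoted above, Spearman's coefficient is the Pearson correlation of the ranks of two profiles. The sketch below is a minimal pure-Python version with average ranks for ties; the function names are hypothetical, and the example values are invented for illustration, not taken from the study.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


if __name__ == "__main__":
    # Invented copies-per-cell values for five hypothetical genes
    hepg2 = [0.02, 1.3, 15.0, 240.0, 3.1]
    hepatocytes = [0.05, 0.9, 30.0, 110.0, 7.5]
    print(spearman(hepg2, hepatocytes))
```

Because the statistic depends only on ranks, it is insensitive to the wide dynamic range of copy numbers, which spans several orders of magnitude in data of this kind.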
Missing (MP) and functionally uncharacterized proteins (uPE1) comprise less than 5% of the total number of proteins encoded by human Chr18 genes. In the half year since the January 2020 version of NextProt, the number of entries in the MP+uPE1 datasets changed, mainly due to the achievements of antibody-based proteomics. Assuming that the proteome is closely related to the transcriptome scaffold, quantitative PCR, Illumina HiSeq, and Oxford Nanopore Technology were applied to characterize the liver samples of three male donors in comparison with the HepG2 cell line. The data mining of the Expression Atlas (EMBL-EBI) and the profiling of biopsy samples by orthogonal methods of transcriptome analysis have shown that in HepG2 cells and the liver, the genes encoding functionally uncharacterized proteins (uPE1) are expressed at levels as low as those of the genes encoding missing proteins (less than 1 copy per cell), with the exceptions of HSBP1L1, TMEM241, C18orf21, and KLHL14. The initial expectation that uPE1 genes might be expressed at higher levels than MP genes was compromised by severe discrepancies in our semi-quantitative gene expression data and in public databanks. This discrepancy forced us to revisit the transcriptome of Chr18, the target of the Russian C-HPP Consortium. A tanglegram of highly expressed genes and further correlation analysis showed a strong dependence on the mRNA extraction method and the analytical platform. Targeted gene expression analysis by quantitative PCR (qPCR) and high-throughput transcriptome profiling (Illumina HiSeq and ONT MinION) for the same set of samples from normal liver tissue and HepG2 cells revealed the detectable expression of 250+ (92%) protein-coding genes of Chr18 (by at least one method). The expression of slightly more than 50% of protein-coding genes was detected simultaneously by all three methods.
Correlation analysis of the gene expression profiles showed that the grouping of the datasets depended almost equally on both the type of biological material and the experimental method, particularly cDNA/mRNA isolation and library preparation.