PPLine: An Automated Pipeline for SNP, SAP, and Splice Variant Detection in the Context of Proteogenomics

Krasnov, George S.; Dmitriev, Alexey A.; Kudryavtseva, Anna V.; Shargunov, A. V.; Karpov, D. S.; Uroshlev, L. A.; Melnikova, N. V.; Blinov, V. M.; Poverennaya, Ekaterina V.; Archakov, Alexander I.; Lisitsa, Andrey; Ponomarenko, Elena A.

doi:10.1021/acs.jproteome.5b00490

Cited by 67 publications

(38 citation statements)

References 57 publications

“…Processing of transcriptomic data was performed using PPLine toolkit [8] including read preprocessing (trimmomatic), mapping (STAR) and counting (HTSeq-count). The further analysis was done with R programming language (R core Team).…”

Section: Methods (mentioning)

confidence: 99%

The influence of pro-longevity gene Gclc overexpression on the age-dependent changes in Drosophila transcriptome and biological functions

et al. 2016

Background: Transcriptional changes that contribute to the organism’s longevity and prevent the age-dependent decline of biological functions are not well understood. Here, we overexpressed the pro-longevity gene encoding glutamate-cysteine ligase catalytic subunit (Gclc) and analyzed age-dependent changes in the transcriptome associated with longevity, stress resistance, locomotor activity, circadian rhythmicity, and fertility. Results: Here we reproduced the life extension effect of neuronal overexpression of the Gclc gene and investigated its influence on the age-dependent dynamics of the transcriptome and of biological functions such as fecundity, spontaneous locomotor activity and circadian rhythmicity, as well as on the resistance to oxidative, proteotoxic and osmotic stresses. It was shown that Gclc overexpression reduces locomotor activity in the young and middle ages compared to control flies. Gclc overexpression slowed down the age-dependent decline of locomotor activity and circadian rhythmicity, and resistance to stress treatments. Gclc level demonstrated associations with the expression of genes involved in a variety of cellular processes including Jak-STAT, MAPK, FOXO, Notch, mTOR, TGF-beta signaling pathways, translation, protein processing in endoplasmic reticulum, proteasomal degradation, glycolysis, oxidative phosphorylation, apoptosis, regulation of circadian rhythms, differentiation of neurons, synaptic plasticity and transmission. Conclusions: Our study revealed that Gclc overexpression induces transcriptional changes associated with the lifespan extension and uncovered pathways that may be associated with the age-dependent decline of biological functions. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-3356-0) contains supplementary material, which is available to authorized users.

“…The results revealed 2172 and 149 differentially expressed splice isoforms, respectively, including RAC1, OSBPL3, MKI67, and SYK. PPLine is a Python‐based proteogenomic pipeline assisting discovery of SAPs, INDELs, and ASVs from transcriptome and exome sequence data, besides facilitating the annotation and filtration of SNPs and the prediction of proteotypic peptides…”

Section: Current Development In Enabling Technologies (mentioning)

confidence: 99%

Connecting Proteomics to Next‐Generation Sequencing: Proteogenomics and Its Current Applications in Biology

et al. 2018

Understanding the relationship between genotypes and phenotypes is essential to disentangle biological mechanisms and to unravel the molecular basis of diseases. Genes and proteins are closely linked in biological systems. However, genomics and proteomics have developed separately into two distinct disciplines whereby crosstalk among scientists from the two domains is limited and this constrains the integration of both fields into a single data modality of useful information. The emerging field of proteogenomics attempts to address this by building bridges between the two disciplines. In this review, how genomics and transcriptomics data in different formats can be utilized to assist proteogenomics application is briefly discussed. Subsequently, a much larger part of this review focuses on proteogenomics research articles that are published in the last five years that answer two important questions. First, how proteogenomics can be applied to tackle biological problems is discussed, covering genome annotation and precision medicine. Second, the latest developments in analytical technologies for data acquisition and the bioinformatics tools to interpret and visualize proteogenomics data are covered.

“…Datasets to map the canonical and spliced forms of missing proteins of Chr18 in the liver tissue (a) and in the HepG2 cell line (b). PE: protein evidence according to neXtProt; DB: information from several mass-spectrometry databases on protein detection in the biosample; Chr18 HPP transcriptomic data [19, 20, 26]; CF: level of expression of canonical form; S2–S7: levels of expression of the splice forms. Colored boxes represent the quantitative value assigned to the descriptor.…”

Section: Figure (mentioning)

confidence: 99%

The Gene-Centric Content Management System and Its Application for Cognitive Proteomics

et al. 2018

The Human Proteome Project is moving into the next phase of creating and/or reconsidering the functional annotations of proteins using the chromosome-centric paradigm. This challenge cannot be solved exclusively using automated means, but rather requires human intelligence for interpreting the combined data. To foster the integration between human cognition and post-genome array, a number of specific tools were recently developed, among them CAPER, GenomewidePDB, and The Proteome Browser (TPB). To tackle the task of protein functional annotation, the Gene-Centric Content Management System (GenoCMS) was expanded with new features. The goal was to enable bioinformaticians to develop self-made applications and to position these applets within the generalized informational canvas supported by GenoCMS. We report the results of GenoCMS-enabled integration of the concordant informational flows in the chromosome-centric framework of the human chromosome 18 project. The workflow described in the article can be scaled to other human chromosomes, and also supplemented with new tracks created by the user. The GenoCMS is an example of the project-oriented informational systems that are important for public data sharing.
