2014
DOI: 10.1021/pr500812t

Flexible and Accessible Workflows for Improved Proteogenomic Analysis Using the Galaxy Framework

Abstract: Proteogenomics combines large-scale genomic and transcriptomic data with mass-spectrometry-based proteomic data to discover novel protein sequence variants and improve genome annotation. In contrast with conventional proteomic applications, proteogenomic analysis requires a number of additional data processing steps. Ideally, these required steps would be integrated and automated via a single software platform offering accessibility for wet-bench researchers as well as flexibility for user-specific customizati…

Cited by 87 publications
(83 citation statements)
References 59 publications
“…The veracity of putative variant sequences matched to MS/MS spectra must be confirmed, which can be accomplished by querying the variant peptide sequences against NCBI’s non-redundant (nr) protein database using the BLASTP tool, which is implemented in Galaxy 4 . Those peptides which do not have a 100% alignment and sequence match to known sequences within the database qualify as verified variant sequences, which are then passed on for further analysis 3, 4 .…”
Section: Use Cases (mentioning)
confidence: 99%
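
The verification rule in the statement above (a peptide counts as a verified variant only if it has no 100% alignment and sequence match in the database) can be sketched as a small filter over BLASTP results. This is a minimal illustration, not the Galaxy tool's implementation: it assumes BLAST tabular output in the default 12-column layout (`qseqid sseqid pident length mismatch gapopen ...`), and the function name `novel_peptides` and the peptide identifiers are hypothetical.

```python
import csv
import io


def novel_peptides(peptides, blast_tsv):
    """Return ids of peptides that lack a full-length, 100%-identity hit.

    peptides  -- dict mapping query id -> peptide sequence (hypothetical input)
    blast_tsv -- file-like object with BLAST tabular rows (assumed 12-column layout)
    """
    known = set()
    for row in csv.reader(blast_tsv, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment headers
        qseqid = row[0]
        pident = float(row[2])   # percent identity of the alignment
        length = int(row[3])     # alignment length
        gapopen = int(row[5])    # number of gap openings
        seq = peptides.get(qseqid)
        if seq is None:
            continue
        # A known peptide aligns over its whole length with no gaps or mismatches.
        if pident == 100.0 and gapopen == 0 and length == len(seq):
            known.add(qseqid)
    # Peptides with no hits at all are also reported as candidates.
    return [pid for pid in peptides if pid not in known]


example_peptides = {"pep1": "ACDEFGHIK", "pep2": "LMNPQRSTV"}
example_hits = (
    "pep1\tsp|X1\t100.000\t9\t0\t0\t1\t9\t40\t48\t1e-3\t30.0\n"
    "pep2\tsp|X2\t88.889\t9\t1\t0\t1\t9\t10\t18\t0.5\t22.0\n"
)
print(novel_peptides(example_peptides, io.StringIO(example_hits)))
```

Here `pep1` matches a database entry exactly and is dropped, while `pep2` carries one mismatch and is kept as a candidate variant for further analysis.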
“…One example is emerging “multi-omic” analyses, which integrate software from different ‘omic domains and are well suited to the strengths of Galaxy 2 . For example, proteogenomics integrates tools for RNA-Seq assembly and analysis, software for matching tandem mass spectrometry (MS/MS) data to peptide and protein sequences, and other customized tools to characterize novel, variant protein sequences expressed within a sample 3, 4 . To enable compatibility between the software tools composing a proteogenomics workflow, tabular files often must be manipulated into appropriate formats recognized by specific tools.…”
Section: Introduction (mentioning)
confidence: 99%
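
As one illustration of the tabular manipulation mentioned above, the sketch below converts a two-column tab-separated file into FASTA so that a downstream tool expecting sequence input can read it. The column layout (identifier first, sequence second) and the function name `tabular_to_fasta` are assumptions made for the example; they are not taken from the cited workflows.

```python
def tabular_to_fasta(lines, id_col=0, seq_col=1):
    """Convert tab-separated rows into FASTA text.

    lines   -- iterable of text lines (for example, an open file)
    id_col  -- index of the column holding the record identifier (assumed)
    seq_col -- index of the column holding the sequence (assumed)
    """
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        records.append(f">{fields[id_col]}\n{fields[seq_col]}")
    return ("\n".join(records) + "\n") if records else ""


rows = ["pep1\tACDEFGHIK\n", "pep2\tLMNPQR\n"]
print(tabular_to_fasta(rows), end="")
```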
“…Galaxy enables integration of disparate, multi-omic tools in a single, user-friendly environment, as required for proteogenomics. (6,9,10) The new resource described here provides workflows and training in the most critical aspects of proteogenomics: generation of customized protein sequence databases from RNA-Seq data, matching of MS/MS data to putative variant peptide sequences, and confirmation of the novelty of these identified sequences.…”
Section: Introduction (mentioning)
confidence: 99%
“…10, 20–22 The whole process can be laborious and error-prone. A number of computational methods have been proposed for generating customized database in recent years, 23–33 which can be classified into three categories: (i) DNA polymorphism/mutation (hereafter using “mutation” for simplicity) database, (ii) splice junction database, and (iii) genome/transcriptome six-frame translation (6FT) database for detecting non-canonical translation events in unknown coding DNA sequence regions.…”
Section: Introduction (mentioning)
confidence: 99%
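
The third category above, six-frame translation, can be sketched in a few lines: a nucleotide sequence is translated in three reading frames on the forward strand and three on the reverse complement. This is a simplified illustration under stated assumptions, not any cited tool's method: it uses the standard genetic code, maps codons containing ambiguous bases to `X`, and leaves stop codons in place as `*` rather than splitting the output into open reading frames. The function names are hypothetical.

```python
# Standard genetic code, indexed by codon position with base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    seq = seq.upper()
    frames = {}
    for sign, strand in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{sign}{offset + 1}"] = translate(strand[offset:])
    return frames


for label, protein in six_frame_translation("ATGGCCTAA").items():
    print(label, protein)
```

In a real database-building step, each frame would typically be split at stop codons and short fragments discarded before the sequences are searched against MS/MS data; that filtering is omitted here.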