Proteogenomics 101: a primer on database search strategies

Raj, Anurag; Aggarwal, Suruchi; Kumar, Dhirendra; Yadav, Amit Kumar; Dash, Debasis

doi:10.1007/s42485-023-00118-4

J Proteins Proteom

2023

Cited by 2 publications

References 129 publications

Community Resource: Large-Scale Proteogenomics to Refine Wheat Genome Annotations

Vincent, Appels. IJMS, 2024.

Triticum aestivum is an important crop whose reference genome (International Wheat Genome Sequencing Consortium (IWGSC) RefSeq v2.1) offers a valuable resource for understanding wheat genetic structure, improving agronomic traits, and developing new cultivars. A key aspect of gene model annotation is protein-level evidence of gene expression obtained from proteomics studies, followed up by proteogenomics to physically map proteins to the genome. In this research, we have retrieved the largest recent wheat proteomics datasets publicly available and applied the Basic Local Alignment Search Tool (tBLASTn) algorithm to map the 861,759 identified unique peptides against IWGSC RefSeq v2.1. Of the 92,719 hits, 83,015 unique peptides aligned along 33,612 High Confidence (HC) genes, thus validating 31.4% of all wheat HC gene models. Furthermore, 6685 unique peptides were mapped against 3702 Low Confidence (LC) gene models, and we argue that these gene models should be considered for HC status. The remaining 2934 orphan peptides can be used for novel gene discovery, as exemplified here on chromosome 4D. We demonstrated that tBLASTn could not map peptides exhibiting a mid-sequence frameshift. We supply all our proteogenomics results, Galaxy workflow and Python code, as well as Browser Extensible Data (BED) files, as a resource for the wheat community via the Apollo JBrowse and GitHub repositories. Our workflow could be applied to other proteomics datasets to expand this resource with proteins and peptides from biotically and abiotically stressed samples. This would help tease out wheat gene expression under various environmental conditions, both spatially and temporally.
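The peptide-to-genome mapping idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' workflow and it does not run tBLASTn: it translates a nucleotide sequence in all six reading frames and looks for an exact substring match, which is a deliberate simplification of the scored alignment tBLASTn performs. The sequences, the peptide, and the function names are invented for illustration only. The second sequence carries one extra base in the middle of the coding region, so the two halves of the peptide fall into different frames and no single frame contains the whole peptide.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase ACGT string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict[str, str]:
    """Map frame names (+1..+3, -1..-3) to their translated sequences."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def map_peptide(peptide: str, dna: str) -> list[str]:
    """Return the frames whose translation contains the peptide exactly."""
    return [
        name
        for name, protein in six_frame_translations(dna).items()
        if peptide in protein
    ]


# Hypothetical toy locus: the coding sequence for MAKELWR sits in frame +3.
intact_dna = "CC" + "ATGGCTAAAGAGCTGTGGCGT" + "TAAGG"
# Same locus with one extra 'A' inserted after the GAG codon (a frameshift).
shifted_dna = "CC" + "ATGGCTAAAGAG" + "A" + "CTGTGGCGT" + "TAAGG"

print(map_peptide("AKELWR", intact_dna))   # ['+3']
print(map_peptide("AKELWR", shifted_dna))  # []
print(map_peptide("AKE", shifted_dna))     # ['+3']
print(map_peptide("LWR", shifted_dna))     # ['+1']
```

In the shifted sequence each half of the peptide still maps, but to a different frame, which is why a frame-by-frame search reports no hit for the full peptide.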

Phenotyping Tumor Heterogeneity through Proteogenomics: Study Models and Challenges

Piana, Iavarone, De Paolis et al. IJMS, 2024.

Tumor heterogeneity refers to the diversity observed among tumor cells: both between different tumors (inter-tumor heterogeneity) and within a single tumor (intra-tumor heterogeneity). These cells can display distinct morphological and phenotypic characteristics, including variations in cellular morphology, metastatic potential, and treatment response among patients. Therefore, a comprehensive understanding of such heterogeneity is necessary for deciphering tumor-specific mechanisms that may be diagnostically and therapeutically valuable. Innovative and multidisciplinary approaches are needed to understand this complex feature. In this context, proteogenomics has been emerging as a significant resource for integrating omics fields such as genomics and proteomics. By combining data obtained from both Next-Generation Sequencing (NGS) technologies and mass spectrometry (MS) analyses, proteogenomics aims to provide a comprehensive view of tumor heterogeneity. This approach reveals molecular alterations and phenotypic features related to tumor subtypes, potentially identifying therapeutic biomarkers. Many achievements have been made; however, despite continuous advances in proteogenomics-based methodologies, several challenges remain, in particular the limitations in sensitivity and specificity and the lack of optimal study models. This review highlights the impact of proteogenomics on characterizing tumor phenotypes, focusing on the critical challenges and current limitations of its use in different clinical and preclinical models for tumor phenotypic characterization.
