We report an integrated pipeline for efficient serum glycoprotein biomarker candidate discovery and qualification that may be used to facilitate cancer diagnosis and management. The discovery phase used semi-automated lectin magnetic bead array (LeMBA)-coupled tandem mass spectrometry with a dedicated data-housing and analysis pipeline, GlycoSelector (http://glycoselector.di.uq.edu.au). The qualification phase used lectin magnetic bead array-multiple reaction monitoring-mass spectrometry incorporating an interactive web-interface, Shiny mixOmics (http://mixomics-projects.di.uq.edu.au/Shiny), for univariate and multivariate statistical analysis. Relative quantitation was performed by referencing to a spiked-in glycoprotein, chicken ovalbumin. We applied this workflow to identify diagnostic biomarkers for esophageal adenocarcinoma (EAC), a life-threatening malignancy with poor prognosis in the advanced setting. EAC develops from the metaplastic condition Barrett's esophagus (BE). Currently, diagnosis and monitoring of at-risk patients is through endoscopy and biopsy, which are expensive and require hospital admission. Hence, there is a clinical need for a noninvasive diagnostic biomarker of EAC. In total, 89 patient samples from healthy controls and patients with BE or EAC were screened in the discovery and qualification stages. Of the 246 glycoforms measured in the qualification stage, 40 glycoforms (as measured by lectin affinity) qualified as candidate serum markers. The top candidate for distinguishing healthy from BE patients' group was Narcissus pseudonarcissus lectin (
RaftProt (http://lipid-raft-database.di.uq.edu.au/) is a database of mammalian lipid raft-associated proteins as reported in high-throughput mass spectrometry studies. Lipid rafts are specialized membrane microdomains enriched in cholesterol and sphingolipids that are thought to act as dynamic signalling and sorting platforms. Given their fundamental roles in cellular regulation, there is a plethora of information on the size, composition and regulation of these membrane microdomains, including a large number of proteomics studies. To facilitate the mining and analysis of published lipid raft proteomics studies, we have developed a searchable database, RaftProt. In addition to browsing the studies, performing basic queries by protein and gene names, and searching experiments by cell, tissue and organism, we have implemented several advanced features to facilitate data mining. To address the issue of potential bias due to the biochemical preparation procedures used, we have captured the lipid raft preparation methods and implemented an advanced search option for methodology and sample treatment conditions, such as cholesterol depletion. Furthermore, we have identified a list of high-confidence proteins and enabled searching only from this list of likely bona fide lipid raft proteins. Given the apparent biological importance of lipid rafts and their associated proteins, this database would constitute a key resource for the scientific community.
Cellular membranes feature dynamic submicrometer-scale lateral domains termed lipid rafts, membrane rafts or glycosphingolipid-enriched microdomains (GEM). Numerous proteomics studies have been conducted on the lipid raft proteome; however, interpretation of individual studies is limited by potential undefined contaminant proteins. To enable integrated analyses, we previously developed RaftProt (http://lipid-raft-database.di.uq.edu.au/), a searchable database of mammalian lipid raft-associated proteins. Although it is a highly used resource, further developments in annotation and utilities were required. Here, we present RaftProt V2 (http://raftprot.org), an improved update of RaftProt. Besides the addition of new datasets and re-mapping of all entries to both UniProt and UniRef IDs, we have implemented a stringent annotation based on experimental evidence level to assist in the identification of possible contaminant proteins. RaftProt V2 allows for simultaneous search of multiple proteins/experiments at the cell/tissue type and UniRef/Gene level, where correlations, interactions or overlaps can be investigated. The web-interface has been completely re-designed to enable interactive data and subset selection, correlation analysis and network visualization. Overall, RaftProt aims to advance our understanding of lipid raft function through integrative analysis of datasets collected from diverse tissues and conditions. Database URL: http://raftprot.org.
The utility of high-throughput quantitative proteomics to identify differentially abundant proteins en masse relies on suitable and accessible statistical methodology, which remains mostly an unmet need. We present a free web-based tool, called Quantitative Proteomics p-value Calculator (QPPC), designed for accessibility and usability by proteomics scientists and biologists. Being an online tool, it requires no software installation. Furthermore, QPPC accepts generic peptide ratio data generated by any mass spectrometer and database search engine. Importantly, QPPC utilizes the permutation test that we recently found to be superior to other methods for analysis of peptide ratios because it does not assume normal distributions.1 QPPC assists the user in selecting significantly altered proteins based on numerical fold change, or standard deviation from the mean or median, together with the permutation p-value. Output is in the form of comma-separated values files, along with graphical visualization using volcano plots and histograms. We evaluate the optimal parameters for use of QPPC, including the permutation level and the effect of outlier and contaminant peptides on p-value variability. The optimal parameters defined are deployed as default for the web-tool at http://qppc.di.uq.edu.au/.
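To make the idea of a permutation p-value on peptide ratios concrete, the following is a minimal sketch and not QPPC's actual implementation. It assumes that peptide ratios are supplied as log-ratios, that the null distribution is built by repeatedly drawing the same number of peptides at random from the pool of all peptides in the experiment, and that the test statistic is the absolute deviation of the mean from the pool mean. The function name `permutation_p_value` and its parameters are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(protein_log_ratios, all_log_ratios,
                        n_permutations=1000, seed=0):
    """Two-sided permutation p-value for one protein's mean peptide log-ratio.

    Assumption: the null is "this protein's peptides are a random draw from
    all peptides", so no normal distribution is assumed.
    """
    rng = random.Random(seed)          # fixed seed for reproducibility
    n = len(protein_log_ratios)
    centre = mean(all_log_ratios)
    observed = abs(mean(protein_log_ratios) - centre)

    hits = 0
    for _ in range(n_permutations):
        # Draw n peptides without replacement from the whole experiment.
        sample = rng.sample(all_log_ratios, n)
        if abs(mean(sample) - centre) >= observed:
            hits += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    background = [-0.1, 0.0, 0.1] * 50       # unchanged peptides
    changed = [2.0, 2.1, 1.9]                # peptides of one altered protein
    print(permutation_p_value(changed, background + changed))
```

The smallest p-value this sketch can report is 1 / (n_permutations + 1), which is one reason the permutation level affects p-value resolution and variability.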
In the past decades, a number of software programs have been developed to infer phylogenetic relationships between populations. However, most of these programs typically use alignments of sequences from genes to build a phylogeny. Recently, many standalone or web applications have been developed to handle large-scale whole genome data, but they are either computationally intensive, dependent on third-party software, or require significant time and resources of a web server. In the post-genomic era, researchers are able to obtain bioinformatically processed, high-quality, publication-ready whole genome data for many individuals in a population from next generation sequencing companies due to the reduction in the cost of sequencing and analysis. Such genotype data is typically presented in the Variant Call Format (VCF), and there is no simple software available that directly uses this data format to construct the phylogeny of populations in a short time. To address this limitation, we have developed user-friendly software, VCF2PopTree, that uses genome-wide SNPs to construct and display phylogenetic trees in seconds to minutes. For example, it reads a VCF file containing 4 million SNPs and draws a tree in less than 30 seconds. VCF2PopTree accepts genotype data from a local machine, constructs a tree using the UPGMA and Neighbour-Joining algorithms and displays it in a web browser. It also produces a pairwise-diversity matrix in MEGA and PHYLIP file formats as well as trees in the Newick format, which can be used directly by other popular phylogenetic software programs. The software, including the source code, a test VCF file and documentation, is available at: https://github.com/sansubs/vcf2pop.
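The path from genotypes to a pairwise distance matrix to a UPGMA tree in Newick format can be illustrated with a short sketch. This is not VCF2PopTree's code: it assumes genotypes have already been parsed from the VCF into alternate-allele counts (0, 1 or 2) per SNP, and it uses a simple mean allele-count difference as the distance, which may differ from the diversity measure the software computes. The names `pairwise_distances` and `upgma` are hypothetical.

```python
from itertools import combinations


def pairwise_distances(genotypes):
    """genotypes: dict of sample name -> list of alt-allele counts per SNP.

    Assumed distance: mean absolute allele-count difference, scaled to 0..1.
    """
    dist = {}
    for a, b in combinations(genotypes, 2):
        ga, gb = genotypes[a], genotypes[b]
        diff = sum(abs(x - y) for x, y in zip(ga, gb))
        dist[frozenset((a, b))] = diff / (2 * len(ga))
    return dist


def upgma(dist, names):
    """Build a UPGMA tree and return it as a Newick string."""
    # cluster label -> (newick text, number of leaves, node height)
    clusters = {name: (name, 1, 0.0) for name in names}
    d = dict(dist)
    while len(clusters) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        nwk_a, size_a, h_a = clusters.pop(a)
        nwk_b, size_b, h_b = clusters.pop(b)
        height = d[frozenset((a, b))] / 2
        merged = f"({a}+{b})"  # internal label for the new cluster
        # Distance to every other cluster is the size-weighted average.
        for c in clusters:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        newick = f"({nwk_a}:{height - h_a:.4f},{nwk_b}:{height - h_b:.4f})"
        clusters[merged] = (newick, size_a + size_b, height)
    newick, _, _ = next(iter(clusters.values()))
    return newick + ";"


if __name__ == "__main__":
    genotypes = {
        "A": [0, 0, 0, 0],
        "B": [0, 0, 0, 2],
        "C": [2, 2, 2, 2],
    }
    print(upgma(pairwise_distances(genotypes), list(genotypes)))
```

UPGMA assumes a constant rate of change along all lineages, so every leaf ends at the same distance from the root; Neighbour-Joining drops that assumption.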
This data article describes serum glycoprotein biomarker discovery and qualification datasets generated using lectin magnetic bead array (LeMBA) – mass spectrometry techniques, as reported in “Serum glycoprotein biomarker discovery and qualification pipeline reveals novel diagnostic biomarker candidates for esophageal adenocarcinoma” [1]. Serum samples collected from healthy, metaplastic Barrett's esophagus (BE) and esophageal adenocarcinoma (EAC) individuals were profiled for glycoprotein subsets via differential lectin binding. The biomarker discovery proteomics dataset, consisting of 20 individual lectin pull-downs for 29 serum samples with a spiked-in internal standard, chicken ovalbumin protein, has been deposited in the PRIDE partner repository of the ProteomeXchange Consortium with the data set identifier PRIDE: PXD002442. Annotated MS/MS spectra for the peptide identifications can be viewed using MS-Viewer (http://prospector2.ucsf.edu/prospector/cgi-bin/msform.cgi?form=msviewer) with the search key “jn7qafftux”. The qualification dataset contains 6-lectin pulldown-coupled multiple reaction monitoring-mass spectrometry (MRM-MS) data for 41 protein candidates from 60 serum samples. This dataset is available as supplemental files with the original publication [1].
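The spiked-in chicken ovalbumin mentioned above serves as an internal standard for relative quantitation. A minimal sketch of that idea follows, assuming each sample is represented as a mapping from protein name to measured intensity and that normalisation is a simple ratio to the ovalbumin signal in the same sample; the function name, the `"ovalbumin"` key and the example values are hypothetical, and the published workflow may apply additional steps.

```python
def normalise_to_internal_standard(intensities, standard="ovalbumin"):
    """Express each protein's intensity relative to the spiked-in standard.

    intensities: dict of protein name -> measured intensity for ONE sample.
    Returns a dict of relative abundances, excluding the standard itself.
    """
    reference = intensities[standard]
    if reference <= 0:
        raise ValueError("internal standard was not detected in this sample")
    return {
        name: value / reference
        for name, value in intensities.items()
        if name != standard
    }


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the deposited dataset.
    sample = {"ovalbumin": 200.0, "protein_a": 100.0, "protein_b": 50.0}
    print(normalise_to_internal_standard(sample))
```

Because the same amount of standard is spiked into every sample, dividing by its signal helps make intensities comparable across pull-downs and runs.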