Label-free quantification (LFQ) is one of the most efficient approaches for quantifying proteome differences between multiple states of a biological system. LFQ aims to reproducibly identify and quantify peptides across multiple liquid-chromatography-coupled tandem mass spectrometry (LC-MS/MS) experiments. In the popular data-dependent acquisition (DDA) approach named Top-N DDA, the appearance of peptide-like signals in a "survey" mass spectrum triggers tandem mass spectrometry (MS/MS) events targeting the N most abundant precursor ions. Previous studies have shown that, due to the limited speed of a mass spectrometer, the majority of peptide ions detected in MS1 are not targeted in MS/MS, especially when a nonfractionated complex sample is analyzed (1, 2). This low sampling efficiency (<50%), combined with the stochastic nature of precursor selection and a limited efficiency of MS/MS identification (<70%) (3), frequently causes the absence of MS/MS identification for an individual peptide in some LC-MS/MS experiments ("runs") within a larger dataset, even when replicate measurements are made (4). This deficiency is known as the missing value problem in LFQ. The problem significantly limits the size of the DDA-acquired proteomics dataset across which reliable quantification can be made for each protein (5, 6).
One of the causes of the missing value problem is the traditional focus on identifying a peptide as opposed to quantifying it. For historical reasons, peptide sequence identification has been considered the focal point and the most important step in the whole proteomics procedure, while quantification came almost as an afterthought (7, 8). This dominant proteomics paradigm can be characterized as the identification-centered approach, also known as the spectrum-centric approach (9). Only gradually has the missing value problem been recognized as one of the biggest drawbacks of the DDA approach (4, 5).
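The Top-N selection step described above can be illustrated with a minimal sketch. It assumes a survey spectrum reduced to a list of (m/z, intensity) peaks and ignores charge-state filtering and any other instrument logic; the `Peak` type, the `select_top_n` helper, and the peak values are hypothetical and only show why ions outside the N most intense are never fragmented in that cycle.

```python
from typing import List, NamedTuple


class Peak(NamedTuple):
    mz: float
    intensity: float


def select_top_n(survey: List[Peak], n: int) -> List[Peak]:
    """Return the n most intense peaks of a survey (MS1) spectrum."""
    return sorted(survey, key=lambda p: p.intensity, reverse=True)[:n]


# Hypothetical survey spectrum with four peptide-like signals.
survey = [
    Peak(400.2, 1.0e6),
    Peak(512.3, 5.0e5),
    Peak(633.8, 2.0e5),
    Peak(745.4, 8.0e4),
]

targeted = select_top_n(survey, n=2)

# Fraction of detected precursors that actually receive an MS/MS event.
sampling_efficiency = len(targeted) / len(survey)
```

In this toy cycle only two of the four precursors are fragmented. The other two remain visible in MS1 but obtain no MS/MS identification, which is how a peptide can be present yet unidentified in a given run.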
To address the reproducibility issue in MS/MS identification, several alternative data acquisition strategies have been suggested, including targeted (10) and semi-targeted (11, 12) approaches. However, none of the improved DDA strategies has come anywhere close to data-independent acquisition (DIA) in solving the missing value problem (13, 14). The latter approach, however, typically provides somewhat lower depth and breadth of proteome coverage than the DDA methods.
In our opinion, the DDA-associated missing value problem is caused by the sequential execution of two independent processes: peptide identification by MS/MS and its quantification by MS1. At first glance, performing MS1-based quantification simultaneously with MS/MS identification should provide an obvious solution to the missing value problem. Since MS1 spectra contain many more peptide ions than are selected for MS/MS in DDA (or identified in DIA), the peptide's mass information is practically always present when an iden-
Most implementations of mass spectrometry-based proteomics involve enzymatic digestion of proteins, expanding the analysis to multiple proteolytic peptides for each protein. Currently, there is no consensus on how to summarize peptide abundances into protein concentrations, and such efforts are complicated by the fact that error control is normally applied to the identification process and does not directly control errors in linking peptide abundance measures to protein concentration. Peptides that result from suboptimal digestion or are partially modified are not representative of the protein concentration. Without a mechanism to remove such unrepresentative peptides, their abundances adversely impact the estimation of their protein's concentration. Here, we present a relative quantification approach, Diffacto, that applies factor analysis to extract the covariation of peptide abundances. The method enables a weighted geometric average summarization and automatic elimination of incoherent peptides. We demonstrate, based on a set of controlled label-free experiments using standard mixtures of proteins, that the covariation structure extracted by the factor analysis accurately reflects protein concentrations. In the 1% peptide-spectrum match-level FDR data set, as many as 11% of the peptides have abundance differences incoherent with the other peptides attributed to the same protein. If not controlled, such contradicting peptide abundances have a severe impact on protein quantification. When adding the quantities of each protein's three most abundant peptides, we note that as many as 14% of the proteins are estimated as having a negative correlation with their actual concentration differences between samples. Diffacto reduced the proportion of such obviously incorrectly quantified proteins to 1.6%. Furthermore, by analyzing clinical data sets from two breast cancer studies, our method revealed persistent proteomic signatures linked to three subtypes of breast cancer.
We conclude that Diffacto can facilitate the interpretation and enhance the utility of most types of proteomics data.
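The weighted geometric average summarization mentioned above can be sketched as follows. This is a minimal illustration and not Diffacto's implementation: the per-peptide weights are supplied by hand here, whereas in the method they would be derived from the factor analysis, and the `min_weight` cutoff used to drop incoherent peptides is a hypothetical parameter.

```python
import math
from typing import Sequence


def weighted_geometric_mean(
    abundances: Sequence[float],
    weights: Sequence[float],
    min_weight: float = 0.5,  # hypothetical cutoff for "incoherent" peptides
) -> float:
    """Summarize peptide abundances for one protein in one sample.

    Peptides whose weight falls below min_weight are excluded; the rest
    contribute to exp(sum(w * ln(a)) / sum(w)).
    """
    kept = [(a, w) for a, w in zip(abundances, weights) if w >= min_weight and a > 0]
    if not kept:
        raise ValueError("no peptide passed the weight cutoff")
    total_weight = sum(w for _, w in kept)
    log_mean = sum(w * math.log(a) for a, w in kept) / total_weight
    return math.exp(log_mean)


# Hypothetical protein with two coherent peptides and one outlier
# (for example, a partially modified peptide) carrying a low weight.
abundances = [100.0, 100.0, 1.0]
weights = [1.0, 1.0, 0.1]

summary = weighted_geometric_mean(abundances, weights)

# For comparison: an unweighted geometric mean that keeps every peptide.
unweighted = weighted_geometric_mean(abundances, [1.0, 1.0, 1.0])
```

With the outlier removed, the summary stays near 100, while the unweighted average is pulled down to roughly 21.5 by the single unrepresentative peptide.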
Based on the conventional data-dependent acquisition strategy of shotgun proteomics, we present a new workflow, DeMix, which significantly increases the efficiency of peptide identification for in-depth shotgun analysis of complex proteomes. Capitalizing on the high resolution and mass accuracy of Orbitrap-based tandem mass spectrometry, we developed a simple deconvolution method of “cloning” chimeric tandem spectra for cofragmented peptides. In addition to a database search, a simple rescoring scheme utilizes mass accuracy and converts the unwanted cofragmenting events into a surprising advantage of multiplexing. With the combination of cloning and rescoring, we obtained on average nine peptide-spectrum matches per second on a Q-Exactive workbench, whereas the actual MS/MS acquisition rate was close to seven spectra per second. This efficiency boost to 1.24 identified peptides per MS/MS spectrum enabled analysis of over 5000 human proteins in single-dimensional LC-MS/MS shotgun experiments with only a two-hour gradient. These findings suggest a change in the dominant “one MS/MS spectrum - one peptide” paradigm for data acquisition and analysis in shotgun data-dependent proteomics. DeMix also demonstrated higher robustness than conventional approaches in terms of lower variation among the results of consecutive LC-MS/MS runs.
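The idea of “cloning” a chimeric spectrum can be sketched roughly as below. This is an assumption-laden illustration rather than the DeMix code: it supposes that precursor features (m/z and charge) have already been detected in MS1, and it duplicates one MS/MS spectrum for every feature that falls inside the isolation window so that each copy can be searched with its own precursor mass. The `Spectrum` type, the `clone_for_cofragmented` helper, the window width, and all values are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Spectrum:
    scan_id: int
    precursor_mz: float
    charge: int
    peaks: Tuple[Tuple[float, float], ...]  # (fragment m/z, intensity)


def clone_for_cofragmented(
    spectrum: Spectrum,
    candidates: List[Tuple[float, int]],  # MS1 features as (m/z, charge)
    isolation_center: float,
    half_width: float,
) -> List[Spectrum]:
    """Return one copy of the spectrum per precursor inside the isolation window."""
    clones = []
    for mz, charge in candidates:
        if abs(mz - isolation_center) <= half_width:
            # Same fragment peaks, different precursor assignment.
            clones.append(replace(spectrum, precursor_mz=mz, charge=charge))
    return clones


# Hypothetical MS/MS scan isolated around m/z 500.0 with a +/- 2.0 window.
scan = Spectrum(scan_id=1, precursor_mz=500.27, charge=2,
                peaks=((175.12, 3.0e4), (262.15, 1.2e4)))
features = [(500.27, 2), (501.10, 3), (520.00, 2)]

clones = clone_for_cofragmented(scan, features, isolation_center=500.0, half_width=2.0)
```

Here two of the three features lie inside the window, so the single acquired spectrum yields two search candidates. This is how more than one peptide per MS/MS spectrum can in principle be identified.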
Deconvolution of targets and action mechanisms of anticancer compounds is fundamental in drug development. Here, we report on ProTargetMiner, a publicly available, expandable proteome signature library of anticancer molecules in cancer cell lines. Based on 287 A549 adenocarcinoma proteomes affected by 56 compounds, the main dataset contains 7,328 proteins and 1,307,859 refined protein-drug pairs. These proteomic signatures cluster by compound targets and action mechanisms. The targets and mechanistic proteins are deconvoluted by partial least squares modeling, provided through the website http://protargetminer.genexplain.com. For 9 molecules representing the most diverse mechanisms and the common cancer cell lines MCF-7, RKO and A549, deep proteome datasets are obtained. Combining data from the three cell lines highlights common drug targets and cell-specific differences. The database can be easily extended and merged with new compound signatures. ProTargetMiner serves as a chemical proteomics resource for the cancer research community, and can become a valuable tool in drug discovery.
The human blood proteome is frequently assessed by protein abundance profiling using a combination of liquid chromatography and tandem mass spectrometry (LC-MS/MS). In traditional sequence database searches, many good-quality MS/MS spectra remain unassigned. Here we uncover the hidden part of the blood proteome via the novel SpotLight approach. This method combines de novo MS/MS sequencing of enriched antibodies and co-extracted proteins with subsequent label-free quantification of new and known peptides in both enriched and unfractionated samples. In a pilot study on differentiating early stages of Alzheimer’s disease (AD) from Dementia with Lewy Bodies (DLB), the hidden proteome contributed almost as much information to patient stratification at the peptide level as the apparent proteome. Intriguingly, many of the new peptide sequences are attributable to antibody variable regions, and are potentially indicative of disease etiology. When the hidden and apparent proteomes are combined, the accuracy of differentiating AD (n = 97) and DLB (n = 47) increases from ≈85% to ≈95%. The low added burden of SpotLight proteome analysis makes it attractive for use in clinical settings.
Background: Malondialdehyde (MDA) is generated during lipid peroxidation as in oxidized low‐density lipoprotein, but antibodies against oxidized low‐density lipoprotein show variable results in clinical studies. We therefore studied the risk of cardiovascular disease (CVD) associated with IgM antibodies against MDA conjugated with human albumin (anti‐MDA).
Methods and Results: In a 5‐ to 7‐year follow‐up of 60‐year‐old men and women from Stockholm County previously screened for cardiovascular risk factors (2039 men, 2193 women), 209 incident CVD cases (defined as new events of coronary heart disease, fatal and nonfatal myocardial infarction, ischemic stroke, and hospitalization for angina pectoris) and 620 age‐ and sex‐matched controls were tested for IgM anti‐MDA by ELISA. Antibody peptide/protein characterization was done using a proteomics de novo sequencing approach. After adjustment for smoking, body‐mass index, type 2 diabetes mellitus, hyperlipidemia, and hypertension, an increased CVD risk was observed in the low IgM anti‐MDA percentiles (below 10th and 25th) (odds ratio and 95% CI: 2.0; 1.19–3.36 and 1.67; 1.16–2.41, respectively). Anti‐MDA above the 66th percentile was associated with a decreased CVD risk (odds ratio 0.68; CI: 0.48–0.98). After stratification by sex, associations were only present among men. IgM anti‐MDA levels were lower among cases (median [interquartile range]: 141.0 [112.7–164.3] versus 147.4 [123.5–169.6]; P=0.0177), even more so among men (130.6 [107.7–155.3] versus 143.0 [120.1–165.2]; P=0.001). The IgM anti‐MDA variable region profiles are distinctly different and also more homologous in their content (correlates strongly with fewer peptides) than control antibodies (not binding MDA).
Conclusions: IgM anti‐MDA is a protection marker for CVD. This finding could have diagnostic and therapeutic implications.