2009
DOI: 10.1021/pr900610q

Protein Quantification in Label-Free LC-MS Experiments

Abstract: The goal of many LC-MS proteomic investigations is to quantify and compare the abundance of proteins in complex biological mixtures. However, the output of an LC-MS experiment is not a list of proteins, but a list of quantified spectral features. To make protein-level conclusions, researchers typically apply ad hoc rules, or take an average of feature abundance to obtain a single protein-level quantity for each sample. We argue that these two approaches are inadequate. We discuss two statistical models, namely…

Cited by 98 publications (112 citation statements)
References 13 publications
“…To identify proteins most significantly affected by the stimulus and to address the challenges posed by missing values, we used linear mixed-effects modeling (LiME). LiME, an improvement over ad hoc cutoffs or simple feature averaging, takes advantage of inherent replicate structure of the data and leverages information from a series of biological conditions to identify the significantly affected proteins (26,27). For PRKDC, LiME analysis revealed a systematic increase in peak area for the [s/t]Q containing phosphoPSMs, with >100-fold increase (P < 0.001) between combo and control treatments (Fig.…”
Section: Results (mentioning)
confidence: 99%
“…Benchmark Peptide-based Model: We start from the peptide-based linear regression models as proposed by Daly et al (39), Clough et al (22), and Karpievitch et al (40), of which we have independently proven their superior performance compared to summarization-based workflows (21). In general, the following model is proposed: (1) ridge regression, which leads to shrunken yet more stable log2 fold change (FC) estimates, (2) Empirical Bayes estimation of the variance, which further stabilizes variance estimators, and (3) M-estimation with Huber weights, which reduces the impact of outlying peptide intensities.…”
Section: Methods (mentioning)
confidence: 99%
“…Peptide-based linear regression models estimate protein fold changes directly from peptide intensities and outperform summarization-based methods by reducing bias and generating more correct precision estimates (21,22). However, peptide-based linear regression models suffer from overfitting due to extreme observations and the unbalanced nature of proteomics data; i.e.…”
mentioning
confidence: 99%
“…Hrydziuszko and Viant, 2011; Wang et al, 2012; Webb-Robertson et al, 2015) which presents a significant challenge for statistical analysis (see e.g. Clough et al, 2009). Analysis of such datasets can follow one of two approaches of either eliminating missing values prior to analysis or using methods that integrate missing values in the testing procedure.…”
Section: Introduction (mentioning)
confidence: 99%