2023
DOI: 10.1021/acs.jproteome.3c00205

Evaluating Proteomics Imputation Methods with Improved Criteria

Lincoln Harris,
William E. Fondrie,
Sewoong Oh
et al.

Abstract: Quantitative measurements produced by tandem mass spectrometry proteomics experiments typically contain a large proportion of missing values. Missing values hinder reproducibility, reduce statistical power, and make it difficult to compare across samples or experiments. Although many methods exist for imputing missing values, in practice, the most commonly used methods are among the worst performing. Furthermore, previous benchmarking studies have focused on relatively simple measurements of error such as the …

Cited by 2 publications (3 citation statements)
References 62 publications

“…In the proteomic literature, one of the most commonly applied imputation procedures is a version of k-nearest neighbors (kNN). The standard kNN procedure in use involves imputing the missing values based on the mean of the k closest peptides (Harris et al, 2023). Close peptides are used instead of close samples because it is very difficult to define close samples when there is excessive missingness.…”
Section: Discussion (mentioning)
confidence: 99%
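
The peptide-based kNN procedure described in this statement can be illustrated with a minimal sketch. It is not the implementation used by Harris et al. or by the citing paper. It assumes a peptides-by-samples matrix with `NaN` marking missing values, a root-mean-square distance over the samples two peptides share, and a hypothetical function name `knn_impute_peptides`; all of these are illustrative choices.

```python
import numpy as np


def knn_impute_peptides(X, k=3):
    """Fill each missing value with the mean of the k closest peptides.

    X is a peptides x samples array with NaN for missing values.
    The distance metric and the donor rule are assumptions made for
    illustration, not a published method.
    """
    X = np.asarray(X, dtype=float)
    out = X.copy()
    n_peptides = X.shape[0]

    for i in range(n_peptides):
        missing = np.isnan(X[i])
        if not missing.any():
            continue

        # Distance from peptide i to every other peptide, computed only
        # over samples where both peptides were observed.
        dists = np.full(n_peptides, np.inf)
        for j in range(n_peptides):
            if j == i:
                continue
            shared = ~np.isnan(X[i]) & ~np.isnan(X[j])
            if shared.any():
                diff = X[i, shared] - X[j, shared]
                dists[j] = np.sqrt(np.mean(diff ** 2))

        order = np.argsort(dists)
        for s in np.where(missing)[0]:
            # Take the k nearest peptides that have a value in sample s.
            donors = [j for j in order
                      if np.isfinite(dists[j]) and not np.isnan(X[j, s])][:k]
            if donors:
                out[i, s] = X[donors, s].mean()
    return out


# Toy matrix: peptide 0 is missing its value in the third sample.
toy = np.array([
    [1.0, 2.0, np.nan],
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [10.0, 20.0, 30.0],
])
imputed = knn_impute_peptides(toy, k=2)
```

In this toy matrix the two nearest peptides to peptide 0 are rows 1 and 2, so the gap is filled from their values in the third sample. Measuring distance between peptides rather than between samples follows the reasoning in the quoted statement: with heavy missingness, sample-to-sample distances are hard to define.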
“…A substantial ongoing research effort is to search and experiment with numerous imputation methods to determine the optimal one, including sample matching methods (Stuart and Satija, 2019), matrix factorization methods (Hastie et al, 2015), deep learning methods (Yoon et al, 2018; Qiu et al, 2020; Du et al, 2022) and more (Wang et al, 2016; Chen et al, 2017). See Harris et al (2023); Välikangas et al (2018); Liu and Dongre (2021) for a comprehensive review.…”
Section: Introduction (mentioning)
confidence: 99%
“…If one ion count is missing, a ratio is still calculated, but rather than treating the missing value as a zero (as commonly employed), it is dropped, and the ratio and P-value are calculated using one fewer replicate for either the test or control condition (Figure C, case ii). This treats the single missing value as missing at random (MAR) and avoids artificially decreasing the calculated mean. The maximum number of allowable missing values for an ion to be further considered can be modified by the user (default is one) and may be recommended in studies with greater than three replicates per condition.…”
Section: Computational Section (mentioning)
confidence: 99%
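
The difference between dropping a missing ion count and treating it as zero can be shown with a small sketch. This is not the citing paper's code. It assumes the ratio is the mean of the test replicates divided by the mean of the control replicates, uses a hypothetical function name `ratio_dropping_missing`, and leaves out the P-value because the statement does not say which test is used.

```python
import numpy as np


def ratio_dropping_missing(test, control, max_missing=1):
    """Test/control ratio with missing replicates dropped, not zero-filled.

    NaN marks a missing ion count. Returns None when the ion exceeds the
    allowed number of missing values or a condition has no observations.
    The mean-over-mean ratio is an assumption made for illustration.
    """
    test = np.asarray(test, dtype=float)
    control = np.asarray(control, dtype=float)

    n_missing = int(np.isnan(test).sum() + np.isnan(control).sum())
    if n_missing > max_missing:
        return None
    if np.isnan(test).all() or np.isnan(control).all():
        return None

    # nanmean ignores NaN, so the mean uses one fewer replicate.
    return float(np.nanmean(test) / np.nanmean(control))


test_counts = [100.0, 110.0, np.nan]   # one replicate missing
control_counts = [50.0, 50.0, 50.0]

dropped = ratio_dropping_missing(test_counts, control_counts)

# Zero-filling the same missing value pulls the test mean down.
zero_filled = float(np.mean(np.nan_to_num(test_counts, nan=0.0))
                    / np.mean(control_counts))
```

Here dropping the missing replicate gives a test mean of 105, while zero-filling gives 70, which is the artificial decrease in the mean that the quoted statement describes. The `max_missing` parameter mirrors the user-adjustable limit mentioned in the statement, with a default of one.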