Quantitative measurements produced by tandem mass spectrometry
proteomics experiments typically contain a large proportion of missing
values. Missing values hinder reproducibility, reduce statistical
power, and make it difficult to compare across samples or experiments.
Although many methods exist for imputing missing values, the methods
most commonly used in practice are among the worst performing. Furthermore,
previous benchmarking studies have focused on relatively simple error
metrics, such as the mean squared error between imputed and held-out
values. Here we evaluate the performance of commonly used imputation
methods using three practical, “downstream-centric”
criteria. These criteria measure the ability to identify differentially
expressed peptides, generate new quantitative peptides, and improve
the peptide lower limit of quantification. Our evaluation comprises
several experiment types and acquisition strategies, including data-dependent
and data-independent acquisition. We find that imputation does not
necessarily improve the ability to identify differentially expressed
peptides but that it can identify new quantitative peptides and improve
the peptide lower limit of quantification. We find that MissForest
is generally the best-performing method according to our downstream-centric
criteria. We also argue that existing imputation methods do not properly
account for the variance of peptide quantifications and highlight
the need for methods that do.