Reproducible research is the bedrock of experimental science. To enable the deployment of large-scale proteomics, we assess the reproducibility of mass spectrometry (MS) over time and across instruments and develop computational methods for improving quantitative accuracy. We perform 1560 data-independent acquisition (DIA)-MS runs of eight samples containing known proportions of ovarian and prostate cancer tissue and yeast, or control HEK293T cells. Replicates are run on six mass spectrometers operating continuously with varying maintenance schedules over four months, interspersed with ~5000 other runs. We utilise negative controls and replicates to remove unwanted variation and enhance biological signal, outperforming existing methods. We also design a method for reducing missing values. Integrating these computational modules into a pipeline (ProNorM), we mitigate variation among instruments over time and accurately predict tissue proportions. We demonstrate how to improve the quantitative analysis of large-scale DIA-MS data, providing a pathway toward clinical proteomics.
This paper considers near-real-time detection of beetle infestation in North American pine forests using MODIS 8-day 500 m data. Two methods are considered, both using a single time series to detect beetle infestation by analyzing the statistics of the trend component of the signal. The first method estimates the trend component of the vegetation index time series by fitting an underlying triply modulated cosine model over a sliding window using nonlinear least squares (NLS), and the second method uses a -point moving average finite impulse response (FIR) filter. Both methods perform well and show similar performance on simulated datasets. The methods are also tested on many difference and ratio indices of a real-world dataset with change and no-change examples taken from the Rocky Mountain region of the United States and from British Columbia in Canada. The results suggest that both methods detect beetle infestation reliably in almost all the vegetation index datasets. However, the model-based (NLS-based) method performs better in terms of detection delay. The Red Green Index (RGI), when used with the model-based method, provides the best tradeoff between detection delay and accuracy. Furthermore, 90%, 50%, and 25% cross-validations are performed for threshold selection on the RGI dataset, and it is shown that the selected threshold works well on the test data. Finally, it is shown that the model-based method outperforms a recently published method for near-real-time disturbance detection in MODIS data in both accuracy and detection delay.
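The moving-average variant of the trend estimate can be illustrated with a minimal sketch. It is not the authors' implementation: the window length (46 composites, roughly one year of 8-day data), the synthetic vegetation index series, the step-like drop, the threshold of 0.5, and the function names `moving_average_trend` and `first_detection` are all assumptions made for illustration. The abstract does not state the filter length or the detection statistic, so a plain threshold on the trend stands in for it here.

```python
import numpy as np

def moving_average_trend(series, window):
    """Estimate the trend with a moving-average FIR filter.

    Output sample k is the mean of series[k : k + window], so it only
    becomes available once sample k + window - 1 has been observed.
    """
    series = np.asarray(series, dtype=float)
    kernel = np.ones(window) / window
    # mode="valid" keeps only outputs computed from a full window.
    return np.convolve(series, kernel, mode="valid")

def first_detection(trend, threshold):
    """Return the index of the first trend sample below threshold, or None."""
    below = np.flatnonzero(trend < threshold)
    return int(below[0]) if below.size else None

# Hypothetical setup: 46 eight-day composites per year, six years of data.
SAMPLES_PER_YEAR = 46
t = np.arange(6 * SAMPLES_PER_YEAR)

# Synthetic index: constant mean 0.6 plus an annual cosine cycle.
series = 0.6 + 0.2 * np.cos(2 * np.pi * t / SAMPLES_PER_YEAR)

# Assumed disturbance: the mean drops by 0.3 at the start of year five.
onset = 4 * SAMPLES_PER_YEAR
series[onset:] -= 0.3

# A one-year window averages out the seasonal cycle, leaving the trend.
trend = moving_average_trend(series, SAMPLES_PER_YEAR)

k = first_detection(trend, threshold=0.5)
# Time index of the newest observation inside the detecting window.
detected_at = None if k is None else k + SAMPLES_PER_YEAR - 1
delay = None if detected_at is None else detected_at - onset
print("onset:", onset, "detected at:", detected_at, "delay (composites):", delay)
```

In this sketch the detection delay comes from the filter itself: the trend only crosses the threshold once enough post-disturbance samples have entered the window. A longer window suppresses seasonality and noise better but responds more slowly, which is the kind of delay versus accuracy tradeoff the abstract refers to.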