2015
DOI: 10.1002/elps.201500352

Missing value imputation strategies for metabolomics data

Abstract: The origin of missing values can be caused by different reasons and depending on these origins missing values should be considered differently and dealt with in different ways. In this research, four methods of imputation have been compared with respect to revealing their effects on the normality and variance of data, on statistical significance and on the approximation of a suitable threshold to accept missing data as truly missing. Additionally, the effects of different strategies for controlling familywise …

Cited by 132 publications (112 citation statements)
References 23 publications

“…There are different methods for dealing with missing values, and each method/approach impacts on downstream statistical analyses [31,50,51]. In the present study, the SIMCA software uses an adjusted nonlinear iterative partial least squares (NIPALS) algorithm (with a correction factor of 3.0) [52] in handling the missing values.…”
Section: Results | Type: mentioning
confidence: 99%
“…The missing data were imputed by k-nearest neighbours algorithm combined with linear regression technique [53]. The proteins with significant fold change greater than 2 or lower than 0.5 were recognised as deregulated.…”
Section: Methods | Type: mentioning
confidence: 99%
“…From a theoretical point of view, this low percentage is rather generous as the instrumentation should be sufficiently reliable to detect a larger percentage of features in each of the repeated injections of QC sample. Further indepth reading-recommendations relevant to both the possible reasons for missing values in metabolomics datasets as well as adequate solutions include articles by Gromski et al (Gromski et al 2014) and Armitage et al (Armitage et al 2015). Further refinement of this approach include analysis of a QC dilution series at the end of the analysis to identify features not responsive to dilution, these should then be excluded (Eliasson et al 2012;Vorkas et al 2015).…”
Section: Evaluation Of Data Quality | Type: mentioning
confidence: 96%
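
One of the citation statements above mentions imputing missing data with a k-nearest neighbours algorithm and flagging proteins whose fold change is greater than 2 or lower than 0.5. As a rough illustration of what such a step might look like, here is a minimal sketch in plain Python. It is not the cited authors' implementation: the function names `knn_impute` and `classify_fold_change`, the distance definition, and the toy data are all assumptions made for this example. It also omits both the linear regression step and the significance testing that the statement refers to.

```python
import math


def knn_impute(rows, k=2):
    """Return a copy of `rows` with None entries filled in (hypothetical helper).

    Each missing value is replaced by the mean of that feature over the k
    nearest rows that have the feature observed. Distance is the root mean
    squared difference over the features observed in both rows (an assumed
    choice; other definitions are possible).
    """
    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A neighbour must be a different row with feature j observed.
                if m == i or other[j] is None:
                    continue
                shared = [
                    (a, b)
                    for a, b in zip(row, other)
                    if a is not None and b is not None
                ]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
            # If no usable neighbour exists, the value stays None.
    return imputed


def classify_fold_change(fold_change, upper=2.0, lower=0.5):
    """Label a fold change using the thresholds quoted in the statement."""
    if fold_change > upper:
        return "up"
    if fold_change < lower:
        return "down"
    return "unchanged"


# Toy data (made up for illustration): rows are samples, columns are features.
rows = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [5.0, 6.0, 7.0],
    [0.9, 1.9, 2.8],
]
result = knn_impute(rows, k=2)
print(result[1])
print(classify_fold_change(2.5), classify_fold_change(0.4))
```

The sketch uses only the standard library so that the neighbour search is visible step by step. Production analyses would more likely rely on an established imputation routine, and would apply the fold-change thresholds together with a significance test.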