2010
DOI: 10.1093/bioinformatics/btq613

Biological impact of missing-value imputation on downstream analyses of gene expression profiles

Abstract: Motivation: Microarray experiments frequently produce multiple missing values (MVs) due to flaws such as dust, scratches, insufficient resolution or hybridization errors on the chips. Unfortunately, many downstream algorithms require a complete data matrix. The motivation of this work is to determine the impact of MV imputation on downstream analysis, and whether ranking of imputation methods by imputation accuracy correlates well with the biological impact of the imputation. Methods: Using eight datasets for d…

Cited by 49 publications (48 citation statements)
References 48 publications (66 reference statements)
“…Previous studies have demonstrated that the performance of an imputation algorithm could be affected by many factors, hence no algorithm can perform well on all kinds of datasets [5], [6]. This phenomenon can also be observed using IMDE.…”
Section: B Comparison Of Different Imputation Algorithms and Differ… (mentioning)
confidence: 97%
“…Second, NRMSE is the only performance index used in IMDE. Other biologically relevant performance indices such as the Conserved Pair Proportion (CPP) [3] and Biomarker List Concordance Index (BLCI) [5] should also be used. We will incorporate them in the future version of IMDE.…”
Section: Comparison With Other Imputation Tools (mentioning)
confidence: 99%
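The statement above names NRMSE as an accuracy index without defining it. As a rough illustration only, the sketch below uses one common form from the imputation literature: the root mean squared error over the masked entries divided by the standard deviation of the true values. Whether IMDE uses exactly this normalization is an assumption, and the function name `nrmse` and the toy values are hypothetical.

```python
import math
from statistics import mean, pvariance


def nrmse(true_values, imputed_values):
    """Normalized RMSE: sqrt(mean squared error / variance of the true values).

    Assumption: normalization by the population variance of the true values;
    other papers normalize by the range or the mean instead.
    """
    if len(true_values) != len(imputed_values) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    variance = pvariance(true_values)
    if variance == 0:
        raise ValueError("true values have zero variance; NRMSE is undefined")
    mse = mean((t - g) ** 2 for t, g in zip(true_values, imputed_values))
    return math.sqrt(mse / variance)


# Toy example: entries that were masked on purpose, then imputed.
truth = [1.0, 2.0, 3.0, 4.0]
mean_guess = [2.5, 2.5, 2.5, 2.5]  # imputing every entry with the overall mean

perfect_score = nrmse(truth, truth)      # exact recovery gives 0
mean_score = nrmse(truth, mean_guess)    # mean imputation gives about 1
```

Under this definition a score near 0 means near-exact recovery, while a score near 1 means the method does no better than filling in the mean.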
“…For example, the values assumed for $\phi_{1k}$ in the simulation were the average values of the estimates obtained from each resulting cluster, $\sum_{x_i \in S_j} \lVert Y_i - \bar{S}_j \rVert^2$, where $\bar{S}_j$ is the mean of the points in $S_j$. The choice of k-means is due to its general use in gene expression clustering (Mar et al., 2011; Oh et al., 2011), but it is important to emphasize that the success of this method depends on the stated number of clusters, which was the same number used in the simulation study (five), i.e. this was considered the optimum condition for the application of this method.…”
Section: Application To Simulated Data (mentioning)
confidence: 99%
“…expression was not detected in seven or more samples, which excluded 100 microRNAs), missing values from remaining probes were imputed using R function imputeKNN in Bioconductor package MmPalateMicroRNA [119] (14.5% of values were imputed) [120]. A median sweep was performed to normalize delta Ct values by subtracting the global median for each array.…”
Section: Laser Capture Microdissection (mentioning)
confidence: 99%
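The median sweep described in the last statement can be sketched as follows. This is a minimal illustration, assuming that "global median for each array" means the median of all delta Ct values on that array and that missing values have already been imputed. The function name `median_sweep` and the toy data are hypothetical and are not taken from the cited work.

```python
from statistics import median


def median_sweep(arrays):
    """Center each array by subtracting its own median delta Ct value.

    `arrays` maps an array name to its list of delta Ct values
    (assumed complete, i.e. already imputed).
    """
    normalized = {}
    for name, values in arrays.items():
        center = median(values)  # global median of this array
        normalized[name] = [v - center for v in values]
    return normalized


# Toy delta Ct values for two arrays.
delta_ct = {
    "array_1": [5.0, 7.0, 9.0],
    "array_2": [2.0, 4.0, 6.0, 8.0],
}
swept = median_sweep(delta_ct)
```

After the sweep every array has a median of zero, which removes array-wide shifts while leaving the differences between probes on the same array unchanged.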