2010
DOI: 10.1093/bib/bbq080

Missing value imputation for gene expression data: computational techniques to recover missing data from available information

Abstract: Microarray gene expression data generally suffers from missing value problem due to a variety of experimental reasons. Since the missing data points can adversely affect downstream analysis, many algorithms have been proposed to impute missing values. In this survey, we provide a comprehensive review of existing missing value imputation algorithms, focusing on their underlying algorithmic techniques and how they utilize local or global information from within the data, or their use of domain knowledge during i…

Cited by 176 publications (106 citation statements)
References: 69 publications
“…Liew et al (2011) and Moorthy et al (2014) reviewed the available methods and algorithms for the imputation of missing values with a focus on gene expression data. Johansson and Hakkinen (2006) proposed WeNNI, which utilizes continuous weights in the nearest neighbors imputation procedure.…”
Section: Introduction (mentioning)
confidence: 99%
“…There are mainly two ways to explore the coherence information, namely the global and the local approaches [8]. The global approaches assume a global covariance structure in all genes [9,10] while the local approaches exploit correlations among certain genes only [11][12][13][14].…”
Section: Introduction (mentioning)
confidence: 99%
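
To make the "local" idea in the statement above concrete, here is a minimal sketch of nearest-neighbour imputation, which fills a gene's missing values from only the few genes most similar to it. It is an illustration under stated assumptions, not the method of any cited paper: the function name `knn_impute`, the choice of root-mean-square distance, the inverse-distance weighting, and the layout (rows are genes, columns are samples, missing entries are `NaN`) are all assumptions made here.

```python
import numpy as np

def knn_impute(X, k=2):
    """Fill NaNs in each row (gene) from its k most similar rows.

    Illustrative sketch: rows are genes, columns are samples.
    """
    X = np.asarray(X, dtype=float)
    out = X.copy()
    rows_with_gaps = np.argwhere(np.isnan(X).any(axis=1)).ravel()
    for i in rows_with_gaps:
        miss = np.isnan(X[i])
        candidates = []
        for j in range(X.shape[0]):
            # A neighbour must be observed wherever gene i is missing.
            if j == i or np.isnan(X[j, miss]).any():
                continue
            # Compare the two genes only on columns observed in both.
            shared = ~miss & ~np.isnan(X[j])
            if not shared.any():
                continue
            d = np.sqrt(np.mean((X[i, shared] - X[j, shared]) ** 2))
            candidates.append((d, j))
        if not candidates:
            continue  # no usable neighbour: leave the gaps as NaN
        candidates.sort()
        nearest = candidates[:k]
        # Closer neighbours get larger weights (inverse distance).
        w = np.array([1.0 / (d + 1e-12) for d, _ in nearest])
        vals = np.array([X[j, miss] for _, j in nearest])
        out[i, miss] = (w[:, None] * vals).sum(axis=0) / w.sum()
    return out

X = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, 3.0, np.nan],
    [10.0, 20.0, 30.0, 40.0],
    [1.1, 2.1, 3.1, 4.1],
])
print(knn_impute(X, k=1)[1, 3])
```

Only the neighbours selected for each gene contribute to its estimate, which is what distinguishes this family from global approaches that fit one covariance structure to all genes at once.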
“…EM-PCA begins with the initialization of the missing data by quoting the average values in the corresponding rows and columns, and then their iterations in such a way to substitute missing data with values predicted (estimated) by the PCA [40]. The algorithm for missing data [41][42][43][44][45][46][47] is repeated until the convergence criteria. The mean values obtained in the convergence criterion replace missing data in the corresponding rows and columns.…”
Section: Methods (mentioning)
confidence: 99%
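
The EM-PCA loop described in the statement above can be sketched as follows. This is one plausible reading, not the cited authors' implementation: the function name `em_pca_impute`, the `rank`, `max_iter` and `tol` parameters, and the initialisation (the average of the observed row mean and column mean) are assumptions made for illustration. The sketch also assumes every row and every column has at least one observed value.

```python
import numpy as np

def em_pca_impute(X, rank=1, max_iter=200, tol=1e-6):
    """Iteratively replace NaNs with a low-rank PCA reconstruction."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()

    # Initialisation (assumed): average of observed row and column means.
    row_mean = np.nanmean(X, axis=1, keepdims=True)
    col_mean = np.nanmean(X, axis=0, keepdims=True)
    init = 0.5 * (row_mean + col_mean)  # broadcasts to the shape of X
    filled[missing] = init[missing]

    for _ in range(max_iter):
        # PCA step: centre the columns and keep the leading components.
        mu = filled.mean(axis=0, keepdims=True)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        recon = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mu

        # Update only the missing cells; observed values stay untouched.
        if missing.any():
            change = np.max(np.abs(recon[missing] - filled[missing]))
        else:
            change = 0.0
        filled[missing] = recon[missing]
        if change < tol:  # convergence criterion (assumed form)
            break
    return filled

X = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 4.0, np.nan],
    [3.0, 6.0, 9.0],
    [4.0, 8.0, 12.0],
])
print(em_pca_impute(X, rank=1))
```

Because every gene contributes to the principal components, this belongs to the global family of approaches. The number of components kept is the main tuning choice and would normally be selected by cross-validation on artificially removed entries.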