2014
DOI: 10.1186/s12859-014-0351-9

Optimising parallel R correlation matrix calculations on gene expression data using MapReduce

Abstract: Background: High-throughput molecular profiling data has been used to improve clinical decision making by stratifying subjects based on their molecular profiles. Unsupervised clustering algorithms can be used for stratification purposes. However, the current speed of the clustering algorithms cannot meet the requirement of large-scale molecular data due to poor performance of the correlation matrix calculation. With high-throughput sequencing technologies promising to produce even larger datasets per subject, we…

citations
Cited by 23 publications
(9 citation statements)
references
References 17 publications
(15 reference statements)
“…R package corrplot was selected to estimate the Pearson correlation to illuminate the co-expression relationship of key genes at the transcriptional level [ 24 ]. The 11.0 version of STRING ( https://www.string-db.org ) was chosen to investigate and generate the PPI network among key genes.…”
Section: Methods (mentioning)
confidence: 99%
“…Genomic information from gene expression data has been widely used and already benefited on improving clinical decision and molecular profiling based patient stratification. Clustering methods, as well as their corresponding HPC-based solutions [40], are adopted to classify the high-dimensional gene expression sequences into some known patterns, which indicates that the number of targeted clustering centroids are determined in advance. As we all know, there are still large numbers of gene expression sequences, among which the patterns are not yet discovered.…”
Section: Auto-clustering On Real Application (mentioning)
confidence: 99%
“…R package corrplot was selected to estimate the Pearson correlation to illuminate the co-expression relationship of key genes at the transcriptional level [18]. The 11.0 version of STRING (https://www.string-db.org) was chose to investigate and generate the PPI network among key genes.…”
Section: Co-expression Analysis and Protein-protein Interaction (PPI) (mentioning)
confidence: 99%