2021
DOI: 10.1101/2021.05.26.445885
Preprint

Optimal dimensionality selection for independent component analysis of transcriptomic data

Abstract: Independent Component Analysis (ICA) is an unsupervised machine learning algorithm that separates a set of mixed signals into a set of statistically independent source signals. Applied to high-quality gene expression datasets, ICA effectively reveals the source signals of the transcriptome as groups of co-regulated genes and their corresponding activities across diverse growth conditions. Two major variables that affect the output of ICA are the diversity and scope of the underlying data, and the user-defined …
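
To illustrate the first sentence of the abstract, here is a minimal sketch of ICA on two mixed signals. It is not the preprint's method and says nothing about dimensionality selection. It assumes exactly two non-Gaussian sources, whitens the mixtures, and then grid-searches the rotation that maximizes squared excess kurtosis. All function names, the mixing coefficients, and the synthetic sources are made up for the example.

```python
import math
import random


def whiten_2d(x1, x2):
    """Center two mixed signals and rescale them to be uncorrelated with unit variance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    a = sum(v * v for v in c1) / n
    d = sum(v * v for v in c2) / n
    b = sum(u * v for u, v in zip(c1, c2)) / n
    # Rotation angle that diagonalizes the 2x2 covariance matrix [[a, b], [b, d]].
    phi = 0.5 * math.atan2(2 * b, a - d)
    c, s = math.cos(phi), math.sin(phi)
    l1 = a * c * c + 2 * b * c * s + d * s * s  # variance along (c, s)
    l2 = a * s * s - 2 * b * c * s + d * c * c  # variance along (-s, c)
    z1 = [(c * u + s * v) / math.sqrt(l1) for u, v in zip(c1, c2)]
    z2 = [(-s * u + c * v) / math.sqrt(l2) for u, v in zip(c1, c2)]
    return z1, z2


def excess_kurtosis(y):
    """Excess kurtosis of a zero-mean, unit-variance signal (0 for a Gaussian)."""
    return sum(v ** 4 for v in y) / len(y) - 3.0


def rotate(z1, z2, theta):
    """Rotate the whitened pair by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    y1 = [c * u + s * v for u, v in zip(z1, z2)]
    y2 = [-s * u + c * v for u, v in zip(z1, z2)]
    return y1, y2


def ica_2d(x1, x2, steps=90):
    """Return two estimated sources, defined only up to order and sign."""
    z1, z2 = whiten_2d(x1, x2)
    best = None
    for k in range(steps):
        # The contrast repeats every pi/2, so a quarter turn covers all cases.
        theta = (math.pi / 2) * k / steps
        y1, y2 = rotate(z1, z2, theta)
        score = excess_kurtosis(y1) ** 2 + excess_kurtosis(y2) ** 2
        if best is None or score > best[0]:
            best = (score, y1, y2)
    return best[1], best[2]


def abs_corr(u, v):
    """Absolute Pearson correlation between two equal-length signals."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return abs(cov / (su * sv))


# Synthetic data (assumption): two independent uniform sources, linearly mixed.
rng = random.Random(0)
s1 = [rng.uniform(-1, 1) for _ in range(2000)]
s2 = [rng.uniform(-1, 1) for _ in range(2000)]
x1 = [1.0 * a + 0.6 * b for a, b in zip(s1, s2)]
x2 = [0.4 * a + 1.0 * b for a, b in zip(s1, s2)]
y1, y2 = ica_2d(x1, x2)
```

In a transcriptomic setting the mixed signals would be expression profiles and the number of components would be far larger than two, which is where the choice of dimensionality discussed in the abstract comes in. This toy version fixes the dimensionality at two.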

Cited by 10 publications (13 citation statements)
References: 24 publications
“…Five additional iModulons were dominated by a single, high-coefficient gene, and are automatically identified by the method find_single_gene_imodulons. These Single Gene (SG) iModulons may arise from over-decomposition of the dataset 30,37 or artificial knock-out or overexpression of single genes. Together, these iModulons contribute to 1% of the variance in the dataset.…”
Section: Results (mentioning)
confidence: 99%
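
The idea of a component "dominated by a single, high-coefficient gene" in the statement above can be sketched as follows. This is a hypothetical illustration, not the implementation of find_single_gene_imodulons: the function name, the input layout, and the threshold `ratio` are all assumptions made for the example.

```python
def find_single_gene_components(weights, ratio=2.0):
    """Return names of components whose largest gene weight dominates the rest.

    weights: dict mapping component name -> dict of gene name -> weight.
    ratio: hypothetical threshold; a component is flagged when its largest
        absolute weight is at least `ratio` times the second largest.
    """
    flagged = []
    for name, genes in weights.items():
        magnitudes = sorted((abs(w) for w in genes.values()), reverse=True)
        if len(magnitudes) < 2:
            continue  # nothing to compare against
        top, runner_up = magnitudes[0], magnitudes[1]
        if top > 0 and top >= ratio * runner_up:
            flagged.append(name)
    return flagged


# Made-up gene weights for two components.
example = {
    "comp_a": {"geneA": 0.21, "geneB": 0.19, "geneC": -0.17},
    "comp_b": {"geneA": 0.02, "geneB": -0.65, "geneC": 0.03},
}
single = find_single_gene_components(example)
```

Here only `comp_b` is flagged, because one weight is far larger in magnitude than every other weight in that component. The right threshold depends on the dataset and is not specified in the quoted text.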
“…To compute the optimal independent components, an extension of ICA was performed on the RNA-seq dataset as described in McConn et al. 37.…”
Section: Methods (mentioning)
confidence: 99%
“…The final dataset was composed of 657 samples, spanning various conditions that describe M. tuberculosis's response to various nutrient sources, stressors, antibiotics, and virulence events. After the final dataset was obtained, a previously developed ICA algorithm was used to decompose the data into 80 robust iModulons [10] (Figure 1b).…”
Section: Independent Component Analysis Of Publicly Available Data Reveals 80 Transcriptional Modules For M Tuberculosis (mentioning)
confidence: 99%
“…'Uncharacterized' iModulons are those which had little overlap with known TFs or knowledge types, but still contained a significant number of genes. Finally, 'Single Gene' iModulons are those that track the expression of a single gene, and are treated as an artifact of the ICA decomposition [10].…”
Section: Independent Component Analysis Of Publicly Available Data Reveals 80 Transcriptional Modules For M Tuberculosis (mentioning)
confidence: 99%