2016
DOI: 10.1128/msystems.00025-15

ADAGE-Based Integration of Publicly Available Pseudomonas aeruginosa Gene Expression Data with Denoising Autoencoders Illuminates Microbe-Host Interactions

Abstract: The quantity and breadth of genome-scale data sets that examine RNA expression in diverse bacterial and eukaryotic species are increasing more rapidly than for curated knowledge. Our ADAGE method integrates such data without requiring gene function, gene pathway, or experiment labeling, making practical its application to any large gene expression compendium. We built a Pseudomonas aeruginosa ADAGE model from a diverse set of publicly available experiments without any prespecified biological knowledge, and thi…

Cited by 126 publications (182 citation statements)
References 63 publications
“…Reconstructing gene expression input data using autoencoder frameworks has been previously shown to reveal novel biological patterns. [7][8][9] VAEs and GANs are generative models, which means they learn to approximate a data generating distribution. Through approximation and compression, the models have been shown to capture an underlying data manifold - a constrained, lower dimensional space where data is distributed - and disentangle sources of variation from different classes of data.…”
Section: Introduction (mentioning)
confidence: 99%
“…This finding supports using mRNA as a summary measurement capable of capturing system-wide responses to molecular events beyond transcription factor alterations. Machine learning has been applied to gene expression in a variety of studies with various goals [37][38][39][40][41]. In a similar study, Guinney et al trained a classifier to model RAS activity in colorectal cancer and demonstrated its clinical utility by predicting response to MEK inhibitors and anti-EGFR based treatments [18].…”
Section: Discussion (mentioning)
confidence: 99%
“…the proposed DGS method consists of four steps: 1) Similar to the methods of (Danaee et al, 2016; Tan et al, 2015, 2016, 2017), a DAE is trained with a large unlabelled gene expression dataset to extract salient features. The output of this step is the learned parameters to be transferred to the next DAE.…”
Section: Methods (mentioning)
confidence: 99%
“…3) After training the second DAE, genes with high weights in this DAE are selected based on a standard deviation filter on their connectivity weights. This selection approach is similar to (Danaee et al, 2016; Tan et al, 2015, 2016, 2017; Way and Greene, 2018). The idea of this gene selection step is to provide the classifier with a few rich genes with the strongest signals based on both the labelled and unlabelled datasets.…”
Section: Methods (mentioning)
confidence: 99%