2021
DOI: 10.1128/msystems.01305-20

GAUGE-Annotated Microbial Transcriptomic Data Facilitate Parallel Mining and High-Throughput Reanalysis To Form Data-Driven Hypotheses

Abstract: The NCBI Gene Expression Omnibus (GEO) provides tools to query and download transcriptomic data. However, less than 4% of microbial experiments include the sample group annotations required to assess differential gene expression for high-throughput reanalysis, and data deposited after 2014 universally lack these annotations. Our algorithm GAUGE (general annotation using text/data group ensembles) automatically annotates GEO microbial data sets, including microarray and RNA sequencing studies, increasing the pe…

Cited by 12 publications (15 citation statements)
References 35 publications (41 reference statements)

“…Compared to prior work using manually curated datasets, which required laborious manual grouping [6,7,17], SOPHIE demonstrates consistent results but using an automated process. In short, SOPHIE identifies the same common patterns but in a fast and scalable way.…”
Section: Discussion; citation type: mentioning
confidence: 66%
“…Finally, when we extended this analysis to a different organism, P. aeruginosa, we observed the same concordance (R² = 0.449) between SOPHIE-generated percentiles compared to those generated using a manually validated dataset, GAPE (Figure 2D) [16]. GAPE contained a collection of 73 array experiments from the GPL84 platform. We found a significant over-representation (p = 1e-139) of SOPHIE identified common DEGs within the GAPE set of common DEGs.…”
Section: Results; citation type: mentioning
confidence: 99%
“…However, none are specifically geared towards CF pathogen research, and there is room to expand on their functionality [Table 2]. Our own lab has previously published tools to make publicly available data more accessible to CF researchers [29,30], but these tools focus on the most commonly studied CF pathogens, namely Pseudomonas aeruginosa and Staphylococcus aureus, and don't include data sets on many of the other clinically relevant species listed in Table 1. Building on our prior work, we present the R Shiny web application CF-Seq.…”
Section: Figure 1 Landscape Of RNA-sequencing Studies Available In Th...; citation type: mentioning
confidence: 99%
“…The P. aeruginosa community has long supported the development and widespread use of databases, hubs and analysis tools, such as The Pseudomonas Genome Database (10), BACTOME (11), the International Pseudomonas Consortium Database (12), the Pseudomonas aeruginosa metabolome database (13), the Pseudomonas aeruginosa transcriptome viewer (14), and the shiny app with algorithmically annotated datasets, GAPE (15). Tools have also been developed that utilize public data from across many experiments in concert, such as the ADAGE web server which enables the exploration of P. aeruginosa microarray data after processing by a machine learning algorithm (16).…”
Section: Introduction; citation type: mentioning
confidence: 99%