2022
DOI: 10.1093/nar/gkac1078

proGenomes3: approaching one million accurately and consistently annotated high-quality prokaryotic genomes

Abstract: The interpretation of genomic, transcriptomic and other microbial ‘omics data is highly dependent on the availability of well-annotated genomes. As the number of publicly available microbial genomes continues to increase exponentially, the need for quality control and consistent annotation is becoming critical. We present proGenomes3, a database of 907 388 high-quality genomes containing 4 billion genes that passed stringent criteria and have been consistently annotated using multiple functional and taxonomic …

Cited by 28 publications (22 citation statements)
References 61 publications
“…Classification of rdhA -containing contigs as belonging to chromosomes, plasmids or viruses was performed using Genomad v.1.5.0 with default parameters 100 . Integrons, integrative conjugative element (ICEs), IS elements and transposons were identified using HMM searches of the proteins against the 68 marker HMM profiles by default, which are available on proMGE (http://promge.embl.de/) 101 .…”
Section: Methods | Citation type: mentioning
confidence: 99%
“…Assemblies, including gene annotations, were downloaded from RefSeq or Genbank. Although we ran GUNC on some assemblies ourselves, for the most part, we relied on previous classifications of assemblies as being non-chimeric, either from the GUNC website (https://grp-bork.embl-community.io/gunc/datasets.html) or from proGenomes3 (Fullam et al 2023).…”
Section: Methods | Citation type: mentioning
confidence: 99%
“…The GTDB database provides the phylogenetic tree in plain Newick format, together with a massive data table including hundreds of columns informing different aspects of the genomes used to build the tree. Moreover, the ProGenomes database (Fullam et al, 2023; Parks et al, 2021) provides habitat information for many of the species included in the GTDB tree.…”
Section: Use Case Examples | Citation type: mentioning
confidence: 99%