2020
DOI: 10.1371/journal.pcbi.1007732

PPanGGOLiN: Depicting microbial diversity via a partitioned pangenome graph

Abstract: The use of comparative genomics for functional, evolutionary, and epidemiological studies requires methods to classify gene families in terms of occurrence in a given species. These methods usually lack multivariate statistical models to infer the partitions and the optimal number of classes and don't account for genome organization. We introduce a graph structure to model pangenomes in which nodes represent gene families and edges represent genomic neighborhood. Our method, named PPanGGOLiN, partitions nodes …

Citations: cited by 122 publications (142 citation statements)
References: 56 publications
“…Since corresponding proteins are expected to be broadly conserved, we determined the partition of the 391 essential genes between conserved and variable genomes in meningococci. To do this, we first used the recently described PPanGGOLiN software 21 As for similar efforts in other bacteria 10,12,13 , we first selected protein-coding genes to be targeted by systematic mutagenesis, excluding 85 genes (4.1%, highlighted in black in the first pie chart) because they encode transposases of repeated insertion sequences, or correspond to short remnants of truncated genes or cassettes (Supplementary Data 1). We then followed a two-step mutagenesis approach explained in the text and in Supplementary Fig.…”
Section: Results (mentioning)
confidence: 99%
“…The genome of strain 8013 and a total of 108 complete genomes from the RefSeq database 64 (last accessed June 8th, 2020) (Supplementary Data 5) was used to compute the N. meningitidis pangenome using the PPanGGOLiN software 21 (version 1.1.85). The original annotations of the genomes have been kept in order to compute gene families.…”
Section: Methods (mentioning)
confidence: 99%
“…Previous approaches which aid in the inference of the pangenome of a collection of bacterial isolates include Roary, OrthoMCL, PanOCT, PIRATE, PanX, PGAP, COGsoft, MultiParanoid, PPanGGoLiN and MetaPGN [4][5][6][7][8][9][10][11][12]. The majority of methods for determining the pangenome tend to make use of one of two similar approaches (see Supplementary Figure 1).…”
Section: Introduction (mentioning)
confidence: 99%
“…A final step of some pipelines is to classify the resulting clusters into core and accessory categories based upon their prevalence within the dataset. This is usually done using predefined thresholds; however, more recently model-based extensions to this approach have been suggested [11].…”
Section: Introduction (mentioning)
confidence: 99%