2015
DOI: 10.1099/mgen.0.000025

K-Pax2: Bayesian identification of cluster-defining amino acid positions in large sequence datasets

Abstract: The recent growth in publicly available sequence data has introduced new opportunities for studying microbial evolution and spread. Because the pace of sequence accumulation tends to exceed the pace of experimental studies of protein function and the roles of individual amino acids, statistical tools to identify meaningful patterns in protein diversity are essential. Large sequence alignments from fast-evolving micro-organisms are particularly challenging to dissect using standard tools from phylogenetics and …

Cited by 14 publications (21 citation statements)
References 34 publications
“…The remaining 8,679 genes were extracted from the pan-genome to create an accessory genome matrix for all 228 genomes. The genomes were then clustered based on their accessory gene content using the Bayesian clustering analysis tool K-Pax2 [ 22 ], resulting in 17 distinct clusters of isolates ( Fig 2 ). Each of these clusters show an association with the type of CTX-M gene carried in agreement with recent data analyzing solely plasmid sequences [ 23 ].…”
Section: Results (mentioning)
confidence: 99%
“…A pan-genome of the ST131 data set was constructed using LS-BSR [ 21 ], and a matrix of accessory gene presence/absence for each genome constructed using the filter_BSR_variome.py tool. The resulting accessory genome matrix was used to identify clusters of isolates based on their accessory gene content via Bayesian clustering using Kpax2 [ 22 ]. Five independent runs from different starting configurations under the default prior settings and upper bound values for the number of clusters in the interval 30–50 were performed.…”
Section: Methods (mentioning)
confidence: 99%
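
The accessory gene presence/absence matrix described in the statement above can be illustrated with a minimal sketch. It is not the LS-BSR or filter_BSR_variome.py implementation, and it does not call Kpax2. The function name accessory_matrix, the toy genome names, and the rule "a gene is accessory if it is absent from at least one genome" are assumptions made for illustration only.

```python
from typing import Dict, List, Set, Tuple


def accessory_matrix(
    gene_content: Dict[str, Set[str]],
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Build a binary presence/absence matrix of accessory genes.

    Hypothetical helper: rows are genomes, columns are genes that are
    missing from at least one genome (assumed definition of "accessory").
    """
    if not gene_content:
        return [], [], []

    genomes = sorted(gene_content)
    # Pan-genome: every gene seen in any genome.
    pan = set().union(*gene_content.values())
    # Core genome: genes shared by all genomes.
    core = set.intersection(*gene_content.values())
    accessory = sorted(pan - core)

    # 1 = gene present in the genome, 0 = gene absent.
    matrix = [
        [1 if gene in gene_content[genome] else 0 for gene in accessory]
        for genome in genomes
    ]
    return genomes, accessory, matrix


# Toy input (made-up gene sets, not data from the cited studies).
toy = {
    "genome_A": {"gyrA", "blaCTX-M-15", "traT"},
    "genome_B": {"gyrA", "blaCTX-M-14"},
    "genome_C": {"gyrA", "traT"},
}
genomes, genes, matrix = accessory_matrix(toy)
for name, row in zip(genomes, matrix):
    print(name, row)
```

In this toy case gyrA is present in every genome and is therefore dropped as core, leaving three accessory columns. A binary matrix of this shape (one row per isolate, one 0/1 column per gene) is the kind of input the citing papers describe passing to the clustering step.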
“…KPAX2 is a new Bayesian method for identifying evolutionary signals in amino acid sequences that relate to differential evolution of lineages that may be either monophyletic or polyphyletic, for example, resulting from the horizontal distribution of relevant genomic elements through recombination ( Pessia et al. 2015 ).…”
Section: Methods (mentioning)
confidence: 99%
“…KPAX2 software was used to cluster the strains on the basis of their CRISPR spacer profiles [ 22 ]. Input to the software was a binary matrix with columns representing an absence/presence variable for each of the 2969 spacers in each detected CRISPR cassette.…”
Section: Methods (mentioning)
confidence: 99%