2014
DOI: 10.1371/journal.pcbi.1003788

Defining the Estimated Core Genome of Bacterial Populations Using a Bayesian Decision Model

Abstract: The bacterial core genome is of intense interest and the volume of whole genome sequence data in the public domain available to investigate it has increased dramatically. The aim of our study was to develop a model to estimate the bacterial core genome from next-generation whole genome sequencing data and use this model to identify novel genes associated with important biological functions. Five bacterial datasets were analysed, comprising 2096 genomes in total. We developed a Bayesian decision model to estima…

citations
Cited by 57 publications
(56 citation statements)
references
References 39 publications
(41 reference statements)
“…This has provided numerous novel opportunities for studying Campylobacter (Farhat et al 2014; Méric et al 2014; Van Tonder et al 2014). However, as correctly determined sequence data are absolute, the insights obtained from MLST-based analyses remain relevant and MLST data are "forward compatible" with NGS data.…”
Section: The Impact Of Sequence-based Molecular Typing
Citation type: mentioning
confidence: 99%
“…PPanGGOLiN uses a new statistical model to classify gene families into persistent, cloud, and one or several shell partitions. Unlike the few statistical methods available [24,25,26] that partitions gene families using only their frequency, our method combines the information of occurrence of gene families and the pangenome graph to make the classification. In the following sections we present an overview of the method, an illustration of a pangenome graph and then the partitioning of a large set of prokaryotic species from GenBank.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…This purely bioinformatic approach may be only the first step in elucidating genes of interest, as subsequent experimentation aimed at characterizing the role of the identified target in pathogenesis, the extent to which they are expressed during infection, and their essentiality can provide additional information necessary for identifying high value targets. The availability of numerous genome sequences for a given bacterial species permits the determination of the species’ core genome [24,25], those genes that are present in all, or the majority, of sequenced strains. Based on the idea that the presence of these genes in the majority of strains indicates that they are likely to participate in key bacterial processes, the elucidation of core genomes can provide an initial step for antibiotic target identification.…”
Section: Comparative Genomics
Citation type: mentioning
confidence: 99%
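
The frequency-based definition quoted in the last statement (core genes are those present in all, or the majority, of sequenced strains) can be illustrated with a minimal sketch. This is a plain threshold rule, not the Bayesian decision model of the cited paper; the function name `core_genes`, the `min_fraction` parameter, and the toy strain and gene labels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Set


def core_genes(presence: Dict[str, Set[str]], min_fraction: float = 1.0) -> Set[str]:
    """Return genes found in at least `min_fraction` of the genomes.

    `presence` maps a strain name to the set of genes detected in it.
    This is a simple frequency cutoff (an assumption for illustration),
    not a statistical estimate of the core genome.
    """
    if not 0.0 < min_fraction <= 1.0:
        raise ValueError("min_fraction must be in (0, 1]")
    if not presence:
        return set()

    n_genomes = len(presence)

    # Count in how many genomes each gene occurs.
    counts: Dict[str, int] = {}
    for genes in presence.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1

    # Keep genes whose occurrence fraction meets the cutoff.
    return {g for g, c in counts.items() if c / n_genomes >= min_fraction}


# Toy input: three hypothetical strains with hypothetical gene labels.
toy = {
    "strain1": {"geneA", "geneB", "geneC"},
    "strain2": {"geneA", "geneB"},
    "strain3": {"geneA", "geneD"},
}

strict_core = core_genes(toy)                    # present in every strain
relaxed_core = core_genes(toy, min_fraction=0.6)  # present in a majority
```

With the strict default, only genes found in every strain are kept; lowering `min_fraction` admits genes missing from a few genomes, which is the "or the majority" case in the quoted definition.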