2018
DOI: 10.1371/journal.pgen.1007758

A fast and agnostic method for bacterial genome-wide association studies: Bridging the gap between k-mers and genetic events

Abstract: Genome-wide association study (GWAS) methods applied to bacterial genomes have shown promising results for genetic marker discovery or detailed assessment of marker effect. Recently, alignment-free methods based on k-mer composition have proven their ability to explore the accessory genome. However, they lead to redundant descriptions and results which are sometimes hard to interpret. Here we introduce DBGWAS, an extended k-mer-based GWAS method producing interpretable genetic variants associated with distinct…

Cited by 165 publications (222 citation statements)
References 73 publications (102 reference statements)

“…We also found difficulty in reconciling gene family groupings produced by automated gene clustering, which often split known resistance determinants into multiple families, weakening their association with phenotype and often strengthening their association with specific lineages. The use of unitigs (Jaillard et al 2018) and read-mapping approaches (Hunt et al 2017) can address these shortcomings, but interpretation of results is less clear-cut.…”
Section: Discussion (mentioning)
confidence: 99%
“…We thus converted the transformation scores into a binary phenotype (transformable T, score 1-3 or nontransformable NT, score 0-0.5) (Fig. S1A) and conducted a genome-wide association studies (GWAS) using DBGWAS (42). DBGWAS provides a graphical output that can visually distinguish genetic determinants associating with phenotypes that correspond to single-nucleotide polymorphism (SNP) or to horizontal gene transfer (HGT) events.…”
Section: GWAS Associates the Sparsely Distributed Conjugative Plasmid (mentioning)
confidence: 99%
“…We also follow the method used in DBGWAS (39), which after counting fixed-length k-mers constructs a de Bruijn graph of the population. Nodes in this graph are extensions of k-mers with the same population frequency vector, and whose sequence is referred to as unitigs.…”
Section: Efficiently Modelling the Entire Pan-genome (mentioning)
confidence: 99%
“…We follow the same method as step 1 of the DBGWAS package, which uses the GATB library to construct a compressed de Bruijn graph (40), and then report frequency vectors of each unitig/node and unique pattern in a format readable by pyseer. We use a k-mer length of 31 throughout to count unitigs, as this was previously shown to maximise association power (39). This length can also be set by the user.…”
Section: Efficiently Modelling the Entire Pan-genome (mentioning)
confidence: 99%
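
The unitig idea described in the last two statements can be illustrated with a toy example. The following is a minimal sketch, not the DBGWAS or GATB implementation: it assumes a tiny k (3 rather than 31), ignores reverse complements, and uses hypothetical helper names (`kmers`, `compact_unitigs`, `unitig_patterns`) and made-up sequences. It collects the k-mers of each genome, merges k-mers along non-branching paths of the de Bruijn graph into unitigs, and reports a presence/absence vector per unitig.

```python
BASES = "ACGT"


def kmers(seq, k):
    """Return every length-k substring of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def compact_unitigs(kmer_set):
    """Merge k-mers along non-branching paths of the de Bruijn graph."""

    def successors(km):
        return [km[1:] + b for b in BASES if km[1:] + b in kmer_set]

    def predecessors(km):
        return [b + km[:-1] for b in BASES if b + km[:-1] in kmer_set]

    def linked(u, v):
        # u and v can be merged only if v is u's sole successor
        # and u is v's sole predecessor.
        return successors(u) == [v] and predecessors(v) == [u]

    seen = set()

    def walk(start):
        # Extend to the right until the path branches or revisits a k-mer.
        path, cur = [start], start
        seen.add(start)
        while True:
            nxt = successors(cur)
            if len(nxt) != 1 or nxt[0] in seen or not linked(cur, nxt[0]):
                break
            cur = nxt[0]
            seen.add(cur)
            path.append(cur)
        # Spell the unitig: first k-mer plus the last base of each later k-mer.
        return path[0] + "".join(km[-1] for km in path[1:])

    unitigs = []
    for km in sorted(kmer_set):
        preds = predecessors(km)
        starts_path = not (len(preds) == 1 and linked(preds[0], km))
        if starts_path and km not in seen:
            unitigs.append(walk(km))
    for km in sorted(kmer_set):  # leftovers lie on isolated cycles
        if km not in seen:
            unitigs.append(walk(km))
    return unitigs


def unitig_patterns(genomes, k):
    """Map each unitig to a presence/absence vector over sorted genome names."""
    names = sorted(genomes)
    per_genome = {n: set(kmers(genomes[n], k)) for n in names}
    all_kmers = set().union(*per_genome.values())
    patterns = {}
    for u in compact_unitigs(all_kmers):
        # A genome carries a unitig if it contains all of the unitig's k-mers.
        patterns[u] = tuple(int(set(kmers(u, k)) <= per_genome[n]) for n in names)
    return names, patterns


# Two made-up genomes that differ by a single base (T vs. A at position 4).
genomes = {"A": "ACGTTGCA", "B": "ACGATGCA"}
names, patterns = unitig_patterns(genomes, k=3)
for unitig, vec in patterns.items():
    print(unitig, dict(zip(names, vec)))
```

In this toy input the single-base difference yields two unitigs with complementary presence vectors, flanked by unitigs shared by both genomes. Real tools additionally treat a k-mer and its reverse complement as one node and use much larger k, so this sketch only conveys the shape of the computation.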