2019
DOI: 10.1101/551234
Preprint

Alpaca: a kmer-based approach for investigating mosaic structures in microbial genomes

Abstract: Microbial genomes are often mosaic: different regions can possess different evolutionary origins due to genetic recombination. The recent feasibility of assembling microbial genomes completely, together with the availability of sequencing data for complete microbial populations, means that researchers can now investigate the potentially rich evolutionary history of a microbe at a much higher resolution. Here, we present Alpaca: a method to investigate mosaicism in microbial genomes based on kmer similarity of large sequenc…

Citations: cited by 6 publications (15 citation statements)
References: 11 publications
“…Therefore, we developed Alpaca: a simple and computationally inexpensive method to investigate complex non-linear ancestry via comparison of sequencing datasets (61). Alpaca is based on short-read alignment of a collection of strains to a partitioned reference genome, in which the similarity of each partition to the collection of strains is independently computed using k-mer sets (61). Reducing the alignments in each partition to k-mer sets prior to similarity analysis is computationally inexpensive.…”
Section: Results (mentioning)
confidence: 99%
“…Reducing the alignments in each partition to k-mer sets prior to similarity analysis is computationally inexpensive. Phylogenetic relationships are also not recalculated, but simply inferred from previously available information on the population structure of the collection of strains (61). The partitioning of the reference genome enables the identification of strains with high similarity to different regions of the genome, enabling the identification of ancestry resulting from non-linear evolution.…”
Section: Results (mentioning)
confidence: 99%
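
The citation statements above describe scoring each partition of a reference genome against a collection of strains by comparing k-mer sets. A minimal sketch of that idea follows. It is not Alpaca's implementation: the statements do not name the similarity measure, so the Jaccard index is used here as an assumption, and every function name, the fixed window size, and the `reads_by_partition` layout (reads already aligned and binned per partition) are hypothetical. Read alignment itself and the inference of phylogenetic relationships from known population structure are not covered.

```python
from typing import Dict, List, Set


def kmer_set(seq: str, k: int) -> Set[str]:
    """Return the set of all k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def partition_reference(reference: str, size: int) -> List[str]:
    """Split the reference into consecutive, non-overlapping windows."""
    return [reference[i:i + size] for i in range(0, len(reference), size)]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index of two sets (assumed measure); 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def partition_similarity(
    reference: str,
    reads_by_partition: Dict[str, Dict[int, List[str]]],
    size: int,
    k: int,
) -> List[Dict[str, float]]:
    """Score every strain against every reference partition independently.

    reads_by_partition[strain][i] holds the reads of that strain that were
    aligned to partition i (hypothetical input layout).
    """
    scores = []
    for i, window in enumerate(partition_reference(reference, size)):
        ref_kmers = kmer_set(window, k)
        row = {}
        for strain, per_partition in reads_by_partition.items():
            # Reduce all reads in this partition to a single k-mer set.
            strain_kmers: Set[str] = set()
            for read in per_partition.get(i, []):
                strain_kmers |= kmer_set(read, k)
            row[strain] = jaccard(ref_kmers, strain_kmers)
        scores.append(row)
    return scores


# Toy data: strain A matches the first window, strain B the second.
reference = "ACGTACGTTTGACCAG"
reads = {
    "A": {0: ["ACGTACGT"], 1: ["TTGTCCAG"]},
    "B": {0: ["ACTTACGA"], 1: ["TTGACCAG"]},
}
scores = partition_similarity(reference, reads, size=8, k=3)
# Most similar strain per partition; a change along the genome hints at mosaicism.
best = [max(row, key=row.get) for row in scores]
```

Because each partition is scored on its own, different windows of the same reference can point to different strains, which is the signal the statements associate with non-linear ancestry.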