2021
DOI: 10.1186/s12864-021-07471-y

Exploring short k-mer profiles in cells and mobile elements from Archaea highlights the major influence of both the ecological niche and evolutionary history

Abstract: Background: K-mer-based methods have greatly advanced in recent years, largely driven by the realization of their biological significance and by the advent of next-generation sequencing. Their speed and their independence from the annotation process are major advantages. Their utility in the study of the mobilome has recently emerged and they seem a priori adapted to the patchy gene distribution and the lack of universal marker genes of viruses and plasmids. To provide a framewor…

citations
Cited by 9 publications
(6 citation statements)
references
References 78 publications
“…Efficient counting of k-mers is the basis for k-mer based statistical tools. In recent years, a variety of applications with many methods to count k-mers were developed (Crusoe et al., 2015; Bize et al., 2021; Cattaneo et al., 2022). Given the fact that the most useful k for alignment-free phylogenetics is relatively small, typically <10, we implemented a neat yet ultra-efficient algorithm to count k-mer frequency (see Supplementary Figure 2 for an illustrated example), which mathematically transforms a k-mer to its index based on the powers of 4.…”
Section: Methods (mentioning)
confidence: 99%
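
The "index based on the powers of 4" idea in the statement above can be illustrated with a minimal sketch. This is not the citing authors' implementation: the base-to-digit mapping (A=0, C=1, G=2, T=3) and the function names `kmer_index` and `count_kmers` are assumptions made for illustration only.

```python
# Assumed encoding of bases as base-4 digits (hypothetical, not from the source).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_index(kmer):
    """Read a k-mer as a base-4 number: sum of digit * 4**position."""
    index = 0
    for base in kmer:
        # Horner's rule: shift the running value by one base-4 digit, add the new digit.
        index = index * 4 + BASE_CODE[base]
    return index


def count_kmers(seq, k):
    """Count k-mers in a flat list of length 4**k, addressed by kmer_index."""
    counts = [0] * (4 ** k)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip windows containing ambiguous bases such as N.
        if all(base in BASE_CODE for base in kmer):
            counts[kmer_index(kmer)] += 1
    return counts
```

Because every k-mer maps to a unique integer in the range 0 to 4^k - 1, the counts fit in a plain array with no hashing, which is practical for the small k (below 10) mentioned in the statement.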
“…k-mer frequency is another important variable that was shown to strongly correlate with codon usage frequencies [36]. To test how k-mer frequencies could influence the codon and AA usage distances of our test samples we estimated frequencies of different short k-mers starting from k = 2 to k = 10 (except k = 3) (details in material and methods) and correlated the distances in k-mer usage frequencies with our previously estimated codon/AA usage distances among the test samples.…”
Section: Results (mentioning)
confidence: 99%
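
The comparison described in the statement above (k-mer frequency distances correlated with another set of pairwise distances) might look roughly like the following sketch. The statement does not say which distance or correlation measure was used, so the Euclidean distance and Pearson correlation here are assumptions, as are all function names.

```python
from itertools import combinations, product
from math import sqrt


def kmer_freqs(seq, k):
    """Relative frequency of every possible k-mer, in fixed lexicographic order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # ignores windows with ambiguous bases
            counts[window] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


def euclidean(u, v):
    """Euclidean distance between two frequency vectors (assumed metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_kmer_distances(seqs, k):
    """Distances for every pair of samples, in itertools.combinations order."""
    profiles = [kmer_freqs(s, k) for s in seqs]
    return [euclidean(p, q) for p, q in combinations(profiles, 2)]


def pearson(x, y):
    """Pearson correlation between two equally long lists of distances."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

In use, `pairwise_kmer_distances(samples, k)` would be computed for each k of interest and passed to `pearson` together with the codon or amino-acid usage distances for the same sample pairs, listed in the same order.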
“…GC content is another crucial variable that can shape the codon usage pattern of microorganisms just like gene expression [7, 34, 35, 46]. Recent studies suggested that despite their compositional diversity environmental samples show distinctive distributions in their oligonucleotide signature [36, 47]. Consequently, k-mer-based comparison of sequence signatures gained wide popularity to evaluate relationships between different metagenomic samples [47].…”
Section: Discussion (mentioning)
confidence: 99%
“…Gaussian mixture models using two distributions were fitted [129] to the k-mer content of all windows, to classify these as belonging to either the core genome or transferred elements. This deviation in k-mer spectra has been explored in the context of the archaeal mobilome and contains information on the ecological niche and evolutionary history of DNA sequences [130].…”
Section: Methods (mentioning)
confidence: 99%
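
As a rough illustration of the two-component classification described in the statement above, the sketch below fits a two-Gaussian mixture by expectation-maximization. It is a deliberate simplification and not the cited method: it assumes each window has already been reduced to a single number (for example, a deviation score of its k-mer profile from the genome-wide profile), whereas the statement refers to the full k-mer content. The function names and the initialization at the minimum and maximum values are likewise assumptions.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return exp(-((x - mean) ** 2) / (2 * var)) / sqrt(2 * pi * var)


def classify_windows(values, iterations=100):
    """Fit a two-component 1D Gaussian mixture with EM and label each value.

    Component 0 starts at the smallest value and component 1 at the largest,
    so label 0 tends to mark the low-score group and label 1 the high-score group.
    """
    n = len(values)
    mean_all = sum(values) / n
    var_all = sum((v - mean_all) ** 2 for v in values) / n or 1e-6
    means = [min(values), max(values)]
    variances = [var_all, var_all]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(iterations):
        # E-step: posterior probability of each component for each value.
        resp = []
        for v in values:
            p = [weights[j] * normal_pdf(v, means[j], variances[j]) for j in range(2)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(2):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, values)) / nj
            var_j = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, values)) / nj
            variances[j] = max(var_j, 1e-6)  # floor avoids a collapsing variance
    return [0 if r[0] >= r[1] else 1 for r in resp]
```

Under these assumptions, windows labelled with the larger component would be treated as candidate transferred elements and the rest as core genome; a multivariate mixture over full k-mer vectors would be the closer analogue of what the statement describes.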