2014
DOI: 10.1016/j.compbiolchem.2014.08.021

On K-peptide length in composition vector phylogeny of prokaryotes

Abstract: Using an enlarged alphabet of K-tuples is the way to carry out alignment-free comparison of genomes in the composition vector (CV) approach to prokaryotic phylogeny. We summarize the known aspects concerning the choice of K and examine the results of using CVs with subtraction of a statistical background for K=3-9 and using raw CVs without subtraction for K=1-12. The criterion for evaluation consists in direct comparison with taxonomy. For prokaryotes the best performances are obtained for K=5 and 6 with subtr…

citations
Cited by 17 publications
(23 citation statements)
references
References 23 publications
“…A list of currently accepted allergenic proteins from the well-studied A. fumigatus can be extracted from the Allergome database (www.allergome.org). The sequences of the allergenic proteins from this list were compared with the annotated A. novofumigatus protein list using BLAST+, with parameters set to report full-length (19,20). The time scale has been scaled to the root, thereby making the branch lengths relative to the distance between the root and the species.…”
Section: Secondary Metabolite Profile Of a Novofumigatus Compared With (mentioning)
confidence: 99%
“…Fragmentation based on k-tuple or k-word searches for a word of length k. This technique is previously used in one of the oldest and fastest pairwise alignment methods: FASTA (Lipman and Pearson, 1985). Due to its simplicity and speed, the k-tuple could be enough in molecular phylogeny and taxonomy without the need for alignment in the future (Zuo et al, 2014). As the word length (the k value) increases, the accuracy of the match between the two words also increases.…”
Section: Fragmentation: Table Of K-tuples (mentioning)
confidence: 98%
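The k-tuple fragmentation described in the statement above can be sketched in a few lines of Python. This is a minimal illustration, not code from the cited papers or from FASTA; the function names `ktuple_counts` and `shared_ktuples` are hypothetical, and the sequence is assumed to be a plain string of residue letters.

```python
from collections import Counter


def ktuple_counts(sequence: str, k: int) -> Counter:
    """Count every overlapping k-tuple (word of length k) in a sequence."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    # Slide a window of width k along the sequence, one position at a time.
    # A sequence shorter than k yields an empty Counter.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def shared_ktuples(seq_a: str, seq_b: str, k: int) -> set:
    """Return the k-tuples that occur in both sequences (hypothetical helper)."""
    return set(ktuple_counts(seq_a, k)) & set(ktuple_counts(seq_b, k))
```

A sequence of length L yields L - k + 1 overlapping words, so larger k produces fewer, more specific words; that is the trade-off behind the choice of K discussed in the abstract.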
“…The subtraction procedure is crucial for success of the method, but we skip the mathematical details, as these can be found in previous publications, for example, in [15,16] and [20]. We indicate that the key formula of the subtraction procedure may be derived in two independent ways, either by using the relation between joint probability and conditional probability [15,16] or by applying the maximal entropy principle [27].…”
Section: The Cvtree Approach (mentioning)
confidence: 99%
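The statement above skips the mathematical details of the subtraction procedure. As a rough sketch, assuming the form usually given in the CVTree literature, the background frequency of a K-tuple is estimated from its (K-1)- and (K-2)-tuple frequencies as p0(a1…aK) = p(a1…aK-1) · p(a2…aK) / p(a2…aK-1), and each CV component is the relative difference (p - p0) / p0. The helper names `frequencies` and `subtracted_cv` are hypothetical, and for brevity only K-tuples actually observed in the sequence are scored.

```python
from collections import Counter


def frequencies(sequence: str, k: int) -> dict:
    """Relative frequency of each overlapping k-tuple in the sequence."""
    total = len(sequence) - k + 1
    if total <= 0:
        return {}
    counts = Counter(sequence[i:i + k] for i in range(total))
    return {word: n / total for word, n in counts.items()}


def subtracted_cv(sequence: str, k: int) -> dict:
    """Composition vector with background subtraction (assumed Markov form)."""
    if k < 3:
        raise ValueError("the subtraction needs k >= 3")
    p_k = frequencies(sequence, k)
    p_k1 = frequencies(sequence, k - 1)
    p_k2 = frequencies(sequence, k - 2)
    cv = {}
    for word, p in p_k.items():
        # Predict the K-tuple frequency from its prefix, suffix, and middle part.
        p0 = p_k1[word[:-1]] * p_k1[word[1:]] / p_k2[word[1:-1]]
        # Keep only the relative deviation from the predicted background.
        cv[word] = (p - p0) / p0
    return cv
```

Every observed K-tuple has an observed prefix, suffix, and middle part, so the lookups and the division are safe in this simplified version. A full implementation would also have to handle K-tuples that are predicted by the background but never observed.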
“…The abundance of genomic data enables the transition from comparing methodological suggestions to devising practical tools for bench microbiologists. In this chapter, we review our decade-long effort [15][16][17][18][19][20] to develop a whole-genome-based and alignment-free Composition Vector Tree (CVTree) approach and demonstrate the companion CVTree Web Server.…”
Section: Introduction (mentioning)
confidence: 99%