2020
DOI: 10.1093/molbev/msaa224

CoreCruncher: Fast and Robust Construction of Core Genomes in Large Prokaryotic Data Sets

Abstract: The core genome represents the set of genes shared by all, or nearly all, strains of a given population or species of prokaryotes. Inferring the core genome is integral to many genomic analyses; however, most methods rely on comparing all pairs of genomes, a step that is becoming increasingly difficult given the massive accumulation of genomic data. Here, we present CoreCruncher, a program that robustly and rapidly constructs core genomes across hundreds or thousands of genomes. CoreCruncher does n…
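
The definition in the abstract (genes shared by all, or nearly all, strains) can be illustrated with a small sketch. This is not CoreCruncher's algorithm: it assumes gene families have already been assigned to each genome, which is the expensive step the abstract refers to, and the function name, the `min_fraction` parameter, and the toy strain and gene names are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def core_genes(genomes: Dict[str, Set[str]], min_fraction: float = 1.0) -> List[str]:
    """Return gene families present in at least `min_fraction` of the genomes.

    `genomes` maps a genome name to the set of gene families it contains
    (hypothetical input; family assignment is assumed to be done already).
    min_fraction=1.0 gives the strict core; lower values give a relaxed core.
    """
    if not 0.0 < min_fraction <= 1.0:
        raise ValueError("min_fraction must be in (0, 1]")
    if not genomes:
        return []
    # Each genome contributes at most one count per family because values are sets.
    counts = Counter(family for families in genomes.values() for family in families)
    required = min_fraction * len(genomes)
    return sorted(family for family, n in counts.items() if n >= required)


# Toy presence/absence data, invented for illustration only.
toy = {
    "strain_A": {"dnaA", "gyrB", "recA", "blaTEM"},
    "strain_B": {"dnaA", "gyrB", "recA"},
    "strain_C": {"dnaA", "gyrB", "blaTEM"},
    "strain_D": {"dnaA", "gyrB", "recA"},
}

strict_core = core_genes(toy)           # families found in every strain
relaxed_core = core_genes(toy, 0.75)    # families found in at least 3 of 4 strains
print(strict_core, relaxed_core)
```

Lowering `min_fraction` is one simple way to express "nearly all": a gene missing from a single incomplete or misannotated genome is then not excluded from the core.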

Cited by 22 publications
(18 citation statements)
References 29 publications
“…We used the resulting median dN/dS of our representative genomes for further analysis. In order to examine the effect of genes’ selection on final dN/dS estimations, we randomly selected 40 genera and identified their core genes using CoreCruncher [84] using usearch [85] and the default parameters except for -score 80. We estimated the pairwise dN/dS for each core gene using the approach described previously and estimated the median dN/dS for our genus-representative genomes.…”
Section: Methods (mentioning)
confidence: 99%
“…Core and accessory genomic fragments were identified from the Prokka annotated genomic sequences (.ffn files) using Spine v0.3.1 (http://vfsmspineagent.fsm.northwestern.edu/index_age.html) [38]. A range of other defined parameters (70–100 % similarity and present in 50–100 % of genomes) were evaluated, with 90 % [38–40], and the default value of 100 % core [41–43] definitions, using the default value of 85 % identity. Core values of 100 and 85% identity were subsequently used as the pangenomics parameters.…”
Section: Methods (mentioning)
confidence: 99%
“…https://plants.ensembl.org/Triticum_aestivum/Info/Index (accessed on 1 May 2022). The core genes were then identified using the core cruncher with the default parameters [29]. The detailed parameters are python corecruncher_master.py -in input_folder -out output_folder -length 80% -score 90%.…”
Section: Identification Of the Core Genome Of Common Wheat (mentioning)
confidence: 99%