2018
DOI: 10.1111/tpj.13994

Pinaceae show elevated rates of gene turnover that are robust to incomplete gene annotation

Abstract: Gene duplications and gene losses are major determinants of genome evolution and phenotypic diversity. The frequency of gene turnover (gene gains and gene losses combined) is known to vary between organisms. Comparative genomic analyses of gene families can highlight such variation; however, estimates of gene turnover may be biased when using highly fragmented genome assemblies resulting in poor gene annotations. Here, we address potential biases introduced by gene annotation errors in estimates of gene turnov…

Citations: cited by 13 publications (12 citation statements)
References: 101 publications
“…After removing 21 713 putative misannotations from OGF, corresponding to approximately 11% of the total genes, the test dataset F50 yielded a slightly lower λ score of +0.0049. OGF and filtered F50 estimates were similar, supporting the notion that the bias of draft genome assemblies on the estimate of λ is marginal, as reported by (Casola & Koralewski, 2018). The number of significantly expanding/contracting gene families was reduced by approximately 30% in the F50 dataset comparing to the OGF dataset, thus indicating the contribution of a number of putative misannotations in the analysis of gene families turnover on individual gene families.…”
Section: Methods
supporting
confidence: 86%
“…Hundreds to thousands of gene CNVs have been previously discovered in loblolly pine and several spruce species (Neves et al ., ; Prunier et al ., ), an exciting preview of the wealth of gene variants in natural conifer populations that the analysis of WGS data will be able to uncover. These findings are also in agreement with the result of a gene family evolution study showing high levels of gene duplication and gene loss in pine trees (Casola & Koralewski, ). While genotyping arrays, no matter how large, can only inform on the frequency of known variants, WGS data will also allow researchers to capture novel rare variants, which we now appreciate to be commonly represented among SNPs associated with phenotypic variation in loblolly pine.…”
supporting
confidence: 92%
“…To maximize the probability of achieving convergence in the maximum likelihood analysis performed in CAFE, OGs were processed to remove OGs present in only a few species and were subsequently divided into OGs having <100 gene copies in any species (‘small’ OGs) and orthogroups having one or more species with ≥100 gene copies (‘large’ OGs); see ‘Known Limitations’ section in CAFE 4.0 Manual of March 14, 2017 and section 2.2.4 of the CAFE 4.0 tutorial online at https://iu.app.box.com/v/cafetutorial-pdf , and also Casola and Koralewski, 2018 . We retained 6,496 OGs that occurred in no less than 10 out of 18 species consisting of 6,467 ‘small’ OGs and 29 ‘large’ OGs.…”
Section: Methods
mentioning
confidence: 99%
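
The orthogroup filtering described in the last citation statement (keep OGs present in at least 10 of 18 species, then split them into 'small' and 'large' at 100 gene copies) can be sketched roughly as follows. This is a minimal illustration and not the citing authors' actual pipeline: the function name `partition_orthogroups`, its parameter names, and the input layout (orthogroup ID mapped to per-species gene counts) are all assumptions made for this example, and "present" is taken to mean at least one gene copy.

```python
from typing import Dict, Tuple

# Hypothetical input layout: orthogroup ID -> {species name: gene copy count}
Orthogroups = Dict[str, Dict[str, int]]


def partition_orthogroups(
    ogs: Orthogroups,
    min_species: int = 10,       # quoted text: "no less than 10 out of 18 species"
    large_threshold: int = 100,  # quoted text: ">=100 gene copies" in one or more species
) -> Tuple[Orthogroups, Orthogroups]:
    """Drop sparsely represented OGs, then split the rest into small and large sets."""
    small: Orthogroups = {}
    large: Orthogroups = {}
    for og_id, counts in ogs.items():
        # Assumption: a species "has" the OG if it carries at least one copy.
        n_present = sum(1 for c in counts.values() if c > 0)
        if n_present < min_species:
            continue  # OG present in too few species; excluded from the analysis
        if max(counts.values(), default=0) >= large_threshold:
            large[og_id] = counts  # at least one species with >= threshold copies
        else:
            small[og_id] = counts  # every species below the threshold
    return small, large
```

Under these assumptions, the two returned dictionaries would correspond to the 'small' and 'large' OG sets that the statement says were analyzed separately in CAFE.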