Background: Plastome (plastid genome) sequences provide valuable information for understanding the phylogenetic relationships and evolutionary history of plants. Although the rapid development of high-throughput sequencing technology has led to an explosion of plastome sequences, annotation remains a significant bottleneck for plastomes. User-friendly batch annotation of multiple plastomes is an urgent need. Results: We introduce Plastid Genome Annotator (PGA), a standalone command line tool that can perform rapid, accurate, and flexible batch annotation of newly generated target plastomes based on well-annotated reference plastomes. In contrast to existing tools, PGA uses reference plastomes as the query and unannotated target plastomes as the subject to locate genes, which we refer to as the reverse query-subject BLAST search approach. PGA accurately identifies gene and intron boundaries as well as intron loss. The program outputs GenBank-formatted files as well as a log file to assist users in verifying annotations. Comparisons against other available plastome annotation tools demonstrated the high annotation accuracy of PGA, with little or no post-annotation verification necessary. Likewise, we demonstrated the flexibility of reference plastomes within PGA by annotating the plastome of Rosa roxburghii using that of Amborella trichopoda as a reference. The program, user manual and example data sets are freely available at https://github.com/quxiaojian/PGA. Conclusions: PGA facilitates rapid, accurate, and flexible batch annotation of plastomes across plants. For projects in which multiple plastomes are generated, the time savings for high-quality plastome annotation are especially significant.
Fruits are the defining feature of angiosperms, likely have contributed to angiosperm successes by protecting and dispersing seeds, and provide foods to humans and other animals, with many morphological types and important ecological and agricultural implications. Rosaceae is a family with ∼3000 species and an extraordinary spectrum of distinct fruits, including fleshy peach, apple, and strawberry prized by their consumers, as well as dry achenetum and follicetum with features facilitating seed dispersal, excellent for studying fruit evolution. To address Rosaceae fruit evolution and other questions, we generated 125 new transcriptomic and genomic datasets and identified hundreds of nuclear genes to reconstruct a well-resolved Rosaceae phylogeny with highly supported monophyly of all subfamilies and tribes. Molecular clock analysis revealed an estimated age of ∼101.6 Ma for crown Rosaceae and divergence times of tribes and genera, providing a geological and climate context for fruit evolution. Phylogenomic analysis yielded strong evidence for numerous whole genome duplications (WGDs), supporting the hypothesis that the apple tribe had a WGD and revealing another one shared by fleshy fruit-bearing members of this tribe, with moderate support for WGDs in the peach tribe and other groups. Ancestral character reconstruction for fruit types supports independent origins of fleshy fruits from dry-fruit ancestors, including the evolution of drupes (e.g., peach) and pomes (e.g., apple) from follicetum, and drupetum (raspberry and blackberry) from achenetum. We propose that WGDs and environmental factors, including animals, contributed to the evolution of the many fruits in Rosaceae, which provide a foundation for understanding fruit evolution.
The classification of the legume family proposed here addresses the long‐known non‐monophyly of the traditionally recognised subfamily Caesalpinioideae, by recognising six robustly supported monophyletic subfamilies. This new classification uses as its framework the most comprehensive phylogenetic analyses of legumes to date, based on plastid matK gene sequences, and including near‐complete sampling of genera (698 of the currently recognised 765 genera) and ca. 20% (3696) of known species. The matK gene region has been the most widely sequenced across the legumes, and in most legume lineages, this gene region is sufficiently variable to yield well‐supported clades. This analysis resolves the same major clades as in other phylogenies of whole plastid and nuclear gene sets (with much sparser taxon sampling). Our analysis improves upon previous studies that have used large phylogenies of the Leguminosae for addressing evolutionary questions, because it maximises generic sampling and provides a phylogenetic tree that is based on a fully curated set of sequences that are vouchered and taxonomically validated. The phylogenetic trees obtained and the underlying data are available to browse and download, facilitating subsequent analyses that require evolutionary trees. Here we propose a new community‐endorsed classification of the family that reflects the phylogenetic structure that is consistently resolved and recognises six subfamilies in Leguminosae: a recircumscribed Caesalpinioideae DC., Cercidoideae Legume Phylogeny Working Group (stat. nov.), Detarioideae Burmeist., Dialioideae Legume Phylogeny Working Group (stat. nov.), Duparquetioideae Legume Phylogeny Working Group (stat. nov.), and Papilionoideae DC. 
The traditionally recognised subfamily Mimosoideae is a distinct clade nested within the recircumscribed Caesalpinioideae and is referred to informally as the mimosoid clade pending a forthcoming formal tribal and/or clade‐based classification of the new Caesalpinioideae. We provide a key for subfamily identification, descriptions with diagnostic characteristics for the subfamilies, figures illustrating their floral and fruit diversity, and lists of genera by subfamily. This new classification of Leguminosae represents a consensus view of the international legume systematics community; it invokes both compromise and practicality of use.
GetOrganelle is a state-of-the-art toolkit to assemble accurate organelle genomes from NGS data. This toolkit recruits organelle-associated reads using a modified "baiting and iterative mapping" approach, conducts de novo assembly, filters and disentangles the assembly graph, and produces all possible configurations of circular organelle genomes. For 50 published samples, we reassembled the circular plastome in 47 samples using GetOrganelle, but only in 12 samples using NOVOPlasty. In comparison with published/NOVOPlasty plastomes, we demonstrated that GetOrganelle assemblies are more accurate. Moreover, we assembled complete mitogenomes of fungi and animals using GetOrganelle. GetOrganelle is freely released under a GPL-3 license (https://github.com/Kinggerm/GetOrganelle).
Phylogenomic analyses have helped resolve many recalcitrant relationships in the angiosperm tree of life, yet phylogenetic resolution of the backbone of the Leguminosae, one of the largest and most economically and ecologically important families, remains poor due to generally limited molecular data and incomplete taxon sampling of previous studies. Here, we resolve many of the Leguminosae’s thorniest nodes through comprehensive analysis of plastome-scale data using multiple modified coding and noncoding data sets of 187 species representing almost all major clades of the family. Additionally, we thoroughly characterize conflicting phylogenomic signal across the plastome in light of the family’s complex history of plastome evolution. Most analyses produced largely congruent topologies with strong statistical support and provided strong support for resolution of some long-controversial deep relationships among the early diverging lineages of the subfamilies Caesalpinioideae and Papilionoideae. The robust phylogenetic backbone reconstructed in this study establishes a framework for future studies on legume classification, evolution, and diversification. However, conflicting phylogenetic signal was detected and quantified at several key nodes that prevent the confident resolution of these nodes using plastome data alone. [Leguminosae; maximum likelihood; phylogenetic conflict; plastome; recalcitrant relationships; stochasticity; systematic error.]
Background: Flowering plants (angiosperms) are dominant components of global terrestrial ecosystems, but phylogenetic relationships at the familial level and above remain only partially resolved, greatly impeding our full understanding of their evolution and early diversification. The plastome, typically mapped as a circular genome, has been the most important molecular data source for plant phylogeny reconstruction for decades. Results: Here, we assembled by far the largest plastid dataset of angiosperms, composed of 80 genes from 4792 plastomes of 4660 species in 2024 genera representing all currently recognized families. Our phylogenetic tree (PPA II) is essentially congruent with those of previous plastid phylogenomic analyses but generally provides greater clade support. In the PPA II tree, 75% of nodes at or above the ordinal level and 78% at or above the familial level were resolved with high bootstrap support (BP ≥ 90). We obtained strong support for many interordinal and interfamilial relationships that were poorly resolved previously within the core eudicots, such as Dilleniales, Saxifragales, and Vitales being resolved as successive sisters to the remaining rosids, and Santalales, Berberidopsidales, and Caryophyllales as successive sisters to the asterids. However, the placement of magnoliids, although resolved as sister to all other Mesangiospermae, is not well supported and disagrees with topologies inferred from nuclear data. Relationships among the five major clades of Mesangiospermae remain intractable despite increased sampling, probably due to an ancient rapid radiation. Conclusions: We provide the most comprehensive dataset of plastomes to date and a well-resolved phylogenetic tree, which together provide a strong foundation for future evolutionary studies of flowering plants.