DNA barcoding involves sequencing a standard region of DNA as a tool for species identification. However, there has been no agreement on which region(s) should be used for barcoding land plants. To provide a community recommendation on a standard plant barcode, we have compared the performance of 7 leading candidate plastid DNA regions (atpF-atpH spacer, matK gene, rbcL gene, rpoB gene, rpoC1 gene, psbK-psbI spacer, and trnH-psbA spacer). Based on assessments of recoverability, sequence quality, and levels of species discrimination, we recommend the 2-locus combination of rbcL+matK as the plant barcode. This core 2-locus barcode will provide a universal framework for the routine use of DNA sequence data to identify specimens and contribute toward the discovery of overlooked species of land plants.

matK | rbcL | species identification

Large-scale standardized sequencing of the mitochondrial gene CO1 has made DNA barcoding an efficient species identification tool in many animal groups (1). In plants, however, low substitution rates of mitochondrial DNA have led to the search for alternative barcoding regions. From initial investigations of plastid regions (2-4), 7 leading candidates have emerged (5, 6). Four are portions of coding genes (matK, rbcL, rpoB, and rpoC1), and 3 are noncoding spacers (atpF-atpH, trnH-psbA, and psbK-psbI). Different research groups have proposed various combinations of these loci as their preferred plant barcodes, but no consensus has emerged (5-12). This lack of an agreed standard has impeded progress in plant barcoding.

Our aim here is to identify a standard DNA barcode for land plants. To achieve this goal, we have pooled data across laboratories including sequence data from 907 samples, representing 445 angiosperm, 38 gymnosperm, and 67 cryptogam species.
Using various subsets of these data, we evaluated the 7 candidate loci using criteria in the Consortium for the Barcode of Life's (CBOL) data standards and guidelines for locus selection (http://www.barcoding.si.edu/protocols.html). Universality: Which loci can be routinely sequenced across the land plants? Sequence quality and coverage: Which loci are most amenable to the production of bidirectional sequences with few or no ambiguous base calls? Discrimination: Which loci enable most species to be distinguished?

Results

Universality. Direct universality assessments using a single primer pair for each locus in angiosperms resulted in 90%-98% PCR and sequencing success for 6/7 regions. Success for the seventh region, psbK-psbI, was 77% (Fig. 1A). Greater problems were encountered in other land plant groups, with rpoB, matK, atpF-atpH, and psbK-psbI all showing <50% success in gymnosperms and/or cryptogams based on data compiled from several laboratories (Fig. 1A).

Sequence Quality. Evaluation of sequence quality and coverage from the candidate loci demonstrated that high quality bidirectional sequences were routinely obtained from rbcL, rpoC1, and rpoB (Fig. 1B, x axis). The remaining 4 loci required more manual editing and produced f...
DNA barcoding is a technique in which species identification is performed by using DNA sequences from a small fragment of the genome, with the aim of contributing to a wide range of ecological and conservation studies in which traditional taxonomic identification is not practical. DNA barcoding is well established in animals, but there is not yet any universally accepted barcode for plants. Here, we undertook intensive field collections in two biodiversity hotspots (Mesoamerica and southern Africa). Using >1,600 samples, we compared eight potential barcodes. Going beyond previous plant studies, we assessed to what extent a "DNA barcoding gap" is present between intra- and interspecific variations, using multiple accessions per species. Given its adequate rate of variation, easy amplification, and alignment, we identified a portion of the plastid matK gene as a universal DNA barcode for flowering plants. Critically, we further demonstrate the applicability of DNA barcoding for biodiversity inventories. In addition, analyzing >1,000 species of Mesoamerican orchids, DNA barcoding with matK alone reveals cryptic species and proves useful in identifying species listed in Convention on International Trade of Endangered Species (CITES) appendixes.
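The "DNA barcoding gap" mentioned above can be illustrated with a minimal sketch. It assumes pre-aligned sequences and uses a simple uncorrected p-distance; the function names (`p_distance`, `barcoding_gap`), the species labels, and the toy sequences are hypothetical and are not taken from the study, which may have used different distance measures.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def barcoding_gap(samples):
    """Return (max intraspecific, min interspecific, gap) distances.

    samples: list of (species_label, aligned_sequence) pairs.
    A positive gap means every between-species distance exceeds
    every within-species distance in this data set.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(samples, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    if not inter:
        raise ValueError("need at least two species")
    max_intra = max(intra) if intra else 0.0
    min_inter = min(inter)
    return max_intra, min_inter, min_inter - max_intra


# Hypothetical toy alignment: two accessions for each of two species.
toy_samples = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_A", "ACGTACGTAT"),
    ("sp_B", "ACGAACCTAG"),
    ("sp_B", "ACGAACCTAG"),
]
max_intra, min_inter, gap = barcoding_gap(toy_samples)
print(max_intra, min_inter, gap > 0)
```

Multiple accessions per species are needed for the intraspecific side of the comparison, which is why the toy data include two sequences per label.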
Sequencing of target-enriched libraries is an efficient and cost-effective method for obtaining DNA sequence data from hundreds of nuclear loci for phylogeny reconstruction. Much of the cost of developing targeted sequencing approaches is associated with the generation of preliminary data needed for the identification of orthologous loci for probe design. In plants, identifying orthologous loci has proven difficult due to a large number of whole-genome duplication events, especially in the angiosperms (flowering plants). We used multiple sequence alignments from over 600 angiosperms for 353 putatively single-copy protein-coding genes identified by the One Thousand Plant Transcriptomes Initiative to design a set of targeted sequencing probes for phylogenetic studies of any angiosperm group. To maximize the phylogenetic potential of the probes, while minimizing the cost of production, we introduce a k-medoids clustering approach to identify the minimum number of sequences necessary to represent each coding sequence in the final probe set. Using this method, 5–15 representative sequences were selected per orthologous locus, representing the sequence diversity of angiosperms more efficiently than if probes were designed using available sequenced genomes alone. To test our approximately 80,000 probes, we hybridized libraries from 42 species spanning all higher-order groups of angiosperms, with a focus on taxa not present in the sequence alignments used to design the probes. Out of a possible 353 coding sequences, we recovered an average of 283 per species and at least 100 in all species. Differences among taxa in sequence recovery could not be explained by relatedness to the representative taxa selected for probe design, suggesting that there is no phylogenetic bias in the probe set. 
Our probe set, which targeted 260 kbp of coding sequence, achieved a median recovery of 137 kbp per taxon in coding regions, a maximum recovery of 250 kbp, and an additional median of 212 kbp per taxon in flanking non-coding regions across all species. These results suggest that the Angiosperms353 probe set described here is effective for any group of flowering plants and would be useful for phylogenetic studies from the species level to higher-order groups, including the entire angiosperm clade itself.
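The idea of choosing a small number of representative sequences per locus by k-medoids clustering can be sketched as follows. This is a simplified, pure-Python alternating k-medoids on Hamming distances, not the authors' implementation; the function names, the deterministic initialisation, and the toy sequences are assumptions made for illustration only.

```python
def hamming(a, b):
    """Number of differing positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def k_medoids(items, k, dist, max_iter=100):
    """Simple alternating k-medoids; returns (medoid indices, clusters)."""
    n = len(items)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(items)")
    d = [[dist(items[i], items[j]) for j in range(n)] for i in range(n)]

    def assign(medoids):
        # Attach every item to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i in range(n):
            clusters[min(medoids, key=lambda m: d[i][m])].append(i)
        return clusters

    # Deterministic start (an assumption of this sketch): the most central
    # item, then repeatedly the item farthest from the chosen medoids.
    medoids = [min(range(n), key=lambda i: sum(d[i]))]
    while len(medoids) < k:
        rest = [i for i in range(n) if i not in medoids]
        medoids.append(max(rest, key=lambda i: min(d[i][m] for m in medoids)))

    for _ in range(max_iter):
        clusters = assign(medoids)
        # Within each cluster, pick the member with the smallest total
        # distance to the other members as the new medoid.
        new = [
            min(members, key=lambda c: sum(d[c][j] for j in members))
            if members else m
            for m, members in clusters.items()
        ]
        if set(new) == set(medoids):
            break
        medoids = new
    medoids = sorted(medoids)
    return medoids, assign(medoids)


# Hypothetical toy "alignment" with two obvious sequence groups.
toy_seqs = [
    "AAAAAAAA", "AAAAAAAT", "AAAAAATT",
    "CCCCCCCC", "CCCCCCCG", "CCCCCCGG",
]
medoid_idx, clusters = k_medoids(toy_seqs, k=2, dist=hamming)
representatives = [toy_seqs[i] for i in medoid_idx]
print(representatives)
```

Because a medoid is always an actual member of the data set, the selected representatives are real sequences that could serve directly as probe templates, which is the property that makes k-medoids a natural fit here compared with a centroid-based method.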
Summary: The origin of fire-adapted lineages is a long-standing question in ecology. Although phylogeny can provide a significant contribution to the ongoing debate, its use has been precluded by the lack of comprehensive DNA data. Here, we focus on the 'underground trees' (= geoxyles) of southern Africa, one of the most distinctive growth forms characteristic of fire-prone savannas. We placed geoxyles within the most comprehensive dated phylogeny for the regional flora comprising over 1400 woody species. Using this phylogeny, we tested whether African geoxyles evolved concomitantly with those of the South American cerrado and used their phylogenetic position to date the appearance of humid savannas. We found multiple independent origins of the geoxyle life-form mostly from the Pliocene, a period consistent with the origin of cerrado, with the majority of divergences occurring within the last 2 million yr. When contrasted with their tree relatives, geoxyles occur in regions characterized by higher rainfall and greater fire frequency. Our results indicate that the geoxylic growth form may have evolved in response to the interactive effects of frequent fires and high precipitation. As such, geoxyles may be regarded as markers of fire-maintained savannas occurring in climates suitable for forests.
Several major lineages with geographical coherence, as identified in previous studies based on smaller data sets, are supported. Other lineages with either geographical or ecological correspondence are recognized for the first time. Coffea subgenus Baracoffea is shown to be monophyletic, but Coffea subgenus Coffea is paraphyletic. Sequence data do not substantiate the monophyly of either Coffea or Psilanthus. Low levels of sequence divergence do not allow detailed resolution of relationships within Coffea, most notably for species of Coffea subgenus Coffea occurring in Madagascar. The origin of C. arabica by recent hybridization between C. canephora and C. eugenioides is supported. Phylogenetic separation resulting from the presence of the Dahomey Gap is inferred based on sequence data from Coffea.
Savannas first began to spread across Africa during the Miocene. A major hypothesis for explaining this vegetation change is the increase in C4 grasses, promoting fire. We investigated whether mammals could also have contributed to savanna expansion by using spinescence as a marker of mammal herbivory. Looking at the present distribution of 1,852 tree species, we established that spinescence is mainly associated with two functional types of mammals: large browsers and medium-sized mixed feeders. Using a dated phylogeny for the same tree species, we found that spinescence evolved at least 55 times. The diversification of spiny plants occurred long after the evolution of Afrotherian proboscideans and hyracoids. However, it is remarkably congruent with diversification of bovids, the lineage including the antelope that predominantly browse these plants today. Our findings suggest that herbivore-adapted savannas evolved several million years before fire-maintained savannas and probably in different environmental conditions. Spiny savannas with abundant mammal herbivores occur in drier climates and on nutrient-rich soils, whereas fire-maintained savannas occur in wetter climates on nutrient-poor soils.

Africa | Bovidae | coevolution | mammalian herbivory | savanna

The origin and spread of savannas have been topics of intensive research, but many questions remain. The C4 grasses that dominate savannas emerged in the late Oligocene (∼30 Ma), but savannas only began to emerge as one of the world's major biomes in the late Miocene more than 20 My later (1). What changed to roll back the forests, allowing the rapid spread of grasslands? Ehleringer et al. (2) first linked the rise of savannas to a drop in atmospheric CO2, which would favor C4 grasses over their C3 grass predecessors. Low CO2 can also reduce woody cover by increasing the risk of recruitment failure in woody plants whether from drought, fire, or browsing (3).
However, the timing of the onset of low CO2 is much earlier than the spread of savannas; therefore, although low CO2 may have contributed to savanna expansion, it cannot explain the long time lag between C4 origins and savanna spread. Climate change is the usual explanation for changing vegetation over time. Increased aridity in the late Miocene has been shown to cause the retreat of forests in North America and Eurasia, allowing grasslands to spread in their place (4, 5). However, large areas of extant savannas occur in climates that are wet enough to support forests and other closed woody types (6-8). Fires are frequent in high-rainfall savannas and have been considered the major agents accounting for open ecosystems in climates that can support forests. Fossil charcoal, mostly from marine cores, shows a surge in fire activity from the late Miocene correlated with the spread of savannas (9, 10). Phylogenetic studies have shown the emergence of fire-adapted woody plants from the late Miocene through to the Pleistocene in both Brazil and Africa, consistent with fossil evidence for increasing f...
Previous phylogenetic studies have indicated that Acacia Miller s.l. is polyphyletic and in need of reclassification. A proposal to conserve the name Acacia for the larger Australian contingent of the genus (formerly subgenus Phyllodineae) resulted in the retypification of the genus with the Australian A. penninervis. However, Acacia s.l. comprises at least four additional distinct clades or genera, some still requiring formal taxonomic transfer of species. These include Vachellia (formerly subgenus Acacia), Senegalia (formerly subgenus Aculeiferum), Acaciella (formerly subgenus Aculeiferum section Filicinae) and Mariosousa (formerly the A. coulteri group). In light of this fragmentation of Acacia s.l., there is a need to assess relationships of the non‐Australian taxa. A molecular phylogenetic study of Acacia s.l. and close relatives occurring in Africa was conducted using sequence data from matK/trnK, trnL‐trnF and psbA‐trnH with the aim of determining the placement of the African species in the new generic system. The results reinforce the inevitability of recognizing segregate genera for Acacia s.l., and new combinations for the African species in Senegalia and Vachellia are formalized. © 2013 The Linnean Society of London, Botanical Journal of the Linnean Society, 2013, 172, 500–523.
The world’s herbaria collectively house millions of diverse plant specimens, including endangered or extinct species and type specimens. Unlocking genetic data from the typically highly degraded DNA obtained from herbarium specimens was difficult until the arrival of high-throughput sequencing approaches, which can be applied to low quantities of severely fragmented DNA. Target enrichment involves using short molecular probes that hybridise and capture genomic regions of interest for high-throughput sequencing. In this study on herbariomics, we used this targeted sequencing approach and the Angiosperms353 universal probe set to recover up to 351 nuclear genes from 435 herbarium specimens that are up to 204 years old and span the breadth of angiosperm diversity. We show that on average 207 genes were successfully retrieved from herbarium specimens, although the mean number of genes retrieved and target enrichment efficiency is significantly higher for silica gel-dried specimens. Forty-seven target nuclear genes were recovered from a herbarium specimen of the critically endangered St Helena boxwood, Mellissia begoniifolia, collected in 1815. Herbarium specimens yield significantly less high-molecular-weight DNA than silica gel-dried specimens, and genomic DNA quality declines with sample age, which is negatively correlated with target enrichment efficiency. Climate, taxon-specific traits, and collection strategies additionally impact target sequence recovery. We also detected taxonomic bias in targeted sequencing outcomes for the 10 most numerous angiosperm families that were investigated in depth. 
We recommend that (1) for species distributed in wet tropical climates, silica gel-dried specimens should be used preferentially; (2) for species distributed in seasonally dry tropical climates, herbarium and silica gel-dried specimens yield similar results, and either collection can be used; (3) taxon-specific traits should be explored and established for effective optimisation of taxon-specific studies using herbarium specimens; (4) all herbarium sheets should, in future, be annotated with details of the preservation method used; (5) long-term storage of herbarium specimens should be in stable, low-humidity, and low-temperature environments; and (6) targeted sequencing with universal probes, such as Angiosperms353, should be investigated closely as a new approach for DNA barcoding that will ensure better exploitation of herbarium specimens than traditional Sanger sequencing approaches.