DNA barcoding involves sequencing a standard region of DNA as a tool for species identification. However, there has been no agreement on which region(s) should be used for barcoding land plants. To provide a community recommendation on a standard plant barcode, we have compared the performance of 7 leading candidate plastid DNA regions (atpF-atpH spacer, matK gene, rbcL gene, rpoB gene, rpoC1 gene, psbK-psbI spacer, and trnH-psbA spacer). Based on assessments of recoverability, sequence quality, and levels of species discrimination, we recommend the 2-locus combination of rbcL+matK as the plant barcode. This core 2-locus barcode will provide a universal framework for the routine use of DNA sequence data to identify specimens and contribute toward the discovery of overlooked species of land plants.

matK | rbcL | species identification

Large-scale standardized sequencing of the mitochondrial gene CO1 has made DNA barcoding an efficient species identification tool in many animal groups (1). In plants, however, low substitution rates of mitochondrial DNA have led to the search for alternative barcoding regions. From initial investigations of plastid regions (2-4), 7 leading candidates have emerged (5, 6). Four are portions of coding genes (matK, rbcL, rpoB, and rpoC1), and 3 are noncoding spacers (atpF-atpH, trnH-psbA, and psbK-psbI). Different research groups have proposed various combinations of these loci as their preferred plant barcodes, but no consensus has emerged (5-12). This lack of an agreed standard has impeded progress in plant barcoding.

Our aim here is to identify a standard DNA barcode for land plants. To achieve this goal, we have pooled data across laboratories including sequence data from 907 samples, representing 445 angiosperm, 38 gymnosperm, and 67 cryptogam species. Using various subsets of these data, we evaluated the 7 candidate loci using criteria in the Consortium for the Barcode of Life's (CBOL) data standards and guidelines for locus selection (http://www.barcoding.si.edu/protocols.html). Universality: Which loci can be routinely sequenced across the land plants? Sequence quality and coverage: Which loci are most amenable to the production of bidirectional sequences with few or no ambiguous base calls? Discrimination: Which loci enable most species to be distinguished?

Results

Universality. Direct universality assessments using a single primer pair for each locus in angiosperms resulted in 90%-98% PCR and sequencing success for 6/7 regions. Success for the seventh region, psbK-psbI, was 77% (Fig. 1A). Greater problems were encountered in other land plant groups, with rpoB, matK, atpF-atpH, and psbK-psbI all showing <50% success in gymnosperms and/or cryptogams based on data compiled from several laboratories (Fig. 1A).

Sequence Quality. Evaluation of sequence quality and coverage from the candidate loci demonstrated that high quality bidirectional sequences were routinely obtained from rbcL, rpoC1, and rpoB (Fig. 1B, x axis). The remaining 4 loci required more manual editing and produced f...
A universal barcode system for land plants would be a valuable resource, with potential utility in fields as diverse as ecology, floristics, law enforcement and industry. However, the application of plant barcoding has been constrained by a lack of consensus regarding the most variable and technically practical DNA region(s). We compared eight candidate plant barcoding regions from the plastome and one from the mitochondrial genome for how well they discriminated the monophyly of 92 species in 32 diverse genera of land plants (N = 251 samples). The plastid markers comprise portions of five coding (rpoB, rpoC1, rbcL, matK and 23S rDNA) and three non-coding (trnH-psbA, atpF–atpH, and psbK–psbI) loci. Our survey included several taxonomically complex groups, and in all cases we examined multiple populations and species. The regions differed in their ability to discriminate species, and in ease of retrieval, in terms of amplification and sequencing success. Single locus resolution ranged from 7% (23S rDNA) to 59% (trnH-psbA) of species with well-supported monophyly. Sequence recovery rates were related primarily to amplification success (85–100% for plastid loci), with matK requiring the greatest effort to achieve reasonable recovery (88% using 10 primer pairs). Several loci (matK, psbK–psbI, trnH-psbA) were problematic for generating fully bidirectional sequences. Setting aside technical issues related to amplification and sequencing, combining the more variable plastid markers provided clear benefits for resolving species, although with diminishing returns, as all combinations assessed using four to seven regions had only marginally different success rates (69–71%; values that were approached by several two- and three-region combinations). This performance plateau may indicate fundamental upper limits on the precision of species discrimination that is possible with DNA barcoding systems that include moderate numbers of plastid markers. 
Resolution to the contentious debate on plant barcoding should therefore involve increased attention to practical issues related to the ease of sequence recovery, global alignability, and marker redundancy in multilocus plant DNA barcoding systems.
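The diminishing returns from adding loci can be illustrated with a minimal sketch. It assumes a simpler criterion than the monophyly test used in the study: a species counts as resolved when none of its concatenated sequences is shared with another species. The function names, species labels, and toy sequences are hypothetical and are not data from any of the studies.

```python
from itertools import combinations

def resolved_fraction(samples, loci):
    """Fraction of species whose concatenated barcodes are shared with no other species."""
    # Map each concatenated barcode to the set of species that carry it.
    owners = {}
    for species, seqs in samples:
        key = tuple(seqs[locus] for locus in loci)
        owners.setdefault(key, set()).add(species)
    all_species = {species for species, _ in samples}
    # A species is unresolved if any of its barcodes also occurs in another species.
    unresolved = set()
    for carriers in owners.values():
        if len(carriers) > 1:
            unresolved |= carriers
    return (len(all_species) - len(unresolved)) / len(all_species)

def rank_combinations(samples, loci, max_size):
    """Score every locus combination up to max_size; best and smallest first."""
    results = []
    for k in range(1, max_size + 1):
        for combo in combinations(loci, k):
            results.append((combo, resolved_fraction(samples, combo)))
    return sorted(results, key=lambda r: (-r[1], len(r[0])))

# Hypothetical toy data: one sample per species, three loci of 4 bp each.
LOCI = ("rbcL", "matK", "trnH-psbA")
SAMPLES = [
    ("sp_A", {"rbcL": "ACGT", "matK": "TTGA", "trnH-psbA": "GGCA"}),
    ("sp_B", {"rbcL": "ACGT", "matK": "TTGC", "trnH-psbA": "GGCA"}),
    ("sp_C", {"rbcL": "ACGA", "matK": "TTGC", "trnH-psbA": "GGCA"}),
    ("sp_D", {"rbcL": "ACGA", "matK": "TTGC", "trnH-psbA": "GGCT"}),
]

for combo, fraction in rank_combinations(SAMPLES, LOCI, max_size=3):
    print("+".join(combo), f"{fraction:.0%}")
```

With real data the same loop would show whether larger combinations keep improving resolution or level off, which is the pattern the abstract reports for four to seven regions.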
The ability to discriminate between species using barcoding loci has proved more difficult in plants than animals, raising the possibility that plant species boundaries are less well defined. Here, we review a selection of published barcoding data sets to compare species discrimination in plants vs. animals. Although the use of different genetic markers, analytical methods and depths of taxon sampling may complicate comparisons, our results using common metrics demonstrate that the number of species supported as monophyletic using barcoding markers is higher in animals (> 90%) than plants (~70%), even after controlling for the amount of parsimony-informative information per species. This suggests that more than a simple lack of variability limits species discrimination in plants. Both animal and plant species pairs have variable size gaps between intra- and interspecific genetic distances, but animal species tend to have larger gaps than plants, even in relatively densely sampled genera. An analysis of 12 plant genera suggests that hybridization contributes significantly to variation in genetic discontinuity in plants. Barcoding success may be improved in some plant groups by careful choice of markers and appropriate sampling; however, overall fine-scale species discrimination in plants relative to animals may be inherently more difficult because of greater levels of gene-tree paraphyly.
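The gap between intra- and interspecific distances mentioned above can be sketched as follows. This is an illustrative calculation under stated assumptions, not the method of the review: distances are uncorrected p-distances on pre-aligned sequences, the gap for a species is its smallest interspecific distance minus its largest intraspecific distance, and the species labels and sequences are invented.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences, skipping gaps and Ns."""
    pairs = [(x, y) for x, y in zip(a, b) if x not in "-N" and y not in "-N"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def barcoding_gaps(records):
    """records: list of (species, aligned_sequence). Returns {species: gap}.

    gap = min interspecific distance - max intraspecific distance.
    A species with a single sample is given a max intraspecific distance of 0.
    """
    max_intra = {}
    min_inter = {}
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        d = p_distance(s1, s2)
        if sp1 == sp2:
            max_intra[sp1] = max(max_intra.get(sp1, 0.0), d)
        else:
            for sp in (sp1, sp2):
                min_inter[sp] = min(min_inter.get(sp, 1.0), d)
    return {sp: min_inter[sp] - max_intra.get(sp, 0.0) for sp in min_inter}

# Hypothetical aligned sequences (10 bp each).
RECORDS = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_A", "ACGTACGTAT"),
    ("sp_B", "ACGTTCGAAC"),
    ("sp_B", "ACGTTCGAAC"),
    ("sp_C", "ACGTACGTAT"),  # identical to one sp_A sample
]

GAPS = barcoding_gaps(RECORDS)
for species, gap in sorted(GAPS.items()):
    print(species, round(gap, 3))
```

A positive gap means every sample of the species is closer to its conspecifics than to any other species; a zero or negative gap, as for the hypothetical sp_A and sp_C here, marks the overlap that makes discrimination fail.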
Summary
1. A major goal of DNA barcoding is to identify species in local floras and ecological communities. With the consensus of a two-locus DNA barcode (rbcL+matK) by the Consortium for the Barcode of Life (CBOL) Plant Working Group (2009), barcoding efforts have begun to focus on building the barcode library for land plants.
2. Here, we establish a barcoding database for a temperate flora of moderate taxonomic breadth at the Koffler Scientific Reserve, Ontario, Canada based on the rbcL+matK barcode. We evaluated the performance of this combination in comparison with three other potential supplementary regions (the coding region rpoC1 and two non-coding intergenic spacers trnH-psbA and atpF-atpH). We examined these markers singly and in combination to evaluate their discriminatory power among 436 species in 269 genera of land plants.
3. Using high-throughput techniques, we recovered a high-quality sequence from at least one region for 98.2% of the 513 samples screened; 55% had complete coverage across all five gene regions. Sequencing success was highest for rbcL (91.4% of samples collected) and lowest for rpoC1 (74.5%). The two coding regions rbcL and matK provided a relatively high number of high-quality bi-directional sequences compared with the non-coding intergenic spacers, and in combination were able to correctly identify 93.1% of the species sampled. Marginal increases in species resolution were obtained with the inclusion of the trnH-psbA intergenic spacer (95.3%), or by using all five gene regions combined (97.3%).
4. There was a weak relation between the number of species per genus and identification success rate using rbcL+matK; 100% for monotypic genera (70.5% of the flora) and 83.6% for polytypic genera. Identification success using the rbcL+matK barcode was higher (100%) for gymnosperms, bryophytes, lycophytes and monilophytes (collectively representing 5% of the flora), compared with angiosperms (92.7%).
5. Our results indicate that the rbcL+matK barcode can provide an acceptably high rate of species resolution in the context of this and other local northern temperate floras. It does so in a cost-effective manner, with relatively modest laboratory effort, and despite the presence of missing data from individual plastid regions in a subset of samples.
Willows (Salix: Salicaceae) form a major ecological component of Holarctic floras and consequently are an obvious target for a DNA-based identification system. We surveyed two to seven plastid genome regions (~3.8 kb; ~3% of the genome) from 71 Salix species across all five subgenera, to assess their performance as DNA barcode markers. Although Salix has a relatively high level of interspecific hybridization, this may not sufficiently explain the near complete failure of barcoding that we observed: only one species had a unique barcode. We recovered 39 unique haplotypes, from more than 500 specimens, that could be partitioned into six major haplotype groups. A unique variant of group I (haplotype 1*) was shared by 53 species in three of five Salix subgenera. This unusual pattern of haplotype sharing across infrageneric taxa is suggestive of either a massive nonrandom coalescence failure (incomplete lineage sorting), or of repeated plastid capture events, possibly including a historical selective sweep of haplotype 1* across taxonomic sections. The former is unlikely as molecular dating indicates that haplotype 1* originated recently and is nested in the oldest major haplotype group in the genus. Further, we detected significant non-neutrality in the frequency spectrum of mutations in group I, but not outside group I, and demonstrated a striking absence of geographical (isolation by distance) effects in the haplotype distributions of this group. The most likely explanation for the patterns we observed involves recent repeated plastid capture events, aided by widespread hybridization and long-range seed dispersal, but primarily propelled by one or more trans-species selective sweeps.
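The haplotype-sharing pattern described above can be tallied with a short sketch. It assumes each specimen has already been assigned a haplotype label, and it counts a species as having a unique barcode only when none of its haplotypes occurs in any other species. The labels and function names below are hypothetical and are not the Salix data.

```python
from collections import defaultdict

def species_per_haplotype(specimens):
    """specimens: list of (species, haplotype). Returns {haplotype: set of species}."""
    carriers = defaultdict(set)
    for species, haplotype in specimens:
        carriers[haplotype].add(species)
    return dict(carriers)

def species_with_unique_barcode(specimens):
    """Species whose haplotypes are all private, i.e. found in no other species."""
    carriers = species_per_haplotype(specimens)
    haplotypes_of = defaultdict(set)
    for species, haplotype in specimens:
        haplotypes_of[species].add(haplotype)
    return sorted(
        species
        for species, haplotypes in haplotypes_of.items()
        if all(carriers[h] == {species} for h in haplotypes)
    )

# Hypothetical specimen table: (species, haplotype label).
SPECIMENS = [
    ("sp_1", "h1"),
    ("sp_2", "h1"),
    ("sp_3", "h1"),
    ("sp_3", "h2"),  # h2 is private, but sp_3 also carries the shared h1
    ("sp_4", "h3"),
    ("sp_4", "h3"),
]

SHARING = species_per_haplotype(SPECIMENS)
print({h: len(spp) for h, spp in SHARING.items()})
print(species_with_unique_barcode(SPECIMENS))
```

Under this strict criterion a species that carries one private and one shared haplotype, like the hypothetical sp_3, is not counted as identifiable, because some of its specimens would still match other species.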