Recent estimates suggest that the global insect fauna includes fewer than six million species, but this projection is very uncertain because taxonomic work has been limited on some highly diverse groups. Validation of current estimates minimally requires the investigation of all lineages that are diverse enough to have a substantial impact on the final species count. This study represents a first step in this direction; it employs DNA barcoding to evaluate patterns of species richness in 27 orders of Canadian insects. The analysis of over one million specimens revealed species counts congruent with earlier results for most orders. However, Diptera and Hymenoptera were unexpectedly diverse, representing two-thirds of the 46 937 barcode index numbers (=species) detected. Correspondence checks between known species and barcoded taxa showed that sampling was incomplete, a result confirmed by extrapolations from the barcode results which suggest the occurrence of at least 94 000 species of insects in Canada, a near doubling from the prior estimate of 54 000 species. One dipteran family, the Cecidomyiidae, was extraordinarily diverse with an estimated 16 000 species, a 10-fold increase from its predicted diversity. If Canada possesses about 1% of the global fauna, as it does for known taxa, the results of this study suggest the presence of 10 million insect species with about 1.8 million of these taxa in the Cecidomyiidae. If so, the global species count for this fly family may exceed the combined total for all 142 beetle families. If extended to more geographical regions and to all hyperdiverse groups, DNA barcoding can rapidly resolve the current uncertainty surrounding a species count for the animal kingdom. 
A newly detailed understanding of species diversity may illuminate processes important in speciation, as suggested by the discovery that the most diverse insect lineages in Canada employ an unusual mode of reproduction, haplodiploidy. This article is part of the themed issue ‘From DNA barcodes to biomes’.
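The global extrapolation described above is simple proportional scaling, and it can be sketched in a few lines. This is a minimal illustration, assuming Canada's share of the global fauna is exactly 1% (the abstract says only "about 1%"); the function name `extrapolate_global` and the constant `CANADA_SHARE` are hypothetical and do not come from the study.

```python
def extrapolate_global(regional_count: float, regional_share: float) -> float:
    """Scale a regional species count to a global estimate by simple proportion."""
    if not 0 < regional_share <= 1:
        raise ValueError("regional_share must be in (0, 1]")
    return regional_count / regional_share


# Assumption: Canada holds exactly 1% of the global fauna ("about 1%" in the abstract).
CANADA_SHARE = 0.01

# Species counts for Canada as quoted in the abstract.
global_insects = extrapolate_global(94_000, CANADA_SHARE)
global_gall_midges = extrapolate_global(16_000, CANADA_SHARE)

print(f"Global insects: ~{global_insects / 1e6:.1f} million")
print(f"Global Cecidomyiidae: ~{global_gall_midges / 1e6:.1f} million")
```

With a share of exactly 1%, this scaling gives about 9.4 million insect species, in line with the rounded figure of 10 million, and about 1.6 million Cecidomyiidae, somewhat below the 1.8 million quoted. The abstract does not state the exact ratio applied to that family, so the difference presumably reflects a slightly different share or rounding.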
DNA barcoding protocols require the linkage of each sequence record to a voucher specimen that has, whenever possible, been authoritatively identified. Natural history collections would seem an ideal resource for barcode library construction, but they have never seen large-scale analysis because of concerns linked to DNA degradation. The present study examines the strength of this barrier, carrying out a comprehensive analysis of moth and butterfly (Lepidoptera) species in the Australian National Insect Collection. Protocols were developed that enabled tissue samples, specimen data, and images to be assembled rapidly. Using these methods, a five-person team processed 41,650 specimens representing 12,699 species in 14 weeks. Subsequent molecular analysis took about six months, reflecting the need for multiple rounds of PCR as sequence recovery was impacted by age, body size, and collection protocols. Despite these variables and the fact that specimens averaged 30.4 years old, barcode records were obtained from 86% of the species. In fact, one or more barcode-compliant sequences (>487 bp) were recovered from virtually all species represented by five or more individuals, even when the youngest was 50 years old. By assembling specimen images, distributional data, and DNA barcode sequences on a web-accessible informatics platform, this study has greatly advanced accessibility to information on thousands of species. Moreover, much of the specimen data became publicly accessible within days of its acquisition, while most sequence results saw release within three months. As such, this study reveals the speed with which DNA barcode workflows can mobilize biodiversity data, often providing the first web-accessible information for a species. These results further suggest that existing collections can enable the rapid development of a comprehensive DNA barcode library for the most diverse compartment of terrestrial biodiversity – insects.
DNA barcoding aims to accelerate species identification and discovery, but performance tests have shown marked differences in identification success. As a consequence, there remains a great need for comprehensive studies which objectively test the method in groups with a solid taxonomic framework. This study focuses on the 180 species of butterflies in Romania, accounting for about one third of the European butterfly fauna. This country includes five eco-regions, the highest of any in the European Union, and is a good representative for temperate areas. Morphology and DNA barcodes of more than 1300 specimens were carefully studied and compared. Our results indicate that 90 per cent of the species form barcode clusters allowing their reliable identification. The remaining cases involve nine closely related species pairs, some whose taxonomic status is controversial or that hybridize regularly. Interestingly, DNA barcoding was found to be the most effective identification tool, outperforming external morphology, and being slightly better than male genitalia. Romania is now the first country to have a comprehensive DNA barcode reference database for butterflies. Similar barcoding efforts based on comprehensive sampling of specific geographical regions can act as functional modules that will foster the early application of DNA barcoding while a global system is under development.
Although DNA metabarcoding is an attractive approach for monitoring biodiversity, it is often difficult to detect all the species present in a bulk sample. In particular, sequence recovery for a given species depends on its biomass and mitome copy number as well as the primer set employed for PCR. To examine these variables, we constructed a mock community of terrestrial arthropods comprising 374 species. We used this community to examine how species recovery was impacted when amplicon pools were constructed in four ways. The first two protocols involved the construction of bulk DNA extracts from different body segments (Bulk Abdomen, Bulk Leg). The other protocols involved the production of DNA extracts from single legs which were then merged prior to PCR (Composite Leg) or PCR‐amplified separately (Single Leg) and then pooled. The amplicons generated by these four treatments were then sequenced on three platforms (Illumina MiSeq, Ion Torrent PGM and Ion Torrent S5). The choice of sequencing platform did not substantially influence species recovery, although the MiSeq delivered the highest sequence quality. As expected, species recovery was most efficient from the Single Leg treatment because amplicon abundance varied little among taxa. Among the three treatments where PCR occurred after pooling, the Bulk Abdomen treatment produced a more uniform read abundance than the Bulk Leg or Composite Leg treatment. Primer choice also influenced species recovery and evenness. Our results reveal how variation in protocols can have substantial impacts on perceived diversity unless sequencing coverage is sufficient to reach an asymptote.
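The two quantities this abstract compares across treatments, species recovery and the uniformity of read abundance, can be illustrated with a short sketch. The abstract does not say which evenness measure the authors used, so Pielou's evenness (Shannon entropy divided by its maximum) is swapped in here as one common choice. The function name, the `min_reads` threshold, and the toy read counts are all hypothetical.

```python
import math


def recovery_and_evenness(read_counts, expected_species, min_reads=1):
    """Return (fraction of expected species detected, Pielou's evenness of their reads)."""
    # Keep only expected species that reach the read threshold.
    detected = {
        sp: n for sp, n in read_counts.items()
        if sp in expected_species and n >= min_reads
    }
    recovery = len(detected) / len(expected_species)

    # Evenness is undefined for fewer than two species; report 0.0 in that case.
    if len(detected) < 2:
        return recovery, 0.0

    total = sum(detected.values())
    shannon = -sum((n / total) * math.log(n / total) for n in detected.values())
    evenness = shannon / math.log(len(detected))
    return recovery, evenness


# Toy mock community of five species (invented numbers, not data from the study).
expected = {"sp1", "sp2", "sp3", "sp4", "sp5"}
even_pool = {"sp1": 100, "sp2": 100, "sp3": 100, "sp4": 100, "sp5": 100}
skewed_pool = {"sp1": 970, "sp2": 10, "sp3": 10, "sp4": 10}  # sp5 missed entirely

print(recovery_and_evenness(even_pool, expected))
print(recovery_and_evenness(skewed_pool, expected))
```

In this toy case the pool dominated by one species both misses a taxon and shows lower evenness, which is the pattern the abstract attributes to treatments where PCR follows pooling.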
Background: Although high-throughput sequencers (HTS) have largely displaced their Sanger counterparts, the short read lengths and high error rates of most platforms constrain their utility for amplicon sequencing. The present study tests the capacity of single molecule, real-time (SMRT) sequencing implemented on the SEQUEL platform to overcome these limitations, employing 658 bp amplicons of the mitochondrial cytochrome c oxidase I gene as a model system. Results: By examining templates from more than 5000 species and 20,000 specimens, the performance of SMRT sequencing was tested with amplicons showing wide variation in GC composition and varied sequence attributes. SMRT and Sanger sequences were very similar, but SMRT sequencing provided more complete coverage, especially for amplicons with homopolymer tracts. Because it can characterize amplicon pools from 10,000 DNA extracts in a single run, the SEQUEL can greatly reduce sequencing costs in comparison to first (Sanger) and second generation platforms (Illumina, Ion). Conclusions: SMRT analysis generates high-fidelity sequences from amplicons with varying GC content and is resilient to homopolymer tracts. Analytical costs are low, substantially less than those for first or second generation sequencers. When implemented on the SEQUEL platform, SMRT analysis enables massive amplicon characterization because each instrument can recover sequences from more than 5 million DNA extracts a year. Electronic supplementary material: The online version of this article (10.1186/s12864-018-4611-3) contains supplementary material, which is available to authorized users.
DNA barcoding employs short, standardized gene regions (5' segment of mitochondrial cytochrome oxidase subunit I for animals) as an internal tag to enable species identification. Prior studies have indicated that it performs this task well, because interspecific variation at cytochrome oxidase subunit I is typically much greater than intraspecific variation. However, most previous studies have focused on local faunas only, and critics have suggested two reasons why barcoding should be less effective in species identification when the geographical coverage is expanded. First, many recently diverged taxa will be excluded from local analyses because they are allopatric. Second, intraspecific variation may be seriously underestimated by local studies, because geographical variation in the barcode region is not considered. In this paper, we analyse how adding a geographical dimension affects barcode resolution, examining 353 butterfly species from Central Asia. Despite predictions, we found that geographically separated and recently diverged allopatric species did not show, on average, less sequence differentiation than recently diverged sympatric taxa. Although expanded geographical coverage did substantially increase intraspecific variation, reducing the barcoding gap between species, this did not decrease species identification success using neighbour-joining clustering. The inclusion of additional populations increased the number of paraphyletic entities, but did not impede species-level identification, because paraphyletic species were separated from their monophyletic relatives by substantial sequence divergence. Thus, this study demonstrates that DNA barcoding remains an effective identification tool even when taxa are sampled from a large geographical area.
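The "barcoding gap" mentioned above is the difference between the smallest distance from a species to any other species and the largest distance within that species. The sketch below illustrates the concept with uncorrected p-distances on pre-aligned sequences. It is not the neighbour-joining analysis used in the study; the function names and the ten-base toy sequences are invented for illustration, and the input is assumed to contain at least two species.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def barcode_gaps(records):
    """Map each species to (min interspecific distance - max intraspecific distance).

    `records` is a list of (species_name, aligned_sequence) pairs covering
    at least two species. A positive gap means the species is separated
    from all others by more than its own internal variation.
    """
    max_intra = {}
    min_inter = {}
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        d = p_distance(s1, s2)
        if sp1 == sp2:
            max_intra[sp1] = max(max_intra.get(sp1, 0.0), d)
        else:
            for sp in (sp1, sp2):
                min_inter[sp] = min(min_inter.get(sp, 1.0), d)
    species = {sp for sp, _ in records}
    return {sp: min_inter.get(sp, 1.0) - max_intra.get(sp, 0.0) for sp in species}


# Toy aligned sequences (invented, far shorter than a real 658 bp barcode).
records = [
    ("A", "ACGTACGTAC"),
    ("A", "ACGTACGTAT"),  # one site differs from the first A sequence
    ("B", "ACGTTCGTGG"),
    ("B", "ACGTTCGTGG"),
]
gaps = barcode_gaps(records)
for name in sorted(gaps):
    print(name, round(gaps[name], 3))
```

Adding more populations of species A would tend to raise its maximum intraspecific distance and shrink its gap, which is the effect of expanded geographical coverage that the abstract reports.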
Metabarcoding can rapidly determine the species composition of bulk samples and thus aids biodiversity and ecosystem assessment. However, it is essential to use primer sets that minimize amplification bias among taxa to maximize species recovery. Despite this fact, the performance of primer sets employed for metabarcoding terrestrial arthropods has not been sufficiently evaluated. This study tests the performance of 36 primer sets on a mock community containing 374 insect species. Amplification success was assessed with gradient PCRs and the 21 most promising primer sets selected for metabarcoding. These 21 primer sets were also tested by metabarcoding a Malaise trap sample. We identified eight primer sets, mainly those including inosine and/or high degeneracy, that recovered more than 95% of the species in the mock community. Results from the Malaise trap sample were congruent with the mock community, but primer sets generating short amplicons produced potential false positives. Taxon recovery from both mock community and Malaise trap sample metabarcoding was used to select four primer sets for additional evaluation at different annealing temperatures (40–60 °C) using the mock community. The effect of temperature varied by primer pair but overall it only had a minor effect on taxon recovery. This study reveals the weak performance of some primer sets employed in past studies. It also demonstrates that certain primer sets can recover most taxa in a diverse species assemblage. Thus, based on our experimental setup, there is no need to employ several primer sets targeting the same gene region. We identify several suitable primer sets for arthropod metabarcoding, and specifically recommend BF3 + BR2, as it is not affected by primer slippage and provides maximal taxonomic resolution. The fwhF2 + fwhR2n primer set amplifies a shorter fragment and is therefore ideal when targeting degraded DNA (e.g., from gut contents).
Background: DNA-based testing has been gaining acceptance as a tool for authentication of a wide range of food products; however, its applicability for testing of herbal supplements remains contentious. Methods: We utilized Sanger and Next-Generation Sequencing (NGS) for taxonomic authentication of fifteen herbal supplements representing three different producers from five medicinal plants: Echinacea purpurea, Valeriana officinalis, Ginkgo biloba, Hypericum perforatum and Trigonella foenum-graecum. Experimental design included three modifications of DNA extraction, two lysate dilutions, Internal Amplification Control, and multiple negative controls to exclude background contamination. Ginkgo supplements were also analyzed using HPLC-MS for the presence of active medicinal components. Results: All supplements yielded DNA from multiple species, rendering Sanger sequencing results for rbcL and ITS2 regions either uninterpretable or non-reproducible between the experimental replicates. Overall, DNA from the manufacturer-listed medicinal plants was successfully detected in seven out of eight dry herb form supplements; however, low or poor DNA recovery due to degradation was observed in most plant extracts (none detected by Sanger; three out of seven by NGS). NGS also revealed a diverse community of fungi, known to be associated with live plant material and/or the fermentation process used in the production of plant extracts. HPLC-MS testing demonstrated that Ginkgo supplements with degraded DNA contained ten key medicinal components. Conclusion: Quality control of herbal supplements should utilize a synergetic approach targeting both DNA and bioactive components, especially for standardized extracts with degraded DNA. The NGS workflow developed in this study enables reliable detection of plant and fungal DNA and can be utilized by manufacturers for quality assurance of raw plant materials, contamination control during the production process, and the final product.
Interpretation of results should involve an interdisciplinary approach taking into account the processes involved in production of herbal supplements, as well as biocomplexity of plant-plant and plant-fungal biological interactions.