Abstract
1. Metabarcoding of environmental samples has many challenges and limitations that require carefully considered laboratory and analysis workflows to ensure reliable results. We explore how decisions regarding study design, laboratory set-up, and bioinformatic processing affect the final results, and provide guidelines for reliable study of environmental samples.
2. We evaluate the performance of four primer sets targeting COI and 16S regions for characterizing arthropod diversity in bat faecal samples, and investigate how metabarcoding results are affected by parameters including: (1) number of PCR replicates per sample, (2) sequencing depth, (3) PCR replicate processing strategy (i.e. either additively, by combining the sequences obtained from the PCR replicates, or restrictively, by only retaining sequences that occur in multiple PCR replicates for each sample), (4) minimum copy number for sequences to be retained, (5) chimera removal, and (6) similarity thresholds for Operational Taxonomic Unit (OTU) clustering. Lastly, we measure within- and between-taxa dissimilarities when using sequences from public databases to determine the most appropriate thresholds for OTU clustering and taxonomy assignment.
3. Our results show that the use of multiple primer sets reduces taxonomic biases and increases taxonomic coverage. Taxonomic profiles resulting from each primer set are principally affected by how many PCR replicates are carried out per sample and how sequences are filtered across them, the sequence copy number threshold and the OTU clustering threshold. We also report considerable diversity differences between PCR replicates from each sample. Sequencing depth increases the dissimilarity between PCR replicates unless the bioinformatic strategies to remove allegedly artefactual sequences are adjusted according to the number of analysed sequences. Finally, we show that the appropriate identity thresholds for OTU clustering and taxonomy assignment differ between markers.
4. Metabarcoding of complex environmental samples ideally requires (1) investigation of whether more than one primer set targeting the same taxonomic group is needed to offset primer biases, (2) more than one PCR replicate per sample, (3) bioinformatic processing of sequences that balances diversity detection with removal of artefactual sequences, and (4) empirical selection of OTU clustering and taxonomy assignment thresholds tailored to each marker and the obtained taxa.
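The additive and restrictive replicate-processing strategies and the minimum copy number filter described in this abstract can be illustrated with a minimal Python sketch. The function name `merge_replicates`, its parameters, and the choice to apply the copy number threshold within each replicate before merging are assumptions made for illustration; they are not taken from the study's actual pipeline.

```python
from collections import Counter


def merge_replicates(replicates, strategy="additive", min_replicates=2, min_copies=1):
    """Combine per-replicate sequence counts for one sample.

    replicates: list of Counter objects mapping sequence -> copy number,
                one Counter per PCR replicate (hypothetical input layout).
    strategy:   "additive" keeps every sequence seen in any replicate;
                "restrictive" keeps only sequences seen in at least
                `min_replicates` replicates.
    min_copies: sequences below this copy number within a replicate are
                discarded before merging (an assumed order of operations).
    """
    # Apply the minimum copy number threshold inside each replicate.
    filtered = [
        Counter({seq: n for seq, n in rep.items() if n >= min_copies})
        for rep in replicates
    ]

    totals = Counter()    # summed copy number across replicates
    presence = Counter()  # number of replicates in which a sequence occurs
    for rep in filtered:
        totals.update(rep)
        presence.update(rep.keys())

    if strategy == "additive":
        return dict(totals)
    if strategy == "restrictive":
        return {seq: n for seq, n in totals.items() if presence[seq] >= min_replicates}
    raise ValueError(f"unknown strategy: {strategy!r}")


# Toy example: three PCR replicates of one sample.
reps = [
    Counter({"A": 10, "B": 1}),
    Counter({"A": 5, "C": 3}),
    Counter({"A": 2, "C": 1}),
]
additive = merge_replicates(reps, strategy="additive")
restrictive = merge_replicates(reps, strategy="restrictive", min_replicates=2)
```

In this toy data, sequence "B" occurs in a single replicate, so it survives additive merging but is dropped by the restrictive strategy, which is the trade-off between diversity detection and artefact removal that the abstract refers to.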
Metabarcoding of environmental samples on second-generation sequencing platforms has rapidly become a valuable tool for ecological studies. A fundamental assumption of this approach is that tagged amplicons can be tracked back to the samples from which they originated. In this study, we address the problem of sequences in metabarcoding sequencing outputs with false combinations of used tags (tag jumps). Unless these sequences can be identified and excluded from downstream analyses, tag jumps that create sequences with false but already used tag combinations can cause incorrect assignment of sequences to samples and artificially inflate diversity. We document and investigate tag jumping in metabarcoding studies on Illumina sequencing platforms by amplifying mixed-template extracts obtained from bat droppings and leech gut contents with tagged generic arthropod and mammal primers, respectively. We found that an average of 2.6% and 2.1% of sequences had tag combinations that could be explained by tag jumping in the leech and bat diet study, respectively. We suggest that tag jumping can happen during blunt-ending of pools of tagged amplicons during library build and as a consequence of chimera formation during bulk amplification of tagged amplicons during library index PCR. We argue that tag jumping and contamination between libraries represent a considerable challenge for Illumina-based metabarcoding studies, and suggest measures to avoid false assignment of tag jumping-derived sequences to samples.
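One way to picture the screening problem described in this abstract is a sketch that sorts reads by their forward/reverse tag pair. The function `classify_tag_pairs`, the tag labels, and the three-way classification are hypothetical and only illustrate the idea; they are not the authors' method. Note that a jump producing a combination that was itself assigned to a sample cannot be detected this way, which is the case the abstract warns about.

```python
def classify_tag_pairs(read_tags, used_pairs):
    """Sort reads by whether their tag pair was actually assigned to a sample.

    read_tags:  iterable of (forward_tag, reverse_tag) tuples, one per read.
    used_pairs: set of (forward_tag, reverse_tag) combinations assigned to samples.

    Returns three lists:
      assigned  - pair was used for a sample (may still hide undetectable jumps)
      suspected - both tags were used in the pool, but never in this combination
      other     - at least one tag was not used in the pool at all
    """
    used_fwd = {fwd for fwd, _ in used_pairs}
    used_rev = {rev for _, rev in used_pairs}

    assigned, suspected, other = [], [], []
    for fwd, rev in read_tags:
        if (fwd, rev) in used_pairs:
            assigned.append((fwd, rev))
        elif fwd in used_fwd and rev in used_rev:
            suspected.append((fwd, rev))
        else:
            other.append((fwd, rev))
    return assigned, suspected, other


# Toy example with made-up tag labels: two samples, four reads.
used = {("F1", "R1"), ("F2", "R2")}
reads = [("F1", "R1"), ("F1", "R2"), ("F2", "R2"), ("F9", "R1")]
assigned, suspected, other = classify_tag_pairs(reads, used)
suspected_fraction = len(suspected) / len(reads)
```

Here the read tagged F1/R2 combines two tags that were both in use but never paired, so it is flagged as a suspected tag jump, whereas F9/R1 carries a tag that was never used and is set aside separately.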
DNA obtained from environmental samples such as sediments, ice or water (environmental DNA, eDNA), represents an important source of information on past and present biodiversity. It has revealed an ancient forest in Greenland, extended by several thousand years the survival dates for mainland woolly mammoth in Alaska, and pushed back the dates for spruce survival in Scandinavian ice-free refugia during the last glaciation. More recently, eDNA was used to uncover the past 50 000 years of vegetation history in the Arctic, revealing massive vegetation turnover at the Pleistocene/Holocene transition, with implications for the extinction of megafauna. Furthermore, eDNA can reflect the biodiversity of extant flora and fauna, both qualitatively and quantitatively, allowing detection of rare species. As such, trace studies of plant and vertebrate DNA in the environment have revolutionized our knowledge of biogeography. However, the approach remains marred by biases related to DNA behaviour in environmental settings, incomplete reference databases and false positive results due to contamination. We provide a review of the field.
Given the diversity of prey consumed by insectivorous bats, it is difficult to discern the composition of their diet using morphological or conventional PCR-based analyses of their faeces. We demonstrate the use of a powerful alternate tool, the use of the Roche FLX sequencing platform to deep-sequence uniquely 5′ tagged insect-generic barcode cytochrome c oxidase I (COI) fragments, that were PCR amplified from faecal pellets of two free-tailed bat species Chaerephon pumilus and Mops condylurus (family: Molossidae). Although the analyses were challenged by the paucity of southern African insect COI sequences in the GenBank and BOLD databases, similarity to existing collections allowed the preliminary identification of 25 prey families from six orders of insects within the diet of C. pumilus, and 24 families from seven orders within the diet of M. condylurus. Insects identified to families within the orders Lepidoptera and Diptera were widely present among the faecal samples analysed. The two families that were observed most frequently were Noctuidae and Nymphalidae (Lepidoptera). Species-level analysis of the data was accomplished using novel bioinformatics techniques for the identification of molecular operational taxonomic units (MOTU). Based on these analyses, our data provide little evidence of resource partitioning between sympatric M. condylurus and C. pumilus in the Simunye region of Swaziland at the time of year when the samples were collected, although as more complete databases against which to compare the sequences are generated this may have to be re-evaluated.
The application of high‐throughput sequencing‐based approaches to DNA extracted from environmental samples such as gut contents and faeces has become a popular tool for studying dietary habits of animals. Due to the high resolution and prey detection capacity they provide, both metabarcoding and shotgun sequencing are increasingly used to address ecological questions grounded in dietary relationships. Despite their great promise in this context, recent research has unveiled how a wealth of biological (related to the study system) and technical (related to the methodology) factors can distort the signal of taxonomic composition and diversity. Here, we review these studies in the light of high‐throughput sequencing‐based assessment of trophic interactions. We address how the study design can account for distortion factors, and how acknowledging limitations and biases inherent to sequencing‐based diet analyses is essential for obtaining reliable results and thus drawing appropriate conclusions. Furthermore, we suggest strategies to minimize the effect of distortion factors, measures to increase reproducibility, replicability and comparability of studies, and options to scale up DNA sequencing‐based diet analyses. In doing so, we aim to aid end‐users in designing reliable diet studies by informing them about the complexity and limitations of DNA sequencing‐based diet analyses, and encourage researchers to create and improve tools that will eventually drive this field to its maturity.
Preface
There is much interest in using Earth Observation (EO) technology to track biodiversity, ecosystem functions, and ecosystem services, understandable given the fast pace of biodiversity loss. However, because most biodiversity is invisible to EO, EO-based indicators could be misleading, which can reduce the effectiveness of nature conservation and even unintentionally decrease conservation effort. We describe an approach that combines automated recording devices, high-throughput DNA ...
Meeting the Aichi Biodiversity Targets
From Google Earth to airborne sensors, the Copernicus Sentinels, and cube satellites, Earth Observation is undergoing a rapid expansion in capacity, accessibility, resolution, and signal-to-noise ratio, resulting in a recognised shift in our capability for using remote-sensing technologies to monitor biophysical processes on land and water [1][2][3]. These advances are motivating calls to use Earth Observation products to manage our natural environment and to track progress toward global and national policy targets on biodiversity and ecosystem services [4][5][6]. Foremost among these policies are the Strategic Plan for Biodiversity and the Aichi Biodiversity Targets, which were adopted in 2010 by ... products (net primary productivity and fire incidence) that could serve as Essential Biodiversity Variables for the Sahara, despite this biome's suitability for remote sensing due to its visible biodiversity hotspots, remoteness, and availability of long time series. Many of the Aichi Targets require data with species-level resolution, either because some species are direct policy targets (e.g. Target 9: "invasive species controlled or eradicated") or because species compositional data define the metric (e.g. Target 11: "protected areas are ecologically representative and conserved effectively").
... species, but information could be 'borrowed' from data-rich species to increase the precision of predictions for rare species. These procedures were able to compensate for the fact that only 134 total bird species had been detected in the survey, which is less ... The GDM was parameterised with a training dataset of 2280 surveys and fourteen environmental variables and explained 57% of the variation in beta diversity. In addition, ... for linking pure-EO data to biodiversity. The major remaining components of uncertainty relate to generalisability, because only a single FSC-certified reserve was sampled; the applicability of results to arboreal species, which tend to be detected more frequently in forests with disturbed canopy but are not necessarily more widespread in these forests; and wide confidence intervals around parameter estimates for some species as a consequence of sparse data and a fairly ... Another example of the CEOBE approach is the use of Generalised Dissimilarity Modelling to connect EO-derived metrics of habitat degradation and fragmentation [89,90] to over 300 million records of more ...