DNA barcoding has become a promising means for identifying organisms of all life stages. Currently, phenetic approaches and tree-building methods have been used to define species boundaries and discover 'cryptic species'. However, a universal threshold of genetic distance values to distinguish taxonomic groups cannot be determined. As an alternative, DNA barcoding approaches can be 'character based', whereby species are identified through the presence or absence of discrete nucleotide substitutions (character states) within a DNA sequence. We demonstrate the potential of character-based DNA barcodes by analysing 833 odonate specimens from 103 localities belonging to 64 species. A total of 54 species and 22 genera could be discriminated reliably through unique combinations of character states within only one mitochondrial gene region (NADH dehydrogenase 1). Character-based DNA barcodes were further successfully established at a population level discriminating seven population-specific entities out of a total of 19 populations belonging to three species. Thus, for the first time, DNA barcodes have been found to identify entities below the species level that may constitute separate conservation units or even species units. Our findings suggest that character-based DNA barcoding can be a rapid and reliable means for (i) the assignment of unknown specimens to a taxonomic group, (ii) the exploration of diagnosability of conservation units, and (iii) complementing taxonomic identification systems.
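To make the idea of a diagnostic character state concrete, the following is a minimal sketch and not the authors' pipeline. It assumes sequences are already aligned to equal length, and it finds only "pure" single-position diagnostics: positions where every specimen of a species shares one nucleotide that no other species shows at that position. The function name `diagnostic_positions`, the toy species labels, and the toy sequences are all hypothetical. The unique combinations of character states mentioned in the abstract would require a search over several positions at once, which this sketch does not attempt.

```python
from collections import defaultdict


def diagnostic_positions(aligned):
    """Return {species: {position: nucleotide}} for pure diagnostic states.

    `aligned` maps a species label to a list of aligned sequences.
    A state is reported only if it is fixed within the species and
    absent from every other species at that (0-based) position.
    """
    lengths = {len(seq) for seqs in aligned.values() for seq in seqs}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    n_sites = lengths.pop()

    result = defaultdict(dict)
    for pos in range(n_sites):
        # Set of nucleotides observed per species at this position.
        states = {sp: {seq[pos] for seq in seqs} for sp, seqs in aligned.items()}
        for sp, own in states.items():
            if len(own) != 1:
                continue  # polymorphic within the species, not diagnostic
            others = set().union(*(st for o, st in states.items() if o != sp))
            (state,) = own
            if state not in others:
                result[sp][pos] = state
    return dict(result)


# Hypothetical toy alignment, for illustration only.
toy = {
    "sp_A": ["ACGTAC", "ACGTAC"],
    "sp_B": ["ACCTAT", "ACCTAT"],
    "sp_C": ["GCGTAT", "GCGTAA"],
}
found = diagnostic_positions(toy)
print(found)
```

In this toy alignment each species has exactly one diagnostic position. Position 5 is not diagnostic for `sp_B` because `sp_C` is polymorphic there and shares the `T`.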
The success of character-based DNA barcoding depends on the efficient identification of diagnostic character states from molecular sequences that have been organized hierarchically (e.g., according to phylogenetic methods). Similarly, the reliability of these identified diagnostic character states must be assessed according to their ability to diagnose new sequences. Here, a set of software tools is presented that implement the previously described Characteristic Attribute Organization System for both diagnostic identification and diagnostic-based classification. The software is publicly available from http://sarkarlab.mbl.edu/CAOS.
Introduction
DNA barcoding initiatives have, to date, relied heavily on distance-based and computationally intensive tree-based methods for both diagnostic identification and later classification of new data. Alternatively, character-based methods can be used to identify classification rules based on an existing hierarchical organization, and then rapidly classify new data without requiring intensive phylogenetic approaches. The utility of a character-based approach to DNA barcoding has been discussed in a theoretical context and in the context of DNA barcoding's relevance to classical taxonomy (Prendini, 2005; DeSalle et al., 2005; DeSalle, 2006; Rubinoff, 2006a, 2006b; Williams and Ebach, 2006; Little and Stevenson, 2007). Character-based approaches have also been shown to be feasible and effective in recent publications (Rach et al., 2008; Kelly et al., 2007). However, an operational approach to, and software for, character-based DNA barcoding has been lacking.
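The classification step described above, in which rules derived from an existing hierarchy are applied to new data without rebuilding a tree, can be sketched as a simple descent through a rule tree. This is an illustrative assumption about how such rules might be applied, not the CAOS software's actual algorithm or file format. The node layout (`name`, `rules`, `children`), the function `classify`, and the toy taxa are all hypothetical.

```python
def classify(query, node):
    """Follow diagnostic rules from the root towards the tips.

    Each child node carries `rules`, a dict {position: nucleotide}.
    At every level the query moves into the single child whose rules
    all match; the descent stops when no child or several children match.
    Returns the list of node names visited.
    """
    path = [node["name"]]
    while node.get("children"):
        matches = [
            child
            for child in node["children"]
            if all(
                pos < len(query) and query[pos] == state
                for pos, state in child["rules"].items()
            )
        ]
        if len(matches) != 1:
            break  # unresolved or ambiguous at this level
        node = matches[0]
        path.append(node["name"])
    return path


# Hypothetical rule tree, for illustration only.
tree = {
    "name": "root",
    "children": [
        {
            "name": "genus_X",
            "rules": {0: "A"},
            "children": [
                {"name": "sp_A", "rules": {5: "C"}, "children": []},
                {"name": "sp_B", "rules": {2: "C"}, "children": []},
            ],
        },
        {"name": "genus_Y", "rules": {0: "G"}, "children": []},
    ],
}

print(classify("ACGTAC", tree))
print(classify("GCGTAT", tree))
print(classify("TCGTAT", tree))
```

Stopping at the deepest unambiguous node, rather than forcing a species-level call, reflects the point made above that the reliability of diagnostics has to be judged by how well they diagnose new sequences. Note that a child with an empty rule set would match every query in this sketch.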
Over the past two decades, there has been a long-standing debate about the impact of taxon sampling on phylogenetic inference. Studies have been based on both real and simulated data sets, within actual and theoretical contexts, and using different inference methods, to study the impact of taxon sampling. In some cases, conflicting conclusions have been drawn for the same data set. The main questions explored in studies to date have been about the effects of using sparse data, adding new taxa, including more characters from genome sequences and using different (or concatenated) locus regions. These questions can be reduced to more fundamental ones about the assessment of data quality and the design guidelines of taxon sampling in phylogenetic inference experiments. This review summarizes progress to date in understanding the impact of taxon sampling on the accuracy of phylogenetic analysis.
A detailed phylogenetic analysis of tetraspanins from 10 fully sequenced metazoan genomes and several fungal and protist genomes gives insight into their evolutionary origins and organization. Our analysis suggests that the superfamily can be divided into four large families. These four families (the CD family, CD63 family, uroplakin family, and RDS family) each comprise several ortholog groups. The clustering of several ortholog groups together, such as the CD9/Tsp2/CD81 cluster, suggests functional relatedness of those ortholog groups. Because our studies are based on whole-genome analysis, we were able to estimate not only the phylogenetic relationships among the tetraspanins, but also the first appearance in the tree of life of certain tetraspanin ortholog groups. Taken together, our data suggest that the tetraspanins are derived from a single (or a few) ancestral gene(s) through sequence divergence, rather than convergence, and that the majority of tetraspanins found in the human genome are vertebrate (21 instances), tetrapod (4 instances), or mammalian (6 instances) inventions.
A variety of bioactive proteins from medicinal leeches, like species of Hirudo, have been characterized and evaluated for their potential therapeutic biomedical properties. However, there has not previously been a comprehensive attempt to fully characterize the salivary transcriptome of a medicinal leech that would allow a clearer understanding of the suite of polypeptides employed by these sanguivorous annelids and provide insights regarding their evolutionary origins. An Expressed Sequence Tag (EST) library-based analysis of the salivary transcriptome of the North American medicinal leech, Macrobdella decora, reveals a complex cocktail of anticoagulants and other bioactive secreted proteins not previously known to exist in a single leech. Transcripts were identified that correspond to each of saratin, bdellin, destabilase, hirudin, decorsin, endoglucoronidase, antistatin, and eglin, as well as to other previously uncharacterized predicted serine protease inhibitors, lectoxin-like c-type lectins, ficolin, disintegrins and histidine-rich proteins. This work provides a lens into the richness of bioactive polypeptides that are associated with sanguivory. In the context of a well-characterized molecular phylogeny of leeches, the results allow for preliminary evaluation of the relative evolutionary origins and historical conservation of leech salivary components. The goal of identifying evolutionarily significant residues associated with biomedically significant phenomena implies continued insights from a broader sampling of blood-feeding leech salivary transcriptomes.
http://nypg.bio.nyu.edu/orthologid/
Incorporating substantial intraspecific genetic variation for 19 species from 131 individual chitons, genus Mopalia (Mollusca: Polyplacophora), we present rigorous DNA barcodes for this genus as per the currently accepted approaches to DNA barcoding. We have also performed a second kind of analysis that relies on neither BLAST nor the distance-based neighbour-joining approach currently implemented on the Barcode of Life Data Systems website. Our character-based approach, called characteristic attribute organization system, returns fast, accurate, character-based diagnostics and can unambiguously distinguish between even closely related species based on these diagnostics. Using statistical subsampling approaches with our original data matrix, we show that the method outperforms BLAST and is as effective as the neighbour-joining approach. Our approach differs from the neighbour-joining approach in that the end-product is a list of diagnostic nucleotide positions that can be used in descriptions of species. In addition, the diagnostics obtained from this character-based approach can be used to design oligonucleotides for detection arrays, polymerase chain reaction drop-off diagnostics, TaqMan assays, and primers for generating short fragments that encompass regions containing diagnostics in the cytochrome oxidase I gene.