IntroductionThe PCR-based analysis of homologous genes has become one of the most powerful approaches for species detection and identification, particularly with the recent availability of Next Generation Sequencing platforms (NGS) making it possible to identify species composition from a broad range of environmental samples. Identifying species from these samples relies on the ability to match sequences with reference barcodes for taxonomic identification. Unfortunately, most studies of environmental samples have targeted ribosomal markers, despite the fact that the mitochondrial Cytochrome c Oxidase subunit I gene (COI) is by far the most widely available sequence region in public reference libraries. This is largely because the available versatile (“universal”) COI primers target the 658 barcoding region, whose size is considered too large for many NGS applications. Moreover, traditional barcoding primers are known to be poorly conserved across some taxonomic groups.ResultsWe first design a new PCR primer within the highly variable mitochondrial COI region, the “mlCOIintF” primer. We then show that this newly designed forward primer combined with the “jgHCO2198” reverse primer to target a 313 bp fragment performs well across metazoan diversity, with higher success rates than versatile primer sets traditionally used for DNA barcoding (i.e. LCO1490/HCO2198). Finally, we demonstrate how the shorter COI fragment coupled with an efficient bioinformatics pipeline can be used to characterize species diversity from environmental samples by pyrosequencing. We examine the gut contents of three species of planktivorous and benthivorous coral reef fish (family: Apogonidae and Holocentridae). After the removal of dubious COI sequences, we obtained a total of 334 prey Operational Taxonomic Units (OTUs) belonging to 14 phyla from 16 fish guts. Of these, 52.5% matched a reference barcode (>98% sequence similarity) and an additional 32% could be assigned to a higher taxonomic level using Bayesian assignment.ConclusionsThe molecular analysis of gut contents targeting the 313 COI fragment using the newly designed mlCOIintF primer in combination with the jgHCO2198 primer offers enormous promise for metazoan metabarcoding studies. We believe that this primer set will be a valuable asset for a range of applications from large-scale biodiversity assessments to food web studies.
Until now the most efficient solution to align nucleotide sequences containing open reading frames was to use indirect procedures that align amino acid translation before reporting the inferred gap positions at the codon level. There are two important pitfalls with this approach. Firstly, any premature stop codon impedes using such a strategy. Secondly, each sequence is translated with the same reading frame from beginning to end, so that the presence of a single additional nucleotide leads to both aberrant translation and alignment.We present an algorithm that has the same space and time complexity as the classical Needleman-Wunsch algorithm while accommodating sequencing errors and other biological deviations from the coding frame. The resulting pairwise coding sequence alignment method was extended to a multiple sequence alignment (MSA) algorithm implemented in a program called MACSE (Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons). MACSE is the first automatic solution to align protein-coding gene datasets containing non-functional sequences (pseudogenes) without disrupting the underlying codon structure. It has also proved useful in detecting undocumented frameshifts in public database sequences and in aligning next-generation sequencing reads/contigs against a reference coding sequence.MACSE is distributed as an open-source java file executable with freely available source code and can be used via a web interface at: http://mbb.univ-montp2.fr/macse.
We have sequenced the genome of the emerging human pathogen Babesia microti and compared it with that of other protozoa. B. microti has the smallest nuclear genome among all Apicomplexan parasites sequenced to date with three chromosomes encoding ∼3500 polypeptides, several of which are species specific. Genome-wide phylogenetic analyses indicate that B. microti is significantly distant from all species of Babesidae and Theileridae and defines a new clade in the phylum Apicomplexa. Furthermore, unlike all other Apicomplexa, its mitochondrial genome is circular. Genome-scale reconstruction of functional networks revealed that B. microti has the minimal metabolic requirement for intraerythrocytic protozoan parasitism. B. microti multigene families differ from those of other protozoa in both the copy number and organization. Two lateral transfer events with significant metabolic implications occurred during the evolution of this parasite. The genomic sequencing of B. microti identified several targets suitable for the development of diagnostic assays and novel therapies for human babesiosis.
The origin, evolution, and functional relevance of genomic variations in GC content are a long-debated topic, especially in mammals. Most of the existing literature, however, has focused on a small number of model species and/or limited sequence data sets. We analyzed more than 1000 orthologous genes in 33 fully sequenced mammalian genomes, reconstructed their ancestral isochore organization in the maximum likelihood framework, and explored the evolution of third-codon position GC content in representatives of 16 orders and 27 families. We showed that the previously reported erosion of GC-rich isochores is not a general trend. Several species (e.g., shrew, microbat, tenrec, rabbit) have independently undergone a marked increase in GC content, with a widening gap between the GC-poorest and GC-richest classes of genes. The intensively studied apes and (especially) murids do not reflect the general placental pattern. We correlated GC-content evolution with species life-history traits and cytology. Significant effects of body mass and genome size were detected, with each being consistent with the GC-biased gene conversion model.
International audienceTree reconciliation methods aim at estimating the evolutionary events that cause discrepancy between gene trees and species trees. We provide a discrete computational model that considers duplications, transfers and losses of genes. The model yields a fast and exact algorithm to infer time consistent and most parsimonious reconciliations. Then we study the conditions under which parsimony is able to accurately infer such events. Overall, it performs well even under realistic rates, transfers being in general less accurately recovered than duplications. An implementation is freely available at http://www.atgc-montpellier.fr/MPR
Multiple sequence alignment is a prerequisite for many evolutionary analyses. Multiple Alignment of Coding Sequences (MACSE) is a multiple sequence alignment program that explicitly accounts for the underlying codon structure of protein-coding nucleotide sequences. Its unique characteristic allows building reliable codon alignments even in the presence of frameshifts. This facilitates downstream analyses such as selection pressure estimation based on the ratio of nonsynonymous to synonymous substitutions. Here, we present MACSE v2, a major update with an improved version of the initial algorithm enriched with a complete toolkit to handle multiple alignments of protein-coding sequences. A graphical interface now provides user-friendly access to the different subprograms.
Laonastes aenigmamus is an enigmatic rodent first described in 2005. Molecular and morphological data suggested that it is the sole representative of a new mammalian family, the Laonastidae, and a member of the Hystricognathi. However, the validity of this family is controversial because fossil-based phylogenetic analyses suggest that Laonastes is a surviving member of the Diatomyidae, a family considered to have been extinct for 11 million years. According to these data, Laonastes and Diatomyidae are the sister clade of extant Ctenodactylidae (i.e., gundies) and do not belong to the Hystricognathi. To solve the phylogenetic position of Laonastes, we conducted a large-scale molecular phylogeny of rodents. The analysis includes representatives of all major rodent taxonomic groups and was based on 5.5 kb of sequence data from four nuclear and two mitochondrial genes. To further validate the obtained results, a short interspersed element insertion analysis including 11 informative loci was also performed. Our molecular data based on sequence and short interspersed element analyses unambiguously placed Laonastes as a sister clade of gundies. All alternative hypotheses were significantly rejected based on ShimodairaHasegawa tests, supporting the idea that Laonastes does not belong to the Hystricognathi. Molecular dating analysis also supports an ancient divergence, Ϸ44 Mya ago, between Ctenodactylidae and Laonastes. These combined analyses support the hypothesis that Laonastes is indeed a living fossil. Protection of this surviving species would conserve an ancient mammalian family.Laonastes aenigmamus ͉ molecular phylogeny ͉ rodent ͉ retroposons
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
334 Leonard St
Brooklyn, NY 11211
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.