Premise of the study: Plann automates the process of annotating a plastome sequence in GenBank format for either downstream processing or for GenBank submission by annotating a new plastome based on a similar, well-annotated plastome. Methods and Results: Plann is a Perl script to be executed on the command line. Plann compares a new plastome sequence to the features annotated in a reference plastome and then shifts the intervals of any matching features to the locations in the new plastome. Plann’s output can be used in the National Center for Biotechnology Information’s tbl2asn to create a Sequin file for GenBank submission. Conclusions: Unlike Web-based annotation packages, Plann is a locally executable script that will accurately annotate a plastome sequence to a locally specified reference plastome. Because it executes from the command line, it is ready to use in other software pipelines and can be easily rerun as a draft plastome is improved.
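The interval-shifting idea described above can be illustrated with a minimal Python sketch. This is not Plann's code (Plann is a Perl script): the `Feature` type, the function names, and the toy sequences are all hypothetical, and exact substring matching stands in for the sequence comparison step, which the abstract does not specify.

```python
from typing import List, NamedTuple, Optional


class Feature(NamedTuple):
    """A hypothetical annotated feature; coordinates are 0-based, end-exclusive."""
    name: str
    start: int
    end: int


def transfer_feature(feature: Feature, reference: str, target: str) -> Optional[Feature]:
    """Relocate one reference feature onto the target, or return None if it has no exact match."""
    query = reference[feature.start:feature.end]
    pos = target.find(query)  # assumption: exact matching replaces a real similarity search
    if pos == -1:
        return None
    return Feature(feature.name, pos, pos + len(query))


def annotate(features: List[Feature], reference: str, target: str) -> List[Feature]:
    """Keep only the features that match, with intervals shifted to target coordinates."""
    moved = (transfer_feature(f, reference, target) for f in features)
    return [f for f in moved if f is not None]


reference = "AAATGCCCGGGTTTACGT"
target = "TTAAATGCCCGGGTTTACGT"  # the reference with two extra bases at the start
features = [Feature("geneA", 3, 9), Feature("geneB", 9, 15)]

for feature in annotate(features, reference, target):
    print(feature)
```

Both toy features move two positions downstream, matching the two-base insertion at the start of the target. A feature with no match is simply dropped from the output.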
Summary: As molecular phylogenetic analyses incorporate ever-greater numbers of loci, cases of cytonuclear discordance, the phenomenon in which nuclear gene trees deviate significantly from organellar gene trees, are being reported more frequently. Plant examples of topological discordance, caused by recent hybridization between extant species, are well known. However, examples of branch-length discordance are less reported in plants relative to animals. We use a combination of de novo assembly and reference-based mapping using short-read shotgun sequences to construct a robust phylogeny of the plastome for multiple individuals of all the common Populus species in North America. We demonstrate a case of strikingly high plastome divergence, in contrast to little nuclear genome divergence, in two closely related balsam poplars, Populus balsamifera and Populus trichocarpa (Populus balsamifera ssp. trichocarpa). Previous studies with nuclear loci indicate that the two species (or subspecies) diverged since the late Pleistocene, whereas their plastomes indicate deep divergence, dating to at least the Pliocene (6-7 Myr ago). Our finding is in marked contrast to the estimated Pleistocene divergence of the nuclear genomes, previously calculated at 75 000 yr ago, suggesting plastid capture from a 'ghost lineage' of a now-extinct North American poplar.
Novel sequencing technologies are rapidly expanding the size of data sets that can be applied to phylogenetic studies. Currently the most commonly used phylogenomic approaches involve some form of genome reduction. While these approaches make assembling phylogenomic data sets more economical for organisms with large genomes, they reduce the genomic coverage and thereby the long-term utility of the data. Currently, for organisms with moderate to small genomes ($<$1000 Mbp) it is feasible to sequence the entire genome at modest coverage ($10-30\times$). Computational challenges for handling these large data sets can be alleviated by assembling targeted reads, rather than assembling the entire genome, to produce a phylogenomic data matrix. Here we demonstrate the use of automated Target Restricted Assembly Method (aTRAM) to assemble 1107 single-copy ortholog genes from whole genome sequencing of sucking lice (Anoplura) and out-groups. We developed a pipeline to extract exon sequences from the aTRAM assemblies by annotating them with respect to the original target protein. We aligned these protein sequences with the inferred amino acids and then performed phylogenetic analyses on both the concatenated matrix of genes and on each gene separately in a coalescent analysis. Finally, we tested the limits of successful assembly in aTRAM by assembling 100 genes from close- to distantly related taxa at high to low levels of coverage. Both the concatenated analysis and the coalescent-based analysis produced the same tree topology, which was consistent with previously published results and resolved weakly supported nodes. These results demonstrate that this approach is successful at developing phylogenomic data sets from raw genome sequencing reads. Further, we found that with coverages above $5-10\times$, aTRAM was successful at assembling 80-90% of the contigs for both close and distantly related taxa.
As sequencing costs continue to decline, we expect full genome sequencing will become more feasible for a wider array of organisms, and aTRAM will enable mining of these genomic data sets for an extensive variety of applications, including phylogenomics. [aTRAM; gene assembly; genome sequencing; phylogenomics.]
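The coverage figures quoted above follow the usual definition of average sequencing depth: total sequenced bases divided by genome size. The Python sketch below works through that arithmetic. The function names are illustrative, and the genome size, read length, and read count are hypothetical round numbers, not values from the study.

```python
import math


def expected_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth: total sequenced bases divided by genome size (all in base pairs)."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads whose combined length reaches the target depth."""
    return math.ceil(target_coverage * genome_size / read_length)


# Hypothetical example: a 100 Mbp genome sequenced with 100 bp reads.
genome_size = 100_000_000
read_length = 100

print(expected_coverage(10_000_000, read_length, genome_size))
print(reads_needed(30, read_length, genome_size))
```

Under these assumed numbers, ten million reads give 10x coverage, and 30x requires thirty million reads.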
Background: Assembling genes from next-generation sequencing data is not only time consuming but computationally difficult, particularly for taxa without a closely related reference genome. Assembling even a draft genome using de novo approaches can take days, even on a powerful computer, and these assemblies typically require data from a variety of genomic libraries. Here we describe software that will alleviate these issues by rapidly assembling genes from distantly related taxa using a single library of paired-end reads: aTRAM, automated Target Restricted Assembly Method. The aTRAM pipeline uses a reference sequence, BLAST, and an iterative approach to target and locally assemble the genes of interest. Results: Our results demonstrate that aTRAM rapidly assembles genes across distantly related taxa. In comparative tests with a closely related taxon, aTRAM assembled the same sequence as reference-based and de novo approaches, taking on average < 1 min per gene. As a test case with divergent sequences, we assembled >1,000 genes from six taxa ranging from 25-110 million years divergent from the reference taxon. The gene recovery was between 97-99% from each taxon. Conclusions: aTRAM can quickly assemble genes across distantly related taxa, obviating the need for draft genome assembly of all taxa of interest. Because aTRAM uses a targeted approach, loci can be assembled in minutes depending on the size of the target. Our results suggest that this software will be useful in rapidly assembling genes for phylogenomic projects covering a wide taxonomic range, as well as other applications. The software is freely available at http://www.github.com/juliema/aTRAM. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0515-2) contains supplementary material, which is available to authorized users.
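The iterative targeting idea can be illustrated with a toy Python sketch. This is not aTRAM's implementation: shared k-mers stand in for the BLAST search, the local assembly step is omitted (the recruited reads themselves extend the query), and every name and sequence below is hypothetical.

```python
from typing import Iterable, Set


def kmers(seq: str, k: int) -> Set[str]:
    """All substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def recruit_reads(target: str, reads: Iterable[str], k: int = 4, max_iterations: int = 5) -> Set[str]:
    """Iteratively collect reads that share a k-mer with the target or with earlier hits."""
    reads = list(reads)
    bait = kmers(target, k)          # starts as the reference target only
    recruited: Set[str] = set()
    for _ in range(max_iterations):
        hits = {r for r in reads if r not in recruited and kmers(r, k) & bait}
        if not hits:
            break                    # nothing new matched, so stop early
        recruited |= hits
        for read in hits:
            bait |= kmers(read, k)   # new hits extend the query for the next round
    return recruited


# Toy locus covered by three overlapping reads, plus one unrelated read.
seed = "ATGGCG"                      # short target matching only the start of the locus
reads = ["ATGGCGTACC", "TACCTTGAGG", "GAGGCATT", "CCCCAAAACCCC"]

print(sorted(recruit_reads(seed, reads)))
print(sorted(recruit_reads(seed, reads, max_iterations=1)))
```

With a single iteration only the read overlapping the seed is found. Later iterations walk outward along the locus through overlapping reads, while the unrelated read is never recruited.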
Microsatellite markers (N = 5) were developed for analysis of genetic variation in 15 populations of the columnar cactus Stenocereus stellatus, managed under traditional agriculture practices in central Mexico. Microsatellite diversity was analyzed within and among populations, between geographic regions, and among population management types to provide detailed insight into historical gene flow rates and population dynamics associated with domestication. Our results corroborate a greater diversity in populations managed by farmers compared with wild ones (HE = 0.64 vs. 0.55), but with variation among populations across regions. Although farmers propagated S. stellatus vegetatively in home gardens to diversify their stock, asexual recruitment also occurred naturally in populations where more marginal conditions have limited sexual recruitment, resulting in lower genetic diversity. Therefore, a clear-cut relationship between the occurrence of asexual recruitment and genetic diversity was not evident. Two managed populations adjacent to towns were identified as major sources of gene movement in each sampled region, with significant migration to distant as well as nearby populations. Coupled with the absence of significant bottlenecks, this suggests a mechanism for promoting genetic diversity in managed populations through long-distance gene exchange. Cultivation of S. stellatus in close proximity to wild populations has led to complex patterns of genetic variation across the landscape that reflect the interaction of natural and cultural processes. As molecular markers become available for nontraditional crops and novel analysis techniques allow us to detect and evaluate patterns of genetic diversity, genetic studies provide valuable insights into managing crop genetic resources into the future against a backdrop of global change. Traditional agriculture systems play an important role in maintaining genetic diversity for plant species.
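For readers unfamiliar with the HE statistic, expected heterozygosity at a locus is commonly computed as one minus the sum of squared allele frequencies. The Python sketch below shows that calculation on made-up allele calls. The abstract does not state which estimator the authors used, so this is the basic form only, and the function name and data are hypothetical.

```python
from collections import Counter
from typing import Sequence


def expected_heterozygosity(alleles: Sequence[str]) -> float:
    """H_E = 1 - sum(p_i^2), where p_i is the frequency of allele i at one locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical allele calls at a single microsatellite locus (repeat lengths as labels).
one_allele = ["152", "152", "152", "152"]
two_alleles = ["152", "152", "156", "156"]

print(expected_heterozygosity(one_allele))
print(expected_heterozygosity(two_alleles))
```

A locus fixed for one allele gives 0, and two equally common alleles give 0.5. A population-level value is then typically an average across loci.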