Advances in next-generation sequencing (NGS) technology have led to the development of many different assembly algorithms, but few of them focus on assembling organelle genomes. These genomes are used in phylogenetic studies and food identification, and they are the most deposited eukaryotic genomes in GenBank. Producing an organelle genome assembly from whole genome sequencing (WGS) data would be the most accurate and least laborious approach, but a tool specifically designed for this task is lacking. We developed a seed-and-extend algorithm that assembles organelle genomes from WGS data, starting from a single seed sequence, which can be closely or distantly related. The algorithm has been tested on several new (Gonioctena intermedia and Avicennia marina) and public (Arabidopsis thaliana and Oryza sativa) whole genome Illumina data sets, on which it outperforms known assemblers in assembly accuracy and coverage. In our benchmark, NOVOPlasty assembled all tested circular genomes in less than 30 min with a maximum memory requirement of 16 GB and an accuracy over 99.99%. In conclusion, NOVOPlasty is the sole de novo assembler that provides a fast and straightforward extraction of the extranuclear genomes from WGS data as a single circular, high-quality contig. The software is open source and can be downloaded at https://github.com/ndierckx/NOVOPlasty.
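The seed-and-extend idea described above can be illustrated with a toy example. The sketch below is not NOVOPlasty's implementation: the function name `extend_seed`, the k-mer size, and the majority-vote rule are all assumptions made for illustration. It extends a seed to the right one base at a time, using reads that overlap the current contig end, and stops when the contig end reproduces the start of the seed, which is taken here as a sign that a circular genome has been closed.

```python
from collections import Counter


def extend_seed(seed, reads, k=5, max_len=1000):
    """Greedily extend `seed` to the right (toy sketch, hypothetical helper).

    Returns (contig, circular), where `circular` is True if the extension
    ran back into the first k bases of the seed.
    """
    # Index every k-mer found in the reads by the bases observed right after it.
    followers = {}
    for read in reads:
        for i in range(len(read) - k):
            followers.setdefault(read[i:i + k], Counter())[read[i + k]] += 1

    contig = seed
    start = seed[:k]
    while len(contig) < max_len:
        votes = followers.get(contig[-k:])
        if not votes:
            break  # no read supports a further extension
        # Majority vote over all reads that overlap the contig end.
        contig += votes.most_common(1)[0][0]
        # Only test for circularity once the last k bases lie wholly past the seed.
        if len(contig) >= len(seed) + k and contig.endswith(start):
            return contig[:-k], True
    return contig, False


if __name__ == "__main__":
    # Made-up 24 bp circular "genome" and overlapping 12 bp reads.
    genome = "ATGCGTACCTGAAGTCTTGGACAC"
    doubled = genome * 2
    reads = [doubled[i:i + 12] for i in range(0, len(genome), 3)]
    contig, circular = extend_seed(genome[:8], reads, k=5)
    print(contig, circular)
```

A real assembler would also have to extend in both directions, tolerate sequencing errors, and resolve repeats such as the inverted repeat of chloroplast genomes; none of that is attempted here.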
COLOMBOS is a database that integrates publicly available transcriptomics data for several prokaryotic model organisms. Compared to the previous version, it has more than doubled in size, both in terms of species and data available. The manually curated condition annotation has been overhauled as well, giving more complete information about samples’ experimental conditions and their differences. In terms of functionality, cross-species analyses now enable users to analyse expression data for all species simultaneously and to identify candidate genes with evolutionarily conserved expression behaviour. All the expression-based query tools have been substantially improved: they are no longer limited to retrieving co-expressed data and can instead return more complex patterns of expression behaviour. COLOMBOS is freely available through a web application at http://colombos.net/. The complete database is also accessible via a REST API or downloadable as tab-delimited text files.
Heteroplasmy, the existence of multiple mitochondrial haplotypes within an individual, has been studied across different scientific fields. Mitochondrial genome polymorphisms have been linked to multiple severe disorders and are of interest to evolutionary studies and forensic science. Before the development of massively parallel sequencing (MPS), most studies of mitochondrial genome variation were limited to short fragments and to heteroplasmic variants present at a relatively high frequency (>10%). With ultra-deep sequencing, it has now become possible to uncover previously undiscovered patterns of intra-individual polymorphisms. Despite these technological advances, it is still challenging to determine the origin of the observed intra-individual polymorphisms. We therefore developed a new method that not only detects intra-individual polymorphisms within mitochondrial and chloroplast genomes more accurately, but also looks for linkage among polymorphic sites by assembling the sequence around each detected polymorphic site. Our benchmark study shows that this method is capable of detecting heteroplasmy more accurately than any method previously available and is the first tool that is able to completely or partially reconstruct the sequence for each mitochondrial haplotype (allele). The method is implemented in our open source software NOVOPlasty, which can be downloaded at https://github.com/ndierckx/NOVOPlasty.
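To make the two ideas in this abstract concrete (calling low-frequency variants and checking linkage between polymorphic sites), here is a minimal sketch. It is not the NOVOPlasty method: the function names, the input layout, and the `min_depth` and `min_freq` thresholds are hypothetical, and linkage is approximated by counting allele pairs on reads that span both sites rather than by assembling the sequence around each site.

```python
from collections import Counter


def heteroplasmic_sites(pileup, min_depth=100, min_freq=0.01):
    """Return {position: (major, minor, minor_freq)} for polymorphic columns.

    `pileup` maps a position to a string of the bases observed there
    (assumed layout; thresholds are illustrative defaults, not published values).
    """
    sites = {}
    for pos, bases in pileup.items():
        counts = Counter(b for b in bases.upper() if b in "ACGT")
        depth = sum(counts.values())
        if depth < min_depth or len(counts) < 2:
            continue  # too shallow, or only one allele observed
        (major, _), (minor, minor_n) = counts.most_common(2)
        freq = minor_n / depth
        if freq >= min_freq:
            sites[pos] = (major, minor, freq)
    return sites


def allele_pairs(reads, pos_a, pos_b):
    """Count (allele at pos_a, allele at pos_b) on reads covering both sites.

    `reads` is a list of (start, sequence) tuples for gapless alignments.
    """
    pairs = Counter()
    for start, seq in reads:
        end = start + len(seq)
        if start <= pos_a < end and start <= pos_b < end:
            pairs[(seq[pos_a - start], seq[pos_b - start])] += 1
    return pairs
```

Allele pairs that consistently occur together on the same reads suggest that the two variants sit on the same haplotype. This only works for sites closer together than the read length, which is one reason an assembly-based approach like the one described above can reach further.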