Background: Obtaining complete gene structures is one major goal of genome assembly. Some gene regions are fragmented in both low-quality and high-quality assemblies. Therefore, new approaches are needed to recover gene regions. Genomes are widely transcribed, generating messenger and non-coding RNAs. These widespread transcripts can be used to scaffold genomes and complete transcribed regions. Results: We present P_RNA_scaffolder, a fast and accurate tool using paired-end RNA-sequencing reads to scaffold genomes. This tool aims to improve the completeness of both protein-coding and non-coding genes. After this tool was applied to scaffolding human contigs, the structures of both protein-coding genes and circular RNAs were almost completely recovered and equivalent to those in a complete genome, especially for long proteins and long circular RNAs. Tested in various species, P_RNA_scaffolder exhibited higher speed and efficiency than the existing state-of-the-art scaffolders. This tool also improved the contiguity of genome assemblies generated by current mate-pair scaffolding and third-generation single-molecule sequencing assembly. Conclusions: P_RNA_scaffolder can improve the contiguity of genome assembly and benefit gene prediction. This tool is available at http://www.fishbrowser.org/software/P_RNA_scaffolder. Electronic supplementary material: The online version of this article (10.1186/s12864-018-4567-3) contains supplementary material, which is available to authorized users.
Teleosts have more types of chromatophores than other vertebrates, and the genetic basis for pigmentation is highly conserved among vertebrates. Therefore, teleosts are important models for studying the mechanism of pigmentation. Although functional genes and genetic variations of pigmentation have been studied, the mechanisms underlying different skin colorations remain poorly understood. The koi strain of common carp has various colors and patterns, making it a good model for studying the genetic basis of pigmentation. We performed RNA-sequencing for red skin and white skin and identified 62 differentially expressed genes (DEGs). Most of them were validated with RT-qPCR. The up-regulated DEGs in red skin were enriched in Kupffer’s vesicle development, while the up-regulated DEGs in white skin were involved in cytoskeletal protein binding, sarcomere organization and glycogen phosphorylase activity. The distinct enriched activities might be associated with different structures and functions in erythrophores and iridophores. The DNA methylation levels of two selected DEGs inversely correlated with gene expression, indicating the participation of DNA methylation in coloration. This expression characterization of red and white skin, along with the accompanying transcriptome-wide expression data, will be a useful resource for further studies of pigment cell biology.
Motivation: Recovering gene structures is one of the important goals of genome assembly. In low-quality assemblies, and even some high-quality assemblies, certain gene regions are still incomplete; thus, novel scaffolding approaches are required to complete gene regions. Results: We developed an efficient and fast genome scaffolding method called PEP_scaffolder, using proteins to scaffold genomes. The pipeline aims to recover protein-coding gene structures. We tested the method on human contigs; using human UniProt proteins as guides, the N50 size increased by 17% with an accuracy of ∼97%. PEP_scaffolder improved the proportion of fully covered proteins among all proteins, which was close to the proportion in the finished genome. The method achieved a high accuracy of 91% using orthologs of distant species. Tested on simulated fly contigs, PEP_scaffolder outperformed other scaffolders, with the shortest running time and the highest accuracy. Availability and Implementation: The software is freely available at http://www.fishbrowser.org/software/PEP_scaffolder/ Contact: lijt@cafs.ac.cn Supplementary information: Supplementary data are available at Bioinformatics online.
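For readers unfamiliar with the N50 statistic reported above, the following is a minimal sketch of how it is commonly computed: sort the sequence lengths from longest to shortest and return the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions, not part of PEP_scaffolder, and the scaffolded example ignores any gap bases a real scaffolder would insert.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly size.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (made-up numbers, not from the paper).
contigs = [100, 80, 60, 40, 20]
# Hypothetical result after joining the two longest contigs into one scaffold.
scaffolds = [180, 60, 40, 20]

before = n50(contigs)
after = n50(scaffolds)
relative_increase = (after - before) / before

print(before, after, relative_increase)
```

In this toy case the N50 rises from 80 to 180; a reported improvement such as the 17% above is the same kind of relative change measured on real assemblies.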