High-throughput RNA sequencing (RNA-seq) is considered a powerful tool for novel gene discovery and fine-tuned transcriptional profiling. The digital nature of RNA-seq is also believed to simplify meta-analysis and to reduce the background noise associated with hybridization-based approaches. The development of multiplex sequencing enables efficient and economical parallel analysis of gene expression. In addition, RNA-seq is of particular value when low RNA expression or modest changes between samples are monitored. However, recent data uncovered severe bias in the sequencing of small non-protein coding RNA (small RNA-seq or sRNA-seq), such that the expression levels of some RNAs appeared to be artificially enhanced and others diminished or even undetectable. The use of different adapters and barcodes during ligation, as well as complex RNA structures and modifications, drastically influences cDNA synthesis efficiency; these are examples of sources of bias in deep sequencing. In addition, differences in G/C content among RNAs are associated with unequal polymerase chain reaction (PCR) amplification efficiencies. Given the central importance of RNA-seq to molecular biology and personalized medicine, we review recent findings that challenge small non-protein coding RNA-seq data and suggest approaches and precautions to overcome or minimize bias.
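As a small illustration of the G/C-content point above, the following sketch computes the G/C fraction of RNA sequences so that reads or candidates can be compared on that axis. It is only an assumption-laden example: the function name `gc_fraction` and the two sequences are hypothetical and do not come from the reviewed studies, and the G/C fraction alone does not predict amplification efficiency.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # Count bases that are G or C and divide by the total length.
    return sum(base in "GC" for base in seq) / len(seq)


# Hypothetical example sequences (not taken from any data set).
gc_rich = "GGCCGCGG"
au_rich = "AUAUAUGC"

print(f"{gc_rich}: {gc_fraction(gc_rich):.2f}")
print(f"{au_rich}: {gc_fraction(au_rich):.2f}")
```

Grouping candidates by such a value is one simple way to check whether under-represented RNAs cluster at the extremes of G/C content.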
Nonprotein-coding RNAs (npcRNAs) represent an important class of regulatory molecules that act in many cellular pathways. Here, we describe the experimental identification and validation of the small npcRNA transcriptome of the human malaria parasite Plasmodium falciparum. We identified 630 novel npcRNA candidates. Based on sequence and structural motifs, 43 of them belong to the C/D and H/ACA-box subclasses of small nucleolar RNAs (snoRNAs) and small Cajal body-specific RNAs (scaRNAs). We further observed the exonization of a functional H/ACA snoRNA gene, which might contribute to the regulation of ribosomal protein L7a gene expression. Some of the small npcRNA candidates are from telomeric and subtelomeric repetitive regions, suggesting their potential involvement in maintaining telomeric integrity and subtelomeric gene silencing. We also detected 328 cis-encoded antisense npcRNAs (asRNAs) complementary to P. falciparum protein-coding genes of a wide range of biochemical pathways, including determinants of virulence and pathology. All cis-encoded asRNA genes tested exhibit lifecycle-specific expression profiles. For all but one of the respective sense–antisense pairs, we deduced concordant patterns of expression. Our findings have important implications for a better understanding of gene regulatory mechanisms in P. falciparum, revealing an extended and sophisticated npcRNA network that may control the expression of housekeeping genes and virulence factors.
New deep RNA sequencing methodologies in transcriptome analyses identified a wealth of novel nonprotein-coding RNAs (npcRNAs). Recently, deep sequencing was used to delineate the small npcRNA transcriptome of the human pathogen Vibrio cholerae, and 627 novel npcRNA candidates were identified. Here, we report the detection of 223 npcRNA candidates in V. cholerae using a different cDNA library construction approach and conventional sequencing methods. Remarkably, only 39 of the candidates were common to both surveys. We therefore examined possible biasing influences in the transcriptome analyses. Key steps, including tailing and adapter ligations for generating cDNA, contribute qualitatively and quantitatively to the discrepancies between data sets. In addition, the state of 5′-end phosphorylation influences the efficiency of adapter ligation and C-tailing at the 3′-end of the RNA. Finally, our data indicate that the inclusion of sample-specific molecular identifier sequences during ligation steps also leads to biases in cDNA representation. In summary, even deep sequencing is unlikely to identify all RNA species, and caution should be used for meta-analyses among alternatively generated data sets.
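To put the overlap between the two V. cholerae surveys in proportion, the counts quoted above (627 and 223 candidates, 39 shared) can be turned into simple overlap fractions. This is a minimal sketch; the function name `overlap_summary` is hypothetical, and the Jaccard index is used here only as a convenient summary, not as a measure reported by the authors.

```python
def overlap_summary(n_a: int, n_b: int, n_shared: int) -> dict:
    """Summarize the overlap between two candidate sets from their sizes."""
    if n_shared > min(n_a, n_b):
        raise ValueError("shared count cannot exceed either set size")
    union = n_a + n_b - n_shared  # candidates found in at least one survey
    return {
        "shared_fraction_of_a": n_shared / n_a,
        "shared_fraction_of_b": n_shared / n_b,
        "jaccard": n_shared / union,
    }


# Counts taken from the abstract: deep sequencing (627), conventional (223), shared (39).
summary = overlap_summary(627, 223, 39)
for name, value in summary.items():
    print(f"{name}: {value:.1%}")
```

With these counts, roughly 6% of the deep-sequencing candidates and about 17% of the conventionally sequenced candidates appear in both surveys.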
Circular RNAs (circRNAs) are an emerging class of RNA molecules that have been linked to human diseases and important regulatory pathways. Their functional roles are still under investigation, often hampered by inefficient circRNA formation in vivo and ex vivo. We generated an intron-mediated enhancement (IME) system that, in comparison to previously published methods, increases circRNA formation up to 5-fold. This strategy also revealed previously undetected translation of circRNA, e.g., circRtn4. Substantiated by Western blots and mass spectrometry, we showed that in mammalian cells, translation of circRtn4 containing a potential “infinite” circular reading frame resulted in “monomers” and extended proteins, presumably “multimer” tandem repeats. In order to achieve high levels of circRNA formation and translation of other natural or recombinant circRNAs, we constructed a versatile circRNA expression vector, pCircRNA-DMo. We demonstrated the general applicability of this method by efficiently generating two additional circRNAs exhibiting high expression levels. The circRNA expression vector will be an important tool to investigate different aspects of circRNA biogenesis and to gain insights into mechanisms of circular RNA translation.