Background
Transposable Elements (TE) are mobile sequences that make up large portions of eukaryotic genomes. The roles they play within the complex cellular architecture are still not clearly understood, but it is becoming evident that TE are involved in several physiological and pathological processes. In particular, it has been shown that TE transcription is necessary for the correct development of mouse embryos and that TE expression can finely modulate the transcription of coding and non-coding genes. Moreover, TE activity in the central nervous system (CNS) and other tissues has been correlated with the creation of somatic mosaicism and with pathologies such as neurodevelopmental and neurodegenerative diseases as well as cancers.
Results
We analyzed TE expression among different cell types of the Caenorhabditis elegans (C. elegans) early embryo, asking if, where and when TE are expressed and whether their expression is correlated with genes playing a role in early embryo development. To answer these questions, we took advantage of a public C. elegans embryonic single-cell RNA-seq (sc-RNAseq) dataset and developed a bioinformatics pipeline that quantifies reads mapping specifically to TE, without counting reads that map to TE fragments embedded in coding/non-coding transcripts. Our results suggest that i) canonical TE expression analysis tools, which do not discard reads mapping to TE fragments embedded in annotated transcripts, may over-estimate TE expression levels, ii) Long Terminal Repeat (LTR) elements are mostly expressed in undifferentiated cells and might play a role in pluripotency maintenance and activation of the innate immune response, iii) non-LTR elements are expressed in differentiated cells, in particular in neurons and nervous system-associated tissues, and iv) DNA TE are homogeneously expressed throughout C. elegans early embryo development.
Conclusions
TE expression appears finely modulated in the C. elegans early embryo, and different TE classes are expressed in different cell types and stages, suggesting that TE might play diverse functions during early embryo development.
Summary
Transposable Elements (TEs) play key roles in crucial biological pathways. Therefore, several tools enabling the quantification of their expression were recently developed. However, many of the existing tools cannot distinguish between the transcription of autonomously expressed TEs and that of TE fragments embedded in canonical coding/non-coding non-TE transcripts. Consequently, an apparent change in the expression of a given TE may simply reflect variation in the expression of the transcripts containing TE-derived sequences. To overcome this issue, we have developed TEspeX, a pipeline for the quantification of TE expression at the consensus level. TEspeX uses Illumina RNA-seq short reads to quantify TE expression while excluding reads derived from inactive TE fragments embedded in canonical transcripts.
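The filtering idea described above can be illustrated with a minimal sketch. This is not TEspeX's code or interface: the function name `count_te_specific_reads`, the input layout (a dictionary from read ID to the reference names the read aligns to) and the equal splitting of reads shared between several TE consensus sequences are all assumptions made for illustration only.

```python
from collections import Counter


def count_te_specific_reads(alignments, te_names):
    """Toy TE-specific counting (hypothetical helper, not the TEspeX API).

    alignments: dict mapping read ID -> iterable of reference names the
                read aligns to (assumed input format).
    te_names:   set of TE consensus names.
    Returns (per-TE counts, number of reads discarded as transcript-embedded).
    """
    counts = Counter()
    n_embedded = 0
    for read_id, refs in alignments.items():
        refs = set(refs)
        te_hits = refs & te_names
        if not te_hits:
            # The read does not touch any TE consensus: not relevant here.
            continue
        if refs - te_names:
            # The read also aligns to a non-TE transcript, so it may come
            # from a TE fragment embedded in that transcript: discard it.
            n_embedded += 1
            continue
        # Assumption: a read shared by several TE consensus sequences is
        # split equally among them.
        weight = 1.0 / len(te_hits)
        for te in te_hits:
            counts[te] += weight
    return counts, n_embedded


# Made-up example data: read2 hits both a TE and a gene transcript.
example_alignments = {
    "read1": ["TE_A"],
    "read2": ["TE_A", "geneX_mRNA"],
    "read3": ["TE_B"],
    "read4": ["geneY_mRNA"],
    "read5": ["TE_A", "TE_B"],
}
te_counts, n_embedded = count_te_specific_reads(
    example_alignments, {"TE_A", "TE_B"}
)
print(dict(te_counts), n_embedded)
```

In this toy data, read2 is discarded because it also aligns to a gene transcript, whereas a tool that counts every read overlapping a TE would attribute it to TE_A and so inflate that element's apparent expression.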
Availability and Implementation
The tool is implemented in Python 3, distributed under the GNU General Public License (GPL) and available on GitHub at https://github.com/fansalon/TEspeX (Zenodo URL: https://doi.org/10.5281/zenodo.6800331).
Supplementary information
Supplementary data are available at Bioinformatics online.