“…Taken together, these protein coding probability data are consistent with previous studies that have suggested caution is warranted when extrapolating genome sequence analyses to infer TE-CDS exaptation events [ 13 , 16 , 35 , 36 ]. In particular, the notion that non-autonomous TEs that do not encode any protein, including SINEs such as the Alu family of elements, can emerge as protein coding sequences after being incorporated into exons has been directly challenged [ 13 ].…”
Section: Results (supporting)
confidence: 89%
“…TE sequences that are spliced into mRNAs, actually encode protein sequences. However, this assumption has been challenged on several different fronts [ 13 , 16 , 35 , 36 ]. In particular, it is unclear whether non-autonomous TEs that do not encode any protein, such as Alu elements, actually provide protein coding sequences after becoming exonized [ 13 ].…”
Background: Transposable element (TE) sequences, once thought to be merely selfish or parasitic members of the genomic community, have been shown to contribute a wide variety of functional sequences to their host genomes. Analysis of complete genome sequences has turned up numerous cases where TE sequences have been incorporated as exons into mRNAs, and it is widely assumed that such 'exonized' TEs encode protein sequences. However, the extent to which TE-derived sequences actually encode proteins is unknown and a matter of some controversy. We have tried to address this outstanding issue from two perspectives: (i) by evaluating ascertainment biases related to the search methods used to uncover TE-derived protein coding sequences (CDS) and (ii) through a probabilistic codon-frequency-based analysis of the protein coding potential of TE-derived exons.
“…Consistent with the conservative approach of Pavlicek et al., a more recent publication from the Nekrutenko group refuted one of their own earlier discoveries of a mouse CDS that appeared to be derived almost entirely from SINEs [47]. Comparative sequence analysis with other Mus species, as well as the rat, did not find any evidence for the conservation of the ORF of this TE-derived gene.…”
The activity of transposable elements (TEs) has had a profound impact on the evolution of eukaryotic genomes. Once thought to be purely selfish genomic entities, TEs are now recognized to occupy a continuum of relationships, ranging from parasitic to mutualistic, with their host genomes. One of the many ways that TEs contribute to the function and evolution of the genomes in which they reside is through the donation of host protein coding sequences (CDSs). In this chapter, we will describe several notable examples of eukaryotic host CDSs that are derived from TEs. Despite the existence of a number of such well-established cases, the overall extent and significance of this phenomenon remain a matter of controversy. Genome-scale computational analyses have yielded vastly different estimates for the fraction of host CDSs that are derived from TEs. We explain how these seemingly contradictory findings are the result of specific ascertainment biases introduced by the different methods used to detect TE-related sequences. In light of this problem, we propose a comprehensive and systematic framework for definitively characterizing the contribution of TEs to eukaryotic CDSs.