2018
DOI: 10.1371/journal.pcbi.1006097

A machine learning based framework to identify and classify long terminal repeat retrotransposons

Abstract: Transposable elements (TEs) are repetitive nucleotide sequences that make up a large portion of eukaryotic genomes. They can move and duplicate within a genome, increasing genome size and contributing to genetic diversity within and across species. Accurate identification and classification of TEs present in a genome is an important step towards understanding their effects on genes and their role in genome evolution. We introduce TE-Learner, a framework based on machine learning that automatically identifies T…

Citations: cited by 28 publications (25 citation statements)
References: 30 publications
“…In recent years, much bioinformatics software has been developed to detect TEs (Girgis, 2015) and, although they follow different strategies (such as homology-based, structure-based, de novo, and using comparative genomics), these lack sensitivity and specificity due to the polymorphic structures of TEs (Su, Gu & Peterson, 2019). Loureiro et al (Loureiro et al, 2013a) proved that ML could be used to improve the accuracy of TEs detection by combining results obtained by several conventional software and training a classifier using these results (Schietgat et al, 2018), (Loureiro et al, 2013b). Loureiro's work provided novel evidence for the use of ML in TEs, yet it did not use ML to obtain the predictions, making the results too dependent on traditional algorithms.…”
Section: Benefits Of ML Over Bioinformatics (Q1) | mentioning
confidence: 99%
“…Loureiro's work provided novel evidence for the use of ML in TEs, yet it did not use ML to obtain the predictions, making the results too dependent on traditional algorithms. Using the Random Forest algorithm, Schietgat et al were able to improve results obtained by popular bioinformatics software (which followed a homology-based strategy) such as Censor, RepeatMasker, and LTRDigest (Schietgat et al, 2018) in the detection of LTR retrotransposons. The authors proposed a framework called TE-Learner LTR, which outperformed LTRDigest in recall and RepeatMasker and Censor in terms of precision.…”
Section: Benefits Of ML Over Bioinformatics (Q1) | mentioning
confidence: 99%
“…Due to the high diversity of TE structures and transposition mechanisms, there are still numerous classification problems and debates on the classification systems (Piégu et al, 2015). TEs in eukaryotes are traditionally classified based on if the reverse transcription is needed for transposition (Class I or retrotransposons) or not (Class II or DNA transposons) (Schietgat et al, 2018). Retrotransposons can be further subclassified into four orders according to structural features and the life cycle of the element.…”
Section: Introduction | mentioning
confidence: 99%