2019
DOI: 10.7717/peerj.8311

A systematic review of the application of machine learning in the detection and classification of transposable elements

Abstract: Background: Transposable elements (TEs) constitute the most common repeated sequences in eukaryotic genomes. Recent studies demonstrated their deep impact on species diversity, adaptation to the environment and diseases. Although there are many conventional bioinformatics algorithms for detecting and classifying TEs, none have achieved reliable results on different types of TEs. Machine learning (ML) techniques can automatically extract hidden patterns and novel information from labeled or non-labeled data and …

Cited by 26 publications (23 citation statements)
References 62 publications (133 reference statements)
“…In this study, we compare the performance of the most commonly used ML and DL algorithms in bioinformatics (Orozco-Arias et al, 2019a) in the task of classifying by supervised and unsupervised techniques. We used the 11_Tumor database and applied different preprocessing strategies.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…TE classification is generally performed hierarchically [16], whereby TEs are first divided into classes according to their replication cycle: Class I or retrotransposons, which follow a copy-and-paste strategy using an RNA intermediate; and Class II or DNA transposons that use a cut-and-paste mobility mechanism through a DNA molecule [17]. Next, TE levels correspond to orders, superfamilies, lineages (also called families), and sub-families [18]. Among these, long terminal repeat (LTR) retrotransposons (LTR-RTs, an order of retrotransposons) are the most abundant TEs in plant genomes [19, 20] and can account for up to 80% of the plant genome size, such as in wheat, barley, or rubber tree [21].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
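
The hierarchy described in the statement above (class, then order, superfamily, lineage, sub-family) can be pictured as a nested label. The following is only a minimal sketch under stated assumptions: the names `TELabel`, `LEVELS`, and `MECHANISM`, and the string codes `"I"`, `"II"`, and `"LTR"`, are hypothetical illustrations and do not come from the cited papers or from any TE classification tool.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hierarchy levels, most general first, as listed in the quoted statement.
LEVELS = ("class", "order", "superfamily", "lineage", "sub-family")

# Mobility mechanism per class, paraphrasing the quoted statement.
# The keys "I" and "II" are an assumed encoding for this sketch.
MECHANISM = {
    "I": "retrotransposon: copy-and-paste via an RNA intermediate",
    "II": "DNA transposon: cut-and-paste via a DNA molecule",
}


@dataclass(frozen=True)
class TELabel:
    """Hypothetical hierarchical label; deeper levels may be unassigned."""
    te_class: str
    order: Optional[str] = None
    superfamily: Optional[str] = None
    lineage: Optional[str] = None
    subfamily: Optional[str] = None

    def path(self) -> List[Tuple[str, str]]:
        """Return assigned (level, value) pairs, stopping at the first gap."""
        values = (self.te_class, self.order, self.superfamily,
                  self.lineage, self.subfamily)
        assigned = []
        for level, value in zip(LEVELS, values):
            if value is None:
                break
            assigned.append((level, value))
        return assigned

    def depth(self) -> int:
        """Number of hierarchy levels that have been assigned."""
        return len(self.path())


# An LTR retrotransposon resolved only down to the order level.
ltr_rt = TELabel(te_class="I", order="LTR")
print(ltr_rt.path())
print(MECHANISM[ltr_rt.te_class])
```

Stopping at the first unassigned level reflects the top-down nature of the scheme: a deeper level is only meaningful once every level above it has been decided.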
“…These mobile elements can accumulate large copy numbers in their host genomes [4] and have been found in all organisms. The majority of the nuclear DNA content of large genomes is composed of TEs, such as in wheat, barley, and maize [5][6][7] for plants. In humans, these elements (or TE-derived sequences) comprise ~50-70% of the sequenced genome [8].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…In humans, these elements (or TE-derived sequences) comprise ~50-70% of the sequenced genome [8]. Several studies have indicated that TEs play crucial genomic roles involved in chromosome structuring, structural variation, the alteration of gene expression [5,7], evolution, the variation of genomic size, and environmental adaptation [9][10][11][12][13]. Nevertheless, these elements can also be associated with human diseases, such as different types of cancer [14][15][16].…”
Section: Introduction (citation type: mentioning)
confidence: 99%