2022
DOI: 10.3389/fmicb.2022.843425

Systematic Analysis and Accurate Identification of DNA N4-Methylcytosine Sites by Deep Learning

Abstract: DNA N4-methylcytosine (4mC) is a pivotal epigenetic modification that plays an essential role in DNA replication, repair, expression and differentiation. To gain insight into the biological functions of 4mC, it is critical to identify its modification sites in the genome. Deep learning has become increasingly popular in recent years and is frequently employed for 4mC site identification. However, a systematic analysis of how to build predictive models using deep learning techniques is still lack…

Cited by 6 publications (6 citation statements)
References 69 publications (81 reference statements)

“…In TNC, all samples of 41 nt produce 39 components with the equation of L − k + 1. Here, L stands for the sequence length, and k stands for the K-mer value as an integer [ 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 ]. ATG, TGC, GCG, and CGA are the four 3-mers that can be tokenized from the DNA sequence “ATGCGA,” for instance.…”
Section: Methods (mentioning)
confidence: 99%
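
The L − k + 1 count in the statement above can be checked with a minimal sketch of sliding-window k-mer tokenization. The function name `kmer_tokens` is hypothetical and is not taken from the cited paper; the sketch only reproduces the arithmetic and the "ATGCGA" example that the statement gives.

```python
def kmer_tokens(sequence: str, k: int) -> list[str]:
    """Split a sequence into overlapping k-mers with a stride of 1.

    Hypothetical helper for illustration; a sequence of length L
    yields L - k + 1 tokens.
    """
    if k < 1 or k > len(sequence):
        raise ValueError("k must be between 1 and the sequence length")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# The example from the quoted statement: four 3-mers from "ATGCGA".
print(kmer_tokens("ATGCGA", 3))

# A 41 nt sample with k = 3 gives 41 - 3 + 1 = 39 tokens.
sample_41nt = "ACGT" * 10 + "A"  # arbitrary placeholder sequence, not real data
print(len(kmer_tokens(sample_41nt, 3)))
```

With k = 3 and L = 41 the window can start at positions 0 through 38, which is where the 39 components come from.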
“…Yu. Lezheng et al (2022) [ 11 ] proposed a convolutional recurrent neural network model to identify DNA N4-methylcytosine. For representing DNA sequences, they considered one-hot and dictionary encoding methods.…”
Section: Literature Review (mentioning)
confidence: 99%
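
As a rough illustration of the two representations named in the statement above, the sketch below encodes a DNA string both ways. The nucleotide order, the integer indices, and the handling of unknown bases are assumptions made here for illustration; the statement does not specify the mapping used in the cited work.

```python
NUCLEOTIDES = "ACGT"  # assumed ordering, not specified in the source


def one_hot_encode(sequence: str) -> list[list[int]]:
    """Map each base to a 4-element indicator vector.

    Unknown bases (e.g. N) become an all-zero row (an assumption).
    """
    index = {nt: i for i, nt in enumerate(NUCLEOTIDES)}
    rows = []
    for nt in sequence.upper():
        row = [0] * len(NUCLEOTIDES)
        if nt in index:
            row[index[nt]] = 1
        rows.append(row)
    return rows


def dictionary_encode(sequence: str) -> list[int]:
    """Map each base to an integer index, e.g. as input to an embedding layer.

    Index 0 is reserved here for unknown bases (an assumption).
    """
    index = {nt: i + 1 for i, nt in enumerate(NUCLEOTIDES)}
    return [index.get(nt, 0) for nt in sequence.upper()]


print(one_hot_encode("ACGT"))
print(dictionary_encode("ATGN"))
```

One-hot rows can be fed directly to convolutional layers, while integer indices are typically passed through a learned embedding first.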
“…Since 2017, over 30 publications aimed at characterizing 4mC distribution in eukaryotic genomes were devoted to continuous improvement of 4mC prediction accuracy using machine learning approaches ( [67,68] and references therein). However, the value of these methods is uncertain, as they were developed and trained on reference datasets [69,70] which have not been rigorously proven to harbor the 4mC mark.…”
Section: Challenges In Assignment and Validation Of Eukaryotic Methyl... (mentioning)
confidence: 99%
“…Therefore, throughout the use of autoBioSeqpy, the users only need to provide the data sets and the model architecture codes. Recently, we have applied autoBioSeqpy to practice with good results, e.g., druggable protein prediction [37], DL algorithm development for DNA N4-methylcytosine [38], anti-cancer peptide design [39], and the distinction between bacterial type III and IV secreted effectors [40]. Hence, the present study will utilize autoBioSeqpy for the rapid and efficient creation, training, and utilization of DL models for the detection of m7G sites.…”
Section: Introduction (mentioning)
confidence: 99%