2020
DOI: 10.1186/s12864-020-07033-8

Accurate prediction of DNA N4-methylcytosine sites via boost-learning various types of sequence features

Abstract: Background: DNA N4-methylcytosine (4mC) is a critical epigenetic modification and has various roles in the restriction-modification system. Due to the high cost of experimental laboratory detection, computational methods using sequence characteristics and machine learning algorithms have been explored to identify 4mC sites from DNA sequences. However, state-of-the-art methods have limited performance because of the lack of effective sequence features and the ad hoc choice of learning algorithms to cope with thi…

Cited by 24 publications (12 citation statements)
References 44 publications
“…The benchmark and independent datasets in this paper are collected from different sources [3, 5, 6, 8, 9, 16, 17, 18, 19] to measure the efficiency of the proposed model for a fair comparison with current predictors. These datasets contain eight different species, namely Caenorhabditis elegans (C. elegans), Drosophila melanogaster (D. melanogaster), Arabidopsis thaliana (A. thaliana), Escherichia coli (E. coli), Geoalkalibacter subterraneus (G. subterraneus), Geobacter pickeringii (G. pickeringii), Fragaria vesca (F. vesca), and Rosa chinensis (R. chinensis).…”
Section: Methods (mentioning)
confidence: 99%
“…Z. Zhao et al. (2020) [17] proposed a model of DNA N4-methylcytosine sites via boost-learning various types of sequence features. The DNA sequence was first encoded with one-hot binary (OHB), sequential nucleotide frequency (SNF), K-nucleotide frequency (KNF), K-spectrum nucleotide pair frequency (KSNPF), and PseDNC.…”
Section: Literature Review (mentioning)
confidence: 99%
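
Two of the encodings named in the statement above, one-hot binary (OHB) and K-nucleotide frequency (KNF), can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the A/C/G/T bit order, and the normalization by the number of k-mer windows are assumptions made for illustration, and the paper's exact definitions may differ.

```python
from itertools import product

NUCLEOTIDES = "ACGT"  # assumed bit/column order; the paper may use another


def one_hot_binary(seq):
    """Encode each base as 4 bits (A, C, G, T order) and flatten.

    Any character other than A/C/G/T (e.g. N) becomes four zeros.
    """
    vec = []
    for base in seq.upper():
        vec.extend(1 if base == n else 0 for n in NUCLEOTIDES)
    return vec


def k_nucleotide_frequency(seq, k=2):
    """Return the relative frequency of every possible k-mer in seq.

    Frequencies are counts divided by the number of sliding windows
    (len(seq) - k + 1); this normalization is an assumption.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product(NUCLEOTIDES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(seq) - k + 1
    for i in range(max(total, 0)):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing non-ACGT characters
            counts[kmer] += 1
    return [counts[m] / total if total > 0 else 0.0 for m in kmers]


if __name__ == "__main__":
    example = "ACGTAC"  # hypothetical toy sequence, not from any dataset
    print(one_hot_binary(example))
    print(k_nucleotide_frequency(example, k=2))
```

A feature vector for a learner could then be formed by concatenating the outputs of several such encoders for each fixed-length sequence window.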
“…Despite significant studies on 5mC and 3mC that have been carried out, work on 4mC is still in the rudimentary phase [24]. 4mC is primarily present in prokaryotes; however, current high-sensitivity methods, such as single-molecule real-time sequencing, can also detect it in eukaryotes [25].…”
Section: Origin and Function of Modified Methylcytosines in DNA (mentioning)
confidence: 99%
“…It has been extensively detected in viruses, bacteria, protists, fungi, algae, etc. As the other important epigenetic modification, 4mC protects host DNA from the degradation of restriction enzymes and corrects prokaryotic DNA replication errors, and controls the DNA replication and cell cycle of prokaryotes [7]. Therefore, DNA methylation identification is fundamentally essential for revealing the functional mechanisms.…”
Section: Introduction (mentioning)
confidence: 99%