2019
DOI: 10.1093/bib/bbz048

Evaluation of different computational methods on 5-methylcytosine sites identification

Abstract: 5-Methylcytosine (m5C) plays an extremely important role in the basic biochemical process. With the great increase of identified m5C sites in a wide variety of organisms, their epigenetic roles become largely unknown. Hence, accurate identification of m5C site is a key step in understanding its biological functions. Over the past several years, more attentions have been paid on the identification of m5C sites in multiple species. In this work, we firstly summarized the current progresses in computational predi…

Cited by 115 publications (107 citation statements)
References 77 publications
“…Moreover, we also used the area under the ROC curve (AUC) to quantitatively measure the predictive performance of the model (Yang et al, 2018; Lv et al, 2019b; Niu et al, 2019). A higher AUC represents a better predictor (Hanley and McNeil, 1982; Liu et al, 2018; Feng et al, 2019; Lai et al, 2019).…”
Section: Performance Indicators
Citation type: mentioning
confidence: 99%
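
The AUC mentioned in this statement can be illustrated with a minimal sketch. It uses the rank-based (Mann-Whitney) formulation: the AUC equals the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the 0/1 label encoding are assumptions made for this illustration; they are not taken from the cited papers.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0 (negative) or 1 (positive); this encoding is an assumption.
    scores: iterable of predicted scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(pos) * len(neg))
```

For example, `roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` compares four positive/negative pairs, three of which are ordered correctly, giving 0.75. The quadratic pairwise loop is chosen for readability; a sort-based version would be preferable for large datasets.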
“…To identify potential lncRNA biomarkers with ceRNA activity, a random forest approach and leave-one-out cross-validation (LOOCV) were used to select optimal lncRNA biomarkers using the R package "randomForest" and the out-of-bag (OOB) error, which measures the performance of the model on the training set (Lv et al, 2019; Tan et al, 2019). The OOB error will produce an unbiased estimate for the classification error, while the bagging method will decrease the chance of overfitting (Toth et al, 2019).…”
Section: Identification of lncRNA Biomarkers with ceRNA Activity Usin…
Citation type: mentioning
confidence: 99%
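
The out-of-bag idea in this statement can be sketched in a few lines. The citing paper used the R package "randomForest"; the Python code below is not that implementation. It is a simplified bagging ensemble in which each base learner is a nearest-centroid classifier (a stand-in for a decision tree, chosen only to keep the example short), and every sample is scored only by the learners whose bootstrap sample did not contain it. All function names and defaults here are hypothetical.

```python
import random
from collections import Counter


def fit_centroids(X, y):
    """Base learner (stand-in for a tree): one mean vector per class."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, value in enumerate(row):
            acc[k] += value
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict(centroids, row):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(centroids[c], row)),
    )


def oob_error(X, y, n_estimators=50, seed=0):
    """Out-of-bag misclassification rate of a bagged ensemble."""
    rng = random.Random(seed)
    n = len(X)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_estimators):
        # Bootstrap: draw n indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        model = fit_centroids([X[i] for i in idx], [y[i] for i in idx])
        # Only samples left out of this bootstrap receive a vote from this learner.
        for i in range(n):
            if i not in in_bag:
                votes[i][predict(model, X[i])] += 1
    # Majority vote per sample, skipping any sample that was never out of bag.
    scored = [(v.most_common(1)[0][0], y[i]) for i, v in enumerate(votes) if v]
    if not scored:
        raise ValueError("no sample was ever out of bag; increase n_estimators")
    return sum(pred != truth for pred, truth in scored) / len(scored)
```

Because each prediction comes only from learners that never saw the sample, the resulting error behaves like a held-out estimate without requiring a separate validation split.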
“…In this study, m5C data of three species have been collected from recently published literature. For A. thaliana, the same datasets constructed by Lv et al [26] were used for fair comparison. The positive RNA segments, which contain an m5C site in the center, were collected from the NCBI Gene Expression Omnibus (GEO) database with the accession number GSE94065 [29].…”
Section: Benchmark Datasets
Citation type: mentioning
confidence: 99%
“…The pseudo K-tuple nucleotide composition (PseKNC) has been used to represent an RNA sequence with a discrete model or vector which can keep considerable sequence order information, especially the global or long-range sequence order information [20,26,38,39]. In this study, we used PseDNC (K=2 for PseKNC) to encode the RNA segments.…”
Section: Pseudo Dinucleotide Composition (PseDNC)
Citation type: mentioning
confidence: 99%
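
The PseDNC encoding described in this statement can be sketched as follows. The vector has 16 normalized dinucleotide frequencies followed by λ sequence-order correlation terms, all divided by a common denominator so that the components sum to 1. This is a sketch under stated assumptions: the parameters `lam` and `w`, their defaults, and the function names are hypothetical, and `toy_property` (the G/C fraction of a dinucleotide) is a placeholder rather than one of the published physicochemical scales, which are normally standardized before use.

```python
from itertools import product

ALPHABET = "ACGU"
DINUCS = ["".join(p) for p in product(ALPHABET, repeat=2)]  # 16 dinucleotides


def toy_property(dinuc):
    """Placeholder property (assumption): fraction of G/C bases in the dinucleotide."""
    return sum(base in "GC" for base in dinuc) / 2


def psednc(seq, lam=2, w=0.5, properties=(toy_property,)):
    """Return a (16 + lam)-dimensional pseudo dinucleotide composition vector."""
    seq = seq.upper().replace("T", "U")
    if any(base not in ALPHABET for base in seq):
        raise ValueError("sequence may only contain A, C, G, U (or T)")
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    n = len(pairs)
    if lam >= n:
        raise ValueError("lam must be smaller than the number of dinucleotides")

    # Local information: normalized occurrence frequency of each dinucleotide.
    freqs = [pairs.count(d) / n for d in DINUCS]

    # Sequence-order information: tier-j correlation between dinucleotides
    # i and i + j, averaged over positions and over the supplied properties.
    thetas = []
    for j in range(1, lam + 1):
        total = 0.0
        for i in range(n - j):
            total += sum(
                (p(pairs[i]) - p(pairs[i + j])) ** 2 for p in properties
            ) / len(properties)
        thetas.append(total / (n - j))

    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]
```

Calling `psednc("ACGUACGUAC", lam=2, w=0.5)` yields an 18-dimensional vector. In practice `properties` would be replaced by the standardized physicochemical scales chosen for the study, and `lam` and `w` would be tuned on the training data.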