2020
DOI: 10.1101/2020.01.20.912451
Preprint

EnhancerP-2L: A Gene regulatory site identification tool for DNA enhancer region using CREs motifs

Abstract: Enhancers are DNA fragments that do not encode RNA molecules or proteins, but they play a critical role in the production of RNAs and proteins by controlling gene expression. Prediction of enhancers and their strength plays a significant role in regulating gene expression. Predicting enhancer regions in DNA sequences is considered a difficult task because they are not close to the target gene, have fewer common motifs, and are mostly tissue/cell specific. In the recent past, several bioinformatics tool…

Cited by 5 publications (11 citation statements). References 81 publications.
“…In this case, the overall indices could be used, such as ACC and MCC. In the 5-fold cross-validation, Enhancer-LSTMAtt was superior to Enhancer-BERT [ 55 ], DeployEnhancer [ 48 ] and iEnhancer-RF [ 57 ] in terms of ACC and MCC in the first stage and exceeded iEnhancer-PsedeKNC [ 41 ], DeployEnhancer [ 48 ], EnhancerP-2L [ 51 ], and iEnhancer-RF [ 57 ] in terms of MCC in the second stage. In the 10-fold cross-validation, Enhancer-LSTMAtt reached competitive performance with ES-ARCNN [ 49 ], iEnhancer-XG [ 53 ], and iEnhancer-MFGBDT [ 63 ] in the second stage.…”
Section: Results (mentioning)
confidence: 99%
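The overall indices named in the statement above, ACC (accuracy) and MCC (Matthews correlation coefficient), can be computed from the four confusion-matrix counts of a binary classifier. The following is a minimal sketch using the standard definitions; the function names and the toy labels are illustrative assumptions and are not taken from any of the cited papers.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = enhancer, 0 = non-enhancer)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """ACC = (TP + TN) / total number of samples."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal count is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy labels, invented purely for illustration.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
acc_value = accuracy(*counts)
mcc_value = mcc(*counts)
```

MCC uses all four counts, so it is often preferred over ACC when the two classes are imbalanced.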
“…For fair comparison with the state-of-the-art methods, we used the same benchmark dataset as those in iEnhancer-2L [ 40 ], iEnhancer-PsedeKNC [ 41 ], EnhancerPred [ 42 ], EnhancerPred2.0 [ 43 ], Enhancer-Tri-N [ 44 ], iEnhancer-2L-Hybrid [ 45 ], iEnhancer-EL [ 46 ], iEnhancer-5Step [ 47 ], DeployEnhancer [ 48 ], ES-ARCNN [ 49 ], iEnhancer-ECNN [ 50 ], EnhancerP-2L [ 51 ], iEnhancer-CNN [ 52 ], iEnhancer-XG [ 53 ], Enhancer-DRRNN [ 54 ], Enhancer-BERT [ 55 ], iEnhancer-KL [ 56 ], iEnhancer-RF [ 57 ], spEnhancer [ 58 ], iEnhancer-EBLSTM [ 59 ], iEnhancer-GAN [ 60 ], piEnPred [ 61 ], iEnhancer-RD [ 62 ], and iEnhancer-MFGBDT [ 63 ]. The dataset was initially collected by Liu et al. [ 40 ] from chromatin state information of nine cell lines (H1ES, K562, GM12878, HepG2, HUVEC, HSMM, NHLF, NHEK and HME) which was annotated by ChromHMM [ 69 , 70 ].…”
Section: Data (mentioning)
confidence: 99%
“…Specifically, when designing polyadenylation signals based on APARENT, we validated the designs using DeeReCT-APA [31], an LSTM trained on 3'-sequencing data of mouse cells, and DeepPASTA [30], a CNN trained on human 3'-sequencing data. When designing enhancer sequences, we validated the designs using iEnhancer-ECNN [62], an ensemble of CNNs trained on genomic enhancer sequences, and EnhancerP-2L [63], a Random Forest-classifier based on statistical features extracted from enhancer regions in the genome. Finally, to validate Optimus 5' designs, we had access to a newer version of the model that had been trained on additional MPRA data, making it more robust particularly on outlier sequences such as long homopolymer stretches [25].…”
Section: Regularized Sequence Design (mentioning)
confidence: 99%
“…EnhancerP-2L [63] Detects genomic enhancer regions and predicts whether it is a weak or strong enhancer. For a sample of generated sequences per design method, we calculated the mean detect/not detect prediction rate, the mean weak/strong prediction rate and the mean p-score.…”
Section: Validation Models (mentioning)
confidence: 99%
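The three summary statistics in the statement above (mean detect/not detect rate, mean weak/strong rate, mean p-score) can be sketched as simple averages over per-sequence predictions. This is only an illustration under stated assumptions: the `Prediction` record, its field names, and the choice to compute the strong rate over detected sequences only are hypothetical, and the code does not call the EnhancerP-2L tool or reproduce the citing paper's procedure.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Prediction:
    """Hypothetical per-sequence output of a two-layer enhancer predictor."""
    detected: bool   # layer 1: enhancer vs. non-enhancer
    strong: bool     # layer 2: strong vs. weak (meaningful only if detected)
    p_score: float   # score reported for the sequence


def summarize(predictions):
    """Return (detect rate, strong rate among detected, mean p-score)."""
    detect_rate = mean(1.0 if p.detected else 0.0 for p in predictions)
    detected = [p for p in predictions if p.detected]
    # Assumption: the weak/strong rate is taken over detected sequences only.
    strong_rate = (
        mean(1.0 if p.strong else 0.0 for p in detected) if detected else 0.0
    )
    mean_p = mean(p.p_score for p in predictions)
    return detect_rate, strong_rate, mean_p


# Invented sample for one design method.
sample = [
    Prediction(True, True, 0.9),
    Prediction(True, False, 0.6),
    Prediction(False, False, 0.3),
    Prediction(True, True, 0.8),
]
detect_rate, strong_rate, mean_p = summarize(sample)
```

Running `summarize` once per design method gives one row of rates per method, which can then be compared across methods.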