2019
DOI: 10.1186/s12859-019-2781-x

Computational enhancer prediction: evaluation and improvements

Abstract: Background: Identifying transcriptional enhancers and other cis-regulatory modules (CRMs) is an important goal of post-sequencing genome annotation. Computational approaches provide a useful complement to empirical methods for CRM discovery, but it is critical that we develop effective means to evaluate their performance in terms of estimating their sensitivity and specificity. Results: We introduce here pCRMeval, a …

Cited by 17 publications (31 citation statements)
References 19 publications
“…(Not all of the 11 lines could be evaluated for each criterion, either due to possible vector‐dependent expression or because expression of orthologs is not known; the range for predicted expression matching represents the lower/upper bounds depending on whether observed expression is enhancer‐driven or vector‐driven.) Although these numbers are moderately weaker than what we have observed in previous applications of SCRMshaw (Kantorovitz et al., 2009; Kazemian et al., 2011, 2014), they are well above random expectation (Asma and Halfon, 2019). SCRMshaw performance is highly dependent on the training data used, and a relatively narrow selection of training sets were used here.…”
Section: Results (mentioning)
confidence: 78%
“…STARR-seq, which elegantly converts CRMs into their own reporters by cloning them downstream of a core promoter and sequencing the output, is one increasingly popular method [14, 16, 17, 18, 19, 20]. Although reporter-based methods have long been viewed as a gold standard, due to the fact that they provide a direct functional readout of regulatory activity, there is growing recognition that these methods can lead to both false-positive and false-negative results [7, 8, 21]. However, the overall accuracy of reporter gene assays is believed to be high, and these remain the most definitive assays for regulatory function.…”
Section: Empirical Approaches To CRM Discovery (mentioning)
confidence: 99%
“…REDfly has also played a dramatic role in developing methods for computational CRM discovery. Its extensive collection of experimentally verified CRMs provides a ready source of validation data for assessing CRM predictions and for comparing among methods [21, 112, 113, 114, 115, 116]. Perhaps more importantly, REDfly’s advanced search and filtering features make it an unmatched source for compiling training data for machine-learning approaches [10, 11, 61, 117].…”
Section: REDfly and SCRMshaw: Powerful Tools For Insect Regulatory Genomics (mentioning)
confidence: 99%
“…Although methods demonstrated to work using vertebrate genomes are expected to function equally well in insects, comparing efficacies is difficult given the different training and validation regimens applied. An evaluation platform for assessing methods using a uniform set of Drosophila training and validation data has recently been described (Asma and Halfon, 2019), and a critical comparison of various approaches would be a valuable addition to the field.…”
Section: Enhancer Prediction Independent Of Experimentally Derived Fe (mentioning)
confidence: 99%
“…How effective is enhancer discovery when genome assemblies are highly incomplete? Testing SCRMshaw with simulated dis-assembly of the Drosophila genome has revealed that contig N50s of at least 23,000 bp (which encompasses the upper 50% of current insect assemblies) are sufficient for effective SCRMshaw prediction, with minor loss of sensitivity and negligible increase in false-positive rates (Asma and Halfon, 2019). Therefore, highly complete genome assembly does not appear to be a prerequisite for successful enhancer prediction by SCRMshaw.…”
Section: Integrating Experimental and Computational Approaches (mentioning)
confidence: 99%