2018
DOI: 10.1007/978-1-4939-8775-7_10

CRM Discovery Beyond Model Insects

Abstract: Although the number of sequenced insect genomes numbers in the hundreds, little is known about gene regulatory sequences in any species other than the well-studied Drosophila melanogaster. We provide here a detailed protocol for using SCRMshaw, a computational method for predicting cis-regulatory modules (CRMs, also “enhancers”) in sequenced insect genomes. SCRMshaw is effective for CRM discovery throughout the range of holometabolous insects and potentially in even more diverged species, with true-positive pr…

Cited by 11 publications
(17 citation statements)
References 29 publications
“…We previously developed an effective method for computational CRM discovery, SCRMshaw (for Supervised cis-Regulatory Module discovery) [79]. SCRMshaw uses a training set composed of known CRMs defined by a common functional characterization (e.g., “nervous system,” “midgut”) to build a statistical model that captures their short DNA subsequence (k-mer) count distribution and compares it to that of a set of non-CRM “background” sequences in a machine-learning framework.…”
Section: Results
Citation type: mentioning
confidence: 99%
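
The idea in the statement above, comparing k-mer counts in known CRMs against counts in background sequences, can be illustrated with a minimal sketch. This is not SCRMshaw's actual scoring method: the log-odds score, the pseudocount, and the function names `kmer_counts` and `log_odds_score` are all assumptions made for illustration only.

```python
from collections import Counter
from math import log


def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a list of DNA sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def log_odds_score(window, train_counts, bg_counts, k, pseudocount=1.0):
    """Sum log(P_train(kmer) / P_background(kmer)) over a window.

    Hypothetical scoring scheme: a positive score means the window's
    k-mers look more like the training CRMs than like the background.
    The pseudocount keeps unseen k-mers from producing a zero probability.
    """
    n_kmers = 4 ** k  # number of possible k-mers over A, C, G, T
    train_total = sum(train_counts.values()) + pseudocount * n_kmers
    bg_total = sum(bg_counts.values()) + pseudocount * n_kmers
    window = window.upper()
    score = 0.0
    for i in range(len(window) - k + 1):
        kmer = window[i:i + k]
        p_train = (train_counts[kmer] + pseudocount) / train_total
        p_bg = (bg_counts[kmer] + pseudocount) / bg_total
        score += log(p_train / p_bg)
    return score


# Toy data, invented for the example (not real CRM sequences).
train = kmer_counts(["ACGTACGTACGT"], k=2)
background = kmer_counts(["TTTTTTTTTTTT"], k=2)
crm_like = log_odds_score("ACGT", train, background, k=2)
bg_like = log_odds_score("TTTT", train, background, k=2)
```

With these toy inputs, `crm_like` is positive and `bg_like` is negative, which is the behavior a classifier of this general kind relies on.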
“…4a-c). Note that this is equivalent to simply changing the default shift size parameter from 250 bp to 10 bp and following the basic SCRMshaw protocol (e.g., as described in [9]), which allows for execution on single processor. However, the latter approach would significantly boost the execution time, particularly for a large genome. Fig. …”
Section: Results
Citation type: mentioning
confidence: 99%
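
To see why reducing the shift size from 250 bp to 10 bp raises execution time, one can count how many windows a sliding scan would have to score. The sketch below is only an illustration: the 500 bp window length and the 1 Mb sequence length are assumptions not stated in the statement above, and `window_starts` and `n_windows` are hypothetical helper names, not part of SCRMshaw.

```python
def window_starts(seq_len, window=500, shift=250):
    """Return the start coordinates of all full-length sliding windows."""
    if seq_len < window:
        return range(0)  # sequence too short for even one window
    return range(0, seq_len - window + 1, shift)


def n_windows(seq_len, window=500, shift=250):
    """Number of windows that would need to be scored."""
    return len(window_starts(seq_len, window, shift))


# Assumed example: a 1 Mb sequence scanned with 500 bp windows.
coarse = n_windows(1_000_000, window=500, shift=250)  # 3999 windows
fine = n_windows(1_000_000, window=500, shift=10)     # 99951 windows
```

Under these assumptions, the 10 bp shift produces roughly 25 times as many windows to score as the 250 bp default, which is consistent with the statement's warning about large genomes.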
“…Therefore, approaches that rely solely on genome sequence are likely to be the most appealing to researchers that use non-traditional insect models. A number of these are available, most of which fall into the 'specific' enhancer discovery class (Kazemian and Halfon, 2019; Le et al, 2019; Liu et al, 2018). In general, these approaches deconstruct the training sequences into a set of small (e.g.…”
Section: Enhancer Prediction Independent Of Experimentally Derived Fe…
Citation type: mentioning
confidence: 99%
“…4-8 nucleotides) subsequences, or 'k-mers', which are then evaluated against a similarly deconstructed set of nonenhancer background sequences. With the notable exception of SCRMshaw (Kantorovitz et al, 2009; Kazemian and Halfon, 2019; Kazemian et al, 2011, 2014), most such approaches have not been tested with respect to insect genomes, including that of Drosophila (a somewhat ironic situation given the unmatched availability of empirically confirmed Drosophila enhancers for use as training data; Rivera et al, 2019). Although methods demonstrated to work using vertebrate genomes are expected to function equally well in insects, comparing efficacies is difficult given the different training and validation regimens applied.…”
Section: Enhancer Prediction Independent Of Experimentally Derived Fe…
Citation type: mentioning
confidence: 99%