2019
DOI: 10.1101/864363
Preprint

Deep exploration networks for rapid engineering of functional DNA sequences

Abstract: Engineering gene sequences with defined functional properties is a major goal of synthetic biology. Deep neural network models, together with gradient ascent-style optimization, show promise for sequence generation. The generated sequences can however get stuck in local minima, have low diversity and their fitness depends heavily on initialization. Here, we develop deep exploration networks (DENs), a type of generative model tailor-made for searching a sequence space to minimize the cost of a neural network fi…

Cited by 10 publications (5 citation statements)
References 34 publications
“…What is more, we operate on short sequences constituting minuscule fractions of the whole genome with all its complexity. Although successful deep learning approaches for both protein (Brookes et al, 2019; Alley et al, 2019; Biswas et al, 2020) and regulatory sequence design (Gupta & Zou, 2019; Gupta & Kundaje, 2019; Linder et al, 2019; Schreiber et al, 2020) do exist, moving from read-based classification to genome-wide phenotype optimization would require considerable research effort, if possible at all. This would entail capturing a wealth of biological contexts well beyond the capabilities of even the best classification models currently available.…”
Section: Discussion (mentioning)
confidence: 99%
“…One approach to overcome the issue of observational data is to perform massively parallel reporter assays (MPRA) for different cell types. MPRA for human splicing have been performed in HEK293 cells [25,27,56-58], K562 cells [59,60], HepG2 cells [59], and HELA and MCF7 cells [61]. These data provide powerful resources to train complex models on splicing, but tissue and cell-type diversity is still lacking.…”
Section: Discussion (mentioning)
confidence: 99%
“…The original GAN and the analyzer were pretrained independently before being linked through a feedback loop: At each epoch, the generated sequences scored by the analyzer as most desirable were fed back to the discriminator as real examples, gradually replacing the training set of real genes and guiding the sequence generation toward the target. Similar generative models showed promising results for creating novel promoter regions, protein-binding motifs, protein-coding sequences, sequences with antimicrobial properties, and even whole regulatory structures (e.g., promoter, 5′ UTR, 3′ UTR, terminator) with desired expression levels (13, 30-34).…”
Section: Applications In Functional Genomics (mentioning)
confidence: 98%