2021
DOI: 10.1101/2021.02.17.430503
Preprint

A comprehensive fitness landscape model reveals the evolutionary history and future evolvability of eukaryotic cis-regulatory DNA sequences

Abstract: Mutations in non-coding cis-regulatory DNA sequences can alter gene expression, organismal phenotype, and fitness. Fitness landscapes, which map DNA sequence to organismal fitness, are a long-standing goal in biology, but have remained elusive because it is challenging to generalize accurately to the vast space of possible sequences using models built on measurements from a limited number of endogenous regulatory sequences. Here, we construct a sequence-to-expression model for such a landscape and use it to de…

citations
Cited by 7 publications
(9 citation statements)
references
References 108 publications
(128 reference statements)
“…a pre-specified loss function) (LeCun, Bengio and Hinton, 2015). Current approaches to infer sequence function using neural networks employ regression tasks, where models learn to predict expert annotations or large-scale measurements (Alipanahi et al, 2015; Avsec et al, 2021; Dhaval Vaishnav et al, 2021). For example, training genomic sequence models on labels representing the presence or absence of transcription factor binding leads to the model learning features that directly correspond with the consensus motifs for these transcription factors (Koo and Eddy, 2019).…”
Section: Introduction (mentioning)
confidence: 99%
“…The segmentation step produced the cell-by-gene matrices that were used to assign the spatially-resolved cell types to scRNA-seq reference cell types using cell type matching algorithms. Teams of the SpaceTx Consortium explored six computational algorithms (map.cells* [1], mfishtools [24], fitness landscape model (FLM) [25], FR-Match [26, 27], Tangram [28], and pciSeq [29]), which produced individual cell type assignments with various probabilistic assignment scores. To arrive at consensus cell type assignments, two meta-analysis strategies were developed to combine the individual assignments more qualitatively (Negative Weighting Combining Strategy, hereinafter NWCS), or more quantitatively (Geometric Mean Combining Strategy, hereinafter GMCS) (Methods).…”
Section: Results (mentioning)
confidence: 99%
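The excerpt above names a Geometric Mean Combining Strategy but does not say how it is computed. As a rough illustration only, and not the SpaceTx implementation, the Python sketch below shows one way per-algorithm probabilistic scores could be combined with a geometric mean. The function name `geometric_mean_combine`, the input layout, the `eps` floor, and the example scores are all assumptions made for this sketch.

```python
import math

def geometric_mean_combine(scores_by_method, eps=1e-9):
    """Combine per-method cell type probabilities with a geometric mean.

    scores_by_method: dict mapping a method name to a dict of
    {cell_type: probability}. This layout is hypothetical.
    A cell type missing from a method is treated as probability 0,
    floored at eps so the logarithm is defined.
    """
    methods = list(scores_by_method.values())
    cell_types = sorted(set().union(*(m.keys() for m in methods)))
    combined = {}
    for cell_type in cell_types:
        # Geometric mean = exp(mean of logs)
        log_sum = sum(math.log(max(m.get(cell_type, 0.0), eps)) for m in methods)
        combined[cell_type] = math.exp(log_sum / len(methods))
    best = max(cell_types, key=lambda ct: combined[ct])
    return best, combined

# Made-up scores for a single segmented cell from two hypothetical methods
example = {
    "method_a": {"TypeX": 0.9, "TypeY": 0.1},
    "method_b": {"TypeX": 0.4, "TypeY": 0.6},
}
best_type, combined_scores = geometric_mean_combine(example)
```

A geometric mean penalizes a cell type that any single method scores near zero, which is one reason it is a common choice for combining probabilities; whether the cited work uses it this way is not stated in the excerpt.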
“…Six cell type matching algorithms (map.cells* [1], mfishtools [24], fitness landscape model (FLM) [25], FR-Match [26, 27], Tangram [28], and pciSeq [29]) were applied to assign reference scRNA-seq cell types to each segmented cell with an associated confidence score (a.k.a. probabilistic assignment) based on the cell-by-gene count matrix (Method) produced for the MERFISH data.…”
Section: Results (mentioning)
confidence: 99%
“…Vaishnav et al (2021) trained a deep transformer network to predict the gene expression level associated with 20 million randomly sampled 80-bp long DNA sequences introduced in a Saccharomyces cerevisiae promoter region. They assessed the effect of all single mutations in promoter regions and discovered four evolvability archetypes: robust promoters on which mutations have little effect, plastic promoters on which every mutation has a small effect, and minimal or maximal promoters on which only some mutations can dramatically decrease or increase the associated expression level.…”
Section: Survey Methodology (mentioning)
confidence: 99%
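To make "all single mutations" concrete, the Python sketch below enumerates every single-nucleotide substitution of a sequence and records the change in predicted expression. It is a minimal illustration, not the authors' pipeline: `toy_predict` is a hypothetical stand-in (GC fraction) for the trained sequence-to-expression model, and the function names are assumptions.

```python
def single_mutants(seq, alphabet="ACGT"):
    """Yield (position, new_base, mutant_sequence) for each substitution."""
    for i, ref in enumerate(seq):
        for alt in alphabet:
            if alt != ref:
                yield i, alt, seq[:i] + alt + seq[i + 1:]

def mutation_effects(seq, predict):
    """Return (position, new_base, predicted change) for each single mutant."""
    baseline = predict(seq)
    return [(i, alt, predict(mutant) - baseline)
            for i, alt, mutant in single_mutants(seq)]

def toy_predict(seq):
    # Hypothetical placeholder for a trained model: GC fraction of the sequence
    return (seq.count("G") + seq.count("C")) / len(seq)

# An 80-bp sequence has 80 * 3 = 240 single-nucleotide substitutions
promoter = "ACGT" * 20
effects = mutation_effects(promoter, toy_predict)
```

The distribution of these per-mutation effects is the kind of summary from which categories such as robust or plastic could be derived, although the excerpt does not give the criteria used.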