Preprint, 2024
DOI: 10.1101/2024.03.13.583868

Dissection of core promoter syntax through single nucleotide resolution modeling of transcription initiation

Adam Y He,
Charles G Danko

Abstract: Our understanding of how the DNA sequences of cis-regulatory elements encode transcription initiation patterns remains limited. Here we introduce CLIPNET, a deep learning model trained on population-scale PRO-cap data that accurately predicts the position and quantity of transcription initiation with single nucleotide resolution from DNA sequence. Interpretation of CLIPNET revealed a complex regulatory syntax consisting of DNA-protein interactions in five major positions between -200 and +50 bp relative to the…
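As a rough illustration of the abstract's framing (predicting both where initiation occurs and how much, at single-nucleotide resolution, from sequence alone), below is a minimal PyTorch sketch of a generic two-headed convolutional model: a per-base profile head plus a scalar quantity head. The layer sizes, names, and overall design are assumptions for illustration and are not CLIPNET's actual architecture.

import torch
import torch.nn as nn

class InitiationModel(nn.Module):
    """Toy two-headed model: per-base initiation profile plus a scalar quantity."""
    def __init__(self, channels=64):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv1d(4, channels, kernel_size=21, padding=10),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=7, padding=3),
            nn.ReLU(),
        )
        self.profile_head = nn.Conv1d(channels, 1, kernel_size=1)   # one value per base
        self.quantity_head = nn.Linear(channels, 1)                 # one value per sequence

    def forward(self, x):                              # x: (batch, 4, seq_len) one-hot DNA
        h = self.trunk(x)                              # (batch, channels, seq_len)
        profile = self.profile_head(h).squeeze(1)      # (batch, seq_len)
        quantity = self.quantity_head(h.mean(dim=2))   # (batch, 1)
        return profile, quantity

model = InitiationModel()
profile, quantity = model(torch.zeros(2, 4, 500))      # two one-hot 500-bp sequences

The two heads mirror the abstract's distinction between the position (profile) and the quantity of initiation.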

Cited by 1 publication (2 citation statements)
References: 99 publications (177 reference statements)

Citation statements:
“…For each gene, we fetched an individual's two 49-kilobase (kb) consensus sequences centered on the gene's TSS (GENCODE [30] v26). We one-hot encoded each sequence, and used the average of two one-hot encoded matrices as our input [24]. While other approaches are possible [7], we found this representation to be reasonable because:…”
Section: Inputs (citation type: mentioning, confidence: 99%)
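A minimal NumPy sketch of the input representation this statement describes, assuming an A/C/G/T channel order and hypothetical function names: each of an individual's two consensus sequences is one-hot encoded and the two matrices are averaged, so heterozygous positions become 0.5/0.5.

import numpy as np

BASE_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}

def one_hot(seq):
    """(len(seq), 4) one-hot matrix; unrecognized bases (e.g. N) stay all-zero."""
    mat = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASE_INDEX.get(base)
        if j is not None:
            mat[i, j] = 1.0
    return mat

def encode_individual(hap1, hap2):
    """Average the two haplotype one-hot matrices."""
    return (one_hot(hap1) + one_hot(hap2)) / 2.0

x = encode_individual("ACGTT", "ACTTT")
# Position 2 is heterozygous (G/T), so x[2] is [0, 0, 0.5, 0.5].

In the cited work the sequences are 49 kb windows centered on each gene's TSS; the short strings here are only for illustration.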
“…Previous work used far fewer individuals and did not evaluate across them [7,24]. To address this we developed Performer, a fine-tuning strategy that implements cross-individual training and evaluation of sequence-to-expression neural network models. Briefly, we modified the Enformer architecture [15] by replacing the output head with one that predicts tissue-specific gene expression as a scalar value rather than a genomic track and implemented fine-tuning with Enformer's weights as starting values for the parameters in the model trunk and a custom loss function (Methods).…”
Citation type: mentioning (confidence: 99%)
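A minimal PyTorch sketch of the head replacement this statement describes, with a stand-in trunk in place of Enformer's pretrained trunk: the track-prediction head is swapped for one that pools the trunk embeddings and outputs a scalar per tissue, and the whole model is fine-tuned. Class names, shapes, the tissue count, and the MSE loss are illustrative assumptions; the cited work initializes the trunk with Enformer's weights and uses a custom loss.

import torch
import torch.nn as nn

class ScalarExpressionHead(nn.Module):
    """Pools trunk embeddings and predicts one expression value per tissue."""
    def __init__(self, embed_dim, n_tissues):
        super().__init__()
        self.linear = nn.Linear(embed_dim, n_tissues)

    def forward(self, emb):                    # emb: (batch, positions, embed_dim)
        return self.linear(emb.mean(dim=1))    # (batch, n_tissues)

class FineTunedModel(nn.Module):
    def __init__(self, trunk, embed_dim, n_tissues):
        super().__init__()
        self.trunk = trunk                     # would be initialized from pretrained weights
        self.head = ScalarExpressionHead(embed_dim, n_tissues)

    def forward(self, one_hot_seq):
        return self.head(self.trunk(one_hot_seq))

# Stand-in trunk so the sketch runs; a real run would load the pretrained Enformer trunk here.
trunk = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 32))
model = FineTunedModel(trunk, embed_dim=32, n_tissues=10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()                         # stand-in for the custom loss in the cited work

x = torch.zeros(8, 256, 4)                     # toy (batch, positions, 4) one-hot input
y = torch.zeros(8, 10)                         # toy per-tissue expression targets
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()

Predicting a scalar per tissue rather than a genomic track makes per-individual expression the training target, which is the cross-individual setup the statement describes.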