2022
DOI: 10.1093/bioinformatics/btac135

fastISM: performant in silico saturation mutagenesis for convolutional neural networks

Abstract: Motivation: Deep learning models such as convolutional neural networks are able to accurately map biological sequences to associated functional readouts and properties by learning predictive de novo representations. In-silico saturation mutagenesis (ISM) is a popular feature attribution technique for inferring contributions of all characters in an input sequence to the model's predicted output. The main drawback of ISM is its runtime, as it involves multiple forward propagations of all possible mutations of each character in the input sequence through the trained model.
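To make the runtime bottleneck concrete, below is a minimal sketch of naive ISM in Python. The toy Keras model, sequence length, and one-hot encoding are illustrative assumptions for this sketch, not fastISM's actual API.

```python
import numpy as np
import tensorflow as tf

# Toy setup (assumption for illustration): a small CNN over one-hot DNA input.
L = 100
model = tf.keras.Sequential([
    tf.keras.Input(shape=(L, 4)),
    tf.keras.layers.Conv1D(8, 5, padding="same", activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1),
])

# One-hot encode a random reference sequence: shape (L, 4).
x = np.eye(4, dtype=np.float32)[np.random.randint(4, size=L)]
ref = float(model(x[None])[0, 0])  # prediction on the reference sequence

# Naive ISM: one forward pass per position per alternate base (3L passes).
ism = np.zeros((L, 4), dtype=np.float32)
for i in range(L):
    for b in range(4):
        if x[i, b] == 1.0:
            continue  # reference base: effect is zero by definition
        mut = x.copy()
        mut[i] = 0.0
        mut[i, b] = 1.0
        ism[i, b] = float(model(mut[None])[0, 0]) - ref
```

Each scored sequence costs 3L extra forward passes (three alternate bases per position); this is the loop fastISM accelerates by reusing computation in the parts of the network unaffected by a single-position change.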


Cited by 13 publications (6 citation statements); references 25 publications.
“…Attribution methods quantify the importances of individual nucleotides in the input sequences using forward- (e.g. in silico mutagenesis 6, 19) or back-propagation 20, 21. These importance scores can be the basis for further clustering of activating sub-sequences into PWMs 22, which, as with filter visualization, can in turn be compared to known TF motifs for biological insights.…”
Section: Introduction (mentioning; confidence: 99%)
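To make the forward- versus back-propagation contrast in the quoted statement concrete, here is a minimal saliency-style sketch under the same toy-model assumptions as above. Gradient-times-input at the reference base is one common readout convention, not the only one.

```python
import numpy as np
import tensorflow as tf

# Toy setup as in the ISM sketch above (illustrative assumptions).
L = 100
model = tf.keras.Sequential([
    tf.keras.Input(shape=(L, 4)),
    tf.keras.layers.Conv1D(8, 5, padding="same", activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1),
])
x = np.eye(4, dtype=np.float32)[np.random.randint(4, size=L)]

# Back-propagation-based importance: one backward pass instead of 3L forward passes.
xt = tf.convert_to_tensor(x[None])
with tf.GradientTape() as tape:
    tape.watch(xt)
    y = model(xt)[0, 0]
grad = tape.gradient(y, xt)[0].numpy()  # (L, 4) gradient of output w.r.t. input
saliency = (grad * x).sum(axis=1)       # per-position score at the reference base
```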
“…TISM's strength comes through especially for long sequences (e.g., >20 kb), and it is therefore extremely useful for detecting, extracting, and comparing regulatory motifs across sequences and tasks 4. While not as accurate as FastISM 23 or Yuzu 24 (which compute exact ISM rather than an approximation), TISM, in contrast, is applicable to any network written in any code base and any number of sequences, and requires only a few lines of code to turn the model's gradient into TISM values.…”
Section: Discussion (mentioning; confidence: 99%)
“…Here, we show how one can very simply approximate ISM from the model's gradient. Approximating ISM enables the analysis of both large sets of sequences and long sequences (e.g., >100 kb). While not as accurate as FastISM 23 or Yuzu 24 (which compute exact ISM rather than an approximation), TISM is applicable to any type of network, requires only a few lines of code to turn the model's gradient on the reference sequence into TISM, and also requires less computation time. We show that the majority of TISM values (89%, >0.58) correlate well with ISM values from different model initializations, suggesting that TISM is sufficient to understand the model's learned regulatory grammar and predict effects of sequence variants across different loci.…”
Section: Discussion (mentioning; confidence: 99%)
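The two statements above describe approximating ISM with a first-order Taylor expansion of the model around the reference sequence. The sketch below is an illustrative reconstruction of that idea under the same toy-model assumptions, not the TISM authors' exact code.

```python
import numpy as np
import tensorflow as tf

# Toy setup as in the earlier sketches (illustrative assumptions).
L = 100
model = tf.keras.Sequential([
    tf.keras.Input(shape=(L, 4)),
    tf.keras.layers.Conv1D(8, 5, padding="same", activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1),
])
x = np.eye(4, dtype=np.float32)[np.random.randint(4, size=L)]

# One gradient on the reference sequence.
xt = tf.convert_to_tensor(x[None])
with tf.GradientTape() as tape:
    tape.watch(xt)
    y = model(xt)[0, 0]
grad = tape.gradient(y, xt)[0].numpy()  # (L, 4)

# First-order Taylor expansion: substituting base b at position i changes the
# one-hot input by +1 at b and -1 at the reference base, so
#   ISM[i, b] ~ grad[i, b] - grad[i, ref_base(i)].
ref_grad = (grad * x).sum(axis=1, keepdims=True)  # gradient at each reference base
tism = grad - ref_grad                            # (L, 4); zero at reference bases
```

A single backward pass replaces the 3L forward passes of exact ISM, which is why a gradient-based approximation scales to very long sequences and large sequence sets.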
“…By contrast, GOPHER provides a one-stop shop for data processing of peak-based classification and quantitative regression analysis, training with data augmentations, and comprehensive model evaluation. GOPHER incorporates many popular model interpretability tools, such as first-layer filter visualization, global importance analysis, and attribution methods, including in silico mutagenesis 54, 55, saliency maps 56, integrated gradients 57, and SmoothGrad 58.…”
Section: Discussion (mentioning; confidence: 99%)
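Of the attribution methods listed in the statement above, integrated gradients admits a compact sketch. The all-zeros baseline and step count below are conventional illustrative choices, not GOPHER's specific settings.

```python
import numpy as np
import tensorflow as tf

# Toy setup as in the earlier sketches (illustrative assumptions).
L = 100
model = tf.keras.Sequential([
    tf.keras.Input(shape=(L, 4)),
    tf.keras.layers.Conv1D(8, 5, padding="same", activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1),
])
x = np.eye(4, dtype=np.float32)[np.random.randint(4, size=L)]

# Average gradients along a straight path from an all-zeros baseline to x.
steps = 50
baseline = np.zeros_like(x)
grads = []
for a in np.linspace(0.0, 1.0, steps + 1, dtype=np.float32):
    xi = tf.convert_to_tensor((baseline + a * (x - baseline))[None])
    with tf.GradientTape() as tape:
        tape.watch(xi)
        yi = model(xi)[0, 0]
    grads.append(tape.gradient(yi, xi)[0].numpy())

# Attributions: (input - baseline) times the path-averaged gradient; their sum
# approximates model(x) - model(baseline) (the completeness property).
ig = (x - baseline) * np.mean(grads, axis=0)
```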