2023
DOI: 10.1016/j.cels.2023.07.003

Learning protein fitness landscapes with deep mutational scanning data from multiple sources

Cited by 4 publications (6 citation statements)
References 61 publications
“…Regime extrapolation tests a model’s ability to predict how mutations combine by training on single amino acid substitutions and predicting the effects of multiple substitutions [18, 19, 21, 22] (Figs. 3c and S4).…”
Section: Results (mentioning)
confidence: 99%
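The regime-extrapolation setup described in the statement above (train on single substitutions, evaluate on variants with multiple substitutions) can be sketched as a simple data split. This is a minimal illustration, not the cited authors' code: the variant string format (`"A24G,L56P"`), the helper names `count_substitutions` and `regime_extrapolation_split`, and the toy fitness scores are all assumptions made for the example.

```python
from typing import List, Tuple

Record = Tuple[str, float]  # (variant string, measured fitness score)


def count_substitutions(variant: str) -> int:
    """Count substitutions in a comma-separated variant string.

    Assumed (hypothetical) format: "A24G,L56P"; "" or "WT" denotes wild type.
    """
    if variant in ("", "WT"):
        return 0
    return len(variant.split(","))


def regime_extrapolation_split(
    records: List[Record],
) -> Tuple[List[Record], List[Record]]:
    """Put single substitutions in the training set and multi-mutants in the test set."""
    train = [r for r in records if count_substitutions(r[0]) == 1]
    test = [r for r in records if count_substitutions(r[0]) >= 2]
    return train, test


# Toy records with made-up scores, for illustration only.
records = [
    ("WT", 1.00),
    ("A24G", 0.91),
    ("L56P", 0.40),
    ("A24G,L56P", 0.35),
    ("A24G,L56P,K80R", 0.10),
]

train, test = regime_extrapolation_split(records)
```

In this sketch the wild-type record falls in neither set; whether to include it in training is a design choice that depends on the model being evaluated.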
“…types at an aggregate level (Strokach et al, 2021; Høie et al, 2022; Cagiada et al, 2023; Nguyen and Hy, 2023), although some results suggest that a richer representation might be learned by combining multiple data types at the input level (Mansoor et al, 2021; Wu et al, 2023; Wang et al, 2022; Yang et al, 2022; Chen et al, 2023; Cheng et al, 2023; Zhang et al, 2023).…”
Section: GNN (mentioning)
confidence: 99%
“…Examples of the types of data used as input include the wild-type amino acid sequence (Lin et al, 2022; Brandes et al, 2022), a multiple sequence alignment (MSA) (Ng and Henikoff, 2001; Balakrishnan et al, 2011; Lui and Tiana, 2013; Nielsen et al, 2017; Hopf et al, 2017; Riesselman et al, 2018; Laine et al, 2019) or the protein structure (Boomsma and Frellsen, 2017; Jing et al, 2021a; Hsu et al, 2022). Some methods have combined predictions from multiple protein data types at an aggregate level (Strokach et al, 2021; Høie et al, 2022; Cagiada et al, 2023; Nguyen and Hy, 2023), although some results suggest that a richer representation might be learned by combining multiple data types at the input level (Mansoor et al, 2021; Wu et al, 2023; Wang et al, 2022; Yang et al, 2022; Chen et al, 2023; Cheng et al, 2023; Zhang et al, 2023).…”
Section: Introduction (mentioning)
confidence: 99%
“…These vector representations capture the evolutionary and biophysical features of protein sequences in the representation model's latent embedding space [29, 30]. Indeed, PLMs have been used to learn the structure of fitness landscapes [33-35], and can learn evolutionary features when trained with ancestrally reconstructed sequence data [36].…”
Section: Introduction (mentioning)
confidence: 99%