2022
DOI: 10.1101/2022.11.15.516532
Preprint

Cross-protein transfer learning substantially improves disease variant prediction

Abstract: Genetic variation in the human genome is a major determinant of individual disease risk, but the vast majority of missense variants have unknown etiological effects. Various computational strategies have been proposed to predict the effects of missense variants across the human proteome, using many different predictive signals. Here, we present a robust learning framework for leveraging functional assay data to construct computational predictors of disease variant effects. We train cross-protein transfer (CPT)…

Citations: cited by 11 publications (13 citation statements)
References: 56 publications
“…Raft-Seq enables the selection of live single cells with various indel mutations from the primary phenotyping, and several isogenic clones are available for SLC25A19 and ATAD3A upon request. Additionally, it is worth mentioning the advent of tools in the realm of machine learning-enabled variant prediction, including AlphaMissense and CPT 46,47 . While such tools excel in predicting substitution variant effects, they face limitations in their ability to predict the functional consequences of indel missense variants.…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…In the last couple of years, a lot of attention has been drawn to optimizing, ensembling, clustering, subsampling, and pairing alignments toward improving protein 3D models (Petti et al, 2023), generating multiple functional conformations (Wayment-Steele et al, 2022), and resolving interactomes (Bret et al, 2023;Bryant et al, 2022). In the context of disease variants calling, Jagota and co-authors recently showed that vertebrate alignments exhibit a strong signal that can be used to boost specificity (Jagota et al, 2022).…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…Indeed, we and others have demonstrated that large language models can be effectively tailored to predict the effects of variants on different alternative spliced isoforms. 67,68,70,140 Furthermore, these models can also be employed to functionalize short insertions and deletions (indels) with promising performance. 63 These recent results underscore the immense potential of large language models in the field of genomics, ultimately driving our understanding of genetic diseases and accelerating the discovery of potential therapies.…”
Section: Complex and Somatic Variants; citation type: mentioning
confidence: 99%