2020
DOI: 10.1101/2020.10.26.354753
Preprint

Machine learning models for predicting protein condensate formation from sequence determinants and embeddings

Abstract: Intracellular phase separation of proteins into biomolecular condensates is increasingly recognised as an important phenomenon for cellular compartmentalisation and regulation of biological function. Different hypotheses about the parameters that determine the tendency of proteins to form condensates have been proposed with some of them probed experimentally through the use of constructs generated by sequence alterations. To broaden the scope of these observations, here, we established an in silico strategy fo…

Cited by 6 publications (8 citation statements)
References 39 publications (55 reference statements)
“…PSPer was based on empirical rules defined for a small class of proteins (FUS-like family) (Wang et al, 2018) because very little data about other LLPS proteins were available at the time. Other sequence-based LLPS predictors have also been developed afterwards, using Machine Learning (ML) classifiers such as Random Forest (Saar et al, 2020) or Support Vector Machines (Sun et al, 2019). A deeper analysis of the current state of the art LLPS predictors from sequence only is available in Vernon and Forman-Kay (2019). Recently, more data related to LLPS proteins have been gathered in publicly available databases (Mészáros et al, 2020; You et al, 2020).…”
Section: Introduction (mentioning)
confidence: 99%
“…To the best of our knowledge, Droppler is the first model designed to predict the likelihood of proteins to undergo LLPS given a specific set of experimental conditions because existing approaches (Saar et al, 2020;Sun et al, 2019;Vernon and Forman-Kay, 2019;Orlando et al, 2019) focus only on the protein sequence and predict some sort of average LLPS propensity or in very specific experimental conditions. For instance, in Saar et al (2020) the authors only predict the LLPS propensity of a protein in "nearly physiological conditions".…”
Section: Introduction (mentioning)
confidence: 99%
“…This is because many different combinations of relevant interactions seem to be contributing to phase separation without anyone being universally necessary [12]. So far, however, with a few exceptions [13][14][15][16] mostly case-by-case studies of different sequences are performed, with the broader context of many findings, including their statistical significance remaining unknown.…”
Section: Introduction (mentioning)
confidence: 99%
“…As a consequence (with a few exceptions [13][14][15][16] ) mostly case-by-case studies of different sequences are performed, with the broader context of many findings, including their statistical significance remaining unknown.…”
Section: Introduction (mentioning)
confidence: 99%