Since the release of AlphaFold2 results for entire data sets, predictions of millions of protein 3D structures are only a few clicks away. However, many proteins have so-called intrinsically disordered regions (IDRs) that do not adopt unique structures in isolation. These IDRs are associated with several diseases, including Alzheimer's disease. We showed that the absence of reliable AlphaFold2 predictions correlated only to a limited extent with IDRs. In contrast, many expert methods predict IDRs directly and reliably by combining complex machine learning models with expert-crafted input features and evolutionary information from multiple sequence alignments. Some of these input features are not always available and are computationally expensive to generate, which limits the scalability of such methods. In this work, we present SETH, a novel method that predicts residue disorder from embeddings generated by the protein language model ProtT5, which uses only single sequences as input. SETH, a relatively shallow convolutional neural network, outperformed much more complex state-of-the-art solutions while being much faster: predictions for the human proteome can be created in fewer than 30 minutes on a machine with one RTX A6000 GPU with 48 GB of RAM. Trained on a continuous disorder scale, our method captured subtle variations in disorder, thereby providing important information beyond the binary classification of other predictors. The new method is freely available at: https://github.com/DagmarIlz/SETH.
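To make the architecture described above more concrete, the following is a minimal, illustrative sketch of a shallow convolutional network that maps per-residue embeddings to one continuous score per residue. It is not the SETH implementation: the layer count, kernel size, hidden width, and the tiny embedding dimension are assumptions chosen for readability, the weights are random rather than trained, and the toy input stands in for real ProtT5 embeddings. All function and parameter names (`conv1d`, `predict_disorder`, `params`) are hypothetical.

```python
import random

def conv1d(x, weights, bias):
    """Zero-padded ("same") 1D convolution over the residue axis.

    x:       list of L residues, each a list of c_in floats
    weights: kernel indexed as [c_out][k][c_in], with k odd
    bias:    list of c_out floats
    returns: list of L residues, each a list of c_out floats
    """
    length, k, c_in = len(x), len(weights[0]), len(x[0])
    half = k // 2
    out = []
    for i in range(length):
        row = []
        for w_out, b in zip(weights, bias):
            acc = b
            for j in range(k):
                pos = i + j - half
                if 0 <= pos < length:  # positions outside the sequence count as zeros
                    acc += sum(w_out[j][c] * x[pos][c] for c in range(c_in))
            row.append(acc)
        out.append(row)
    return out

def relu(x):
    """Element-wise rectifier."""
    return [[max(0.0, v) for v in row] for row in x]

def random_kernel(rng, c_out, k, c_in):
    """Untrained placeholder weights (assumption: a real model would load trained ones)."""
    return [[[rng.uniform(-0.1, 0.1) for _ in range(c_in)]
             for _ in range(k)] for _ in range(c_out)]

def predict_disorder(embeddings, params):
    """Two convolutional layers: embeddings -> hidden features -> one score per residue."""
    hidden = relu(conv1d(embeddings, params["w1"], params["b1"]))
    scores = conv1d(hidden, params["w2"], params["b2"])
    return [row[0] for row in scores]

# Toy sizes, assumed for illustration only; real embeddings are far wider.
EMB_DIM, HIDDEN, KERNEL, SEQ_LEN = 8, 4, 5, 12

rng = random.Random(0)
params = {
    "w1": random_kernel(rng, HIDDEN, KERNEL, EMB_DIM),
    "b1": [0.0] * HIDDEN,
    "w2": random_kernel(rng, 1, KERNEL, HIDDEN),
    "b2": [0.0],
}

# Stand-in for per-residue embeddings of a 12-residue protein.
toy_embeddings = [[rng.gauss(0.0, 1.0) for _ in range(EMB_DIM)] for _ in range(SEQ_LEN)]
scores = predict_disorder(toy_embeddings, params)
print(len(scores))  # one continuous score per residue
```

Because the output is a continuous value per residue rather than a label, a binary disordered/ordered call would require an additional threshold, which this sketch deliberately leaves out.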