2022
DOI: 10.1101/2022.08.04.502750
Preprint

LambdaPP: Fast and accessible protein-specific phenotype predictions

Abstract: The availability of accurate and fast Artificial Intelligence (AI) solutions predicting aspects of proteins is revolutionizing experimental and computational molecular biology. The webserver LambdaPP aspires to supersede PredictProtein, the first internet server making AI protein predictions available in 1992. Given a protein sequence as input, LambdaPP provides easily accessible visualizations of protein 3D structure, along with predictions at the protein level (GeneOntology, subcellular location), and the r…

Cited by 6 publications (9 citation statements)
References 95 publications (203 reference statements)
“…Due to memory constraints, we removed proteins >430 residues, yielding 93,286 sequences for training. To avoid bias from overrepresented families, we clustered the training set with UniqueProt 50 at the default Hval=0. Following AlphaFold developers 4 , we trained on randomly picked samples from the resulting 11,580 clusters.…”
Section: Methods
Citation type: mentioning
confidence: 99%
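
The length filter and cluster-based sampling described in this statement can be sketched as follows. This is a minimal illustration, not the citing authors' code: the function names, the dictionary inputs, and the choice of drawing exactly one member per cluster are assumptions. Only the 430-residue cutoff comes from the quoted text.

```python
import random
from collections import defaultdict

# Cutoff taken from the quoted statement: proteins longer than 430 residues are removed.
MAX_LEN = 430


def filter_by_length(seqs, max_len=MAX_LEN):
    """Keep only sequences with at most max_len residues (hypothetical helper)."""
    return {pid: seq for pid, seq in seqs.items() if len(seq) <= max_len}


def sample_one_per_cluster(cluster_of, rng):
    """Pick one random member from every cluster.

    cluster_of maps a protein ID to a cluster ID. Drawing a single member
    per cluster is an assumption; the statement only says samples were
    "randomly picked" from the clusters.
    """
    members = defaultdict(list)
    for pid, cid in cluster_of.items():
        members[cid].append(pid)
    # Sort clusters and members so a seeded rng gives reproducible picks.
    return [rng.choice(sorted(ids)) for _, ids in sorted(members.items())]


if __name__ == "__main__":
    toy = {"p1": "A" * 430, "p2": "A" * 431, "p3": "ACD"}
    kept = filter_by_length(toy)
    clusters = {pid: 0 for pid in kept}
    print(sorted(kept), sample_one_per_cluster(clusters, random.Random(0)))
```

Resampling with a fresh draw each epoch would let every cluster contribute equally while still exposing different members over time; whether the citing work did this is not stated.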
“…The proteins sharing the same domain composition will have exactly the same MSAs. To avoid such redundancy, we focused on a subset of 59 proteins extracted with an adjusted version of UniqueProt [29, 30]. Instead of PSI-BLAST we used MMseqs2 to improve runtime, and discarded alignments of less than 50 residues for pairs of sequences with at least 180 residues to prevent very short alignments from removing longer sequences.…”
Section: Methods
Citation type: mentioning
confidence: 99%
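
The alignment-length rule in this statement can be expressed as a small predicate. This is a sketch under stated assumptions: the function name and signature are hypothetical, and "pairs of sequences with at least 180 residues" is read here as both sequences being at least 180 residues long. The thresholds 50 and 180 come from the quoted text.

```python
def keep_alignment(aln_len, len_a, len_b, min_aln=50, min_seq=180):
    """Return False for alignments that the quoted rule would discard.

    An alignment shorter than min_aln residues is dropped when both
    sequences have at least min_seq residues (an assumed reading of the
    statement). All other alignments are kept.
    """
    both_long = min(len_a, len_b) >= min_seq
    return not (both_long and aln_len < min_aln)


if __name__ == "__main__":
    # Short alignment between two long sequences: discarded.
    print(keep_alignment(40, 200, 300))
    # Same alignment length, but one sequence is short: kept.
    print(keep_alignment(40, 100, 300))
```

Discarding such hits means a 40-residue local match cannot, on its own, cause one of two long proteins to be removed as redundant.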
“…Calculating the structure bonus values for all three ligand classes of all 1010 proteins and writing them to disk only took around 5 seconds, resulting in a runtime of around 5 milliseconds per protein. We excluded the time for computing the structure and binding predictions from the runtime analysis as with the release of the fourth version of AlphaFold Protein Structure Database (AFDB), 200 million protein structures are available [21] for download and tools like bio-embeddings [23] and LambdaPP [24] allow for fast and easy bindEmbed21DL predictions.…”
Section: Methods
Citation type: mentioning
confidence: 99%
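
The per-protein runtime in this statement follows directly from the two reported figures. A quick check, using only the numbers quoted above (about 5 seconds for 1010 proteins); the function name is hypothetical:

```python
def ms_per_item(total_seconds, n_items):
    """Average runtime per item in milliseconds."""
    return total_seconds / n_items * 1000.0


if __name__ == "__main__":
    # About 5 s for 1010 proteins, as reported in the quoted statement.
    print(f"{ms_per_item(5.0, 1010):.2f} ms per protein")
```

This gives just under 5 ms per protein, consistent with the "around 5 milliseconds" in the statement.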