Distance geometry and protein loop modeling

Labiak, Rodrigo; Souza, Michael

doi:10.1002/jcc.26796

Cited by 2 publications

(1 citation statement)

References 43 publications


“…Unfortunately, to our best knowledge, there is a lack of systematic evaluations on the predictive performance of existing methods on loop regions, and no benchmark datasets containing large and diverse data have been made available thus far. Existing datasets suffer from three key shortcomings: (1) they require updating [66], with most test datasets proposed over a decade ago [47, 67, 68], (2) longer loops, especially those exceeding 15 residues, are often ignored [69, 70] and (3) the data coverage and volume are limited, consisting of only ~100 samples and a few protein types [42, 52, 71]. Therefore, evaluations based on these datasets may not adequately reflect actual model performance.…”

Section: Introduction (mentioning)

confidence: 99%

Comprehensive assessment of protein loop modeling programs on large-scale datasets: prediction accuracy and efficiency

Wang, Zhang, et al. 2023

Briefings in Bioinformatics


Protein loops play a critical role in the dynamics of proteins and are essential for numerous biological functions, and various computational approaches to loop modeling have been proposed over the past decades. However, a comprehensive understanding of the strengths and weaknesses of each method is lacking. In this work, we constructed two high-quality datasets (i.e. the General dataset and the CASP dataset) and systematically evaluated the accuracy and efficiency of 13 commonly used loop modeling approaches from the perspective of loop lengths, protein classes and residue types. The results indicate that the knowledge-based method FREAD generally outperforms the other tested programs in most cases, but encountered challenges when predicting loops longer than 15 and 30 residues on the CASP and General datasets, respectively. The ab initio method Rosetta NGK demonstrated exceptional modeling accuracy for short loops with four to eight residues and achieved the highest success rate on the CASP dataset. The well-known AlphaFold2 and RoseTTAFold require more resources for better performance, but they exhibit promise for predicting loops longer than 16 and 30 residues in the CASP and General datasets. These observations can provide valuable insights for selecting suitable methods for specific loop modeling tasks and contribute to future advancements in the field.
