High-throughput computational pipeline for 3-D structure preparation and in silico protein surface property screening: A case study on HBcAg dimer structures

Klijn, Marieke E.; Vormittag, Philipp; Bluthardt, Nicolai; Hubbuch, Jürgen

doi:10.1016/j.ijpharm.2019.03.057

Cited by 2 publications

(2 citation statements)

References 81 publications


“…3-D structure-based methods include the prediction of soluble expression by molecular dynamics (MD)-simulated unfolding combined with a support vector machine (SVM) architecture (Schaller et al, 2015), dynamic exposure of hydrophobic patches in MD simulations (Chennamsetty et al, 2009; Jamroz et al, 2014), and projection of sequence-based methods onto 3-D structures (Sormanni et al, 2015; Zambrano et al, 2015). Although high-throughput 3-D structure generation of VLP building blocks has been described previously (Klijn et al, 2019), the computational cost of creating 3-D structures is still high, limiting the applicability of this approach in candidate selection for several hundred molecules. Amino acid sequence-based methods can be distinguished into amino acid composition-based algorithms such as machine learning approaches using SVM or random forest classifiers (Magnan et al, 2009; Agostini et al, 2012; Samak et al, 2012; Xiaohui et al, 2014; Yang et al, 2016) and sliding-window-based algorithms, such as AGGRESCAN, Zyggregator, and CamSol (Conchillo-Sole et al, 2007; Tartaglia et al, 2008; Sormanni et al, 2015).…”

Section: Introduction (mentioning)

confidence: 99%

Ensembles of Hydrophobicity Scales as Potent Classifiers for Chimeric Virus-Like Particle Solubility – An Amino Acid Sequence-Based Machine Learning Approach

Vormittag

Klamp

Hubbuch

2020

Front. Bioeng. Biotechnol.

Self Cite

“…Based on zeta potential measurements, this behavior could be related to surface charge. In another study, a high-throughput 3-D structure generation workflow was developed that we applied on exactly these three constructs in their disassembled form to calculate a surface charge that correlated well with the zeta potential measurements (Klijn et al, 2019). This is a good example of in silico representations of physicochemical properties, which pave the way for model-assisted rather than empirical process development.…”

Section: Introduction (mentioning)

confidence: 99%

Optimization of a Soft Ensemble Vote Classifier for the Prediction of Chimeric Virus-Like Particle Solubility and Other Biophysical Properties

Vormittag

Klamp

Hubbuch

2020

Front. Bioeng. Biotechnol.

Self Cite


Chimeric virus-like particles (cVLPs) are protein-based nanostructures applied as investigational vaccines against infectious diseases, cancer, and immunological disorders. Low solubility of cVLP vaccine candidates is a challenge that can prevent the development of these candidates. Solubility of cVLPs is typically assessed empirically, leading to high time and material requirements. Prediction of cVLP solubility in silico can aid in reducing this effort. Protein aggregation by hydrophobic interaction is an important factor driving protein insolubility. In this article, a recently developed soft ensemble vote classifier (sEVC) for the prediction of cVLP solubility was used based on 91 literature amino acid hydrophobicity scales. Optimization algorithms were developed to boost model performance, and the model was redesigned as a regression tool for the ammonium sulfate concentration required for cVLP precipitation. The present dataset consists of 568 cVLPs, created by insertion of 71 different peptide sequences using eight different insertion strategies. Two optimization algorithms were developed that (I) modified the sEVC with regard to systematic misclassification based on the different insertion strategies, and (II) modified the amino acid hydrophobicity scale tables to improve classification. The second algorithm was additionally used to synthesize scales from random vectors. Compared to the unmodified model, the Matthews correlation coefficient (MCC) and accuracy of the test set predictions increased from 0.63 and 0.81 to 0.77 and 0.88, respectively, for the best models. This improved performance compared to literature scales was suggested to be due to a decreased correlation between synthesized scales. In these, tryptophan was identified as the most hydrophobic amino acid, i.e., the amino acid most problematic for cVLP solubility, supported by previous literature findings. As a case study, the sEVC was redesigned as a regression tool and applied to determine ammonium sulfate concentrations for the
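The abstract above reports a soft ensemble vote classifier evaluated by MCC and accuracy. As a rough illustration of those two ideas only, the sketch below averages positive-class probabilities from several ensemble members and scores the resulting predictions. It is not the authors' implementation: the function names, the 0.5 threshold, and all probabilities and labels are hypothetical values chosen for the example.

```python
from math import sqrt


def soft_vote(member_probs, threshold=0.5):
    """Soft voting: average the members' positive-class probabilities
    and call the sample positive if the mean reaches the threshold."""
    mean_prob = sum(member_probs) / len(member_probs)
    return mean_prob >= threshold


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for boolean labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when the
    denominator vanishes (a common convention)."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical data: three ensemble members, four constructs.
# Each inner list holds one construct's "soluble" probabilities.
member_probabilities = [
    [0.9, 0.8, 0.7],
    [0.2, 0.4, 0.3],
    [0.6, 0.4, 0.7],
    [0.6, 0.3, 0.4],
]
observed_soluble = [True, False, True, True]  # made-up labels

predicted_soluble = [soft_vote(p) for p in member_probabilities]
counts = confusion_counts(observed_soluble, predicted_soluble)
print("predictions:", predicted_soluble)
print("accuracy:", accuracy(*counts))
print("MCC:", round(mcc(*counts), 3))
```

MCC uses all four confusion-matrix cells, so it is less easily inflated than accuracy when the soluble and insoluble classes are unbalanced, which may be one reason both metrics are reported together.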
