Traditional bioinformatics methods have systematically compared halophilic proteins with their non-halophilic homologues to investigate the features related to hypersaline adaptation. Proposing quantitative models that explain the sequence-characteristic relationship of halophilic proteins might therefore shed new light on haloadaptation and help to design new biocatalysts adapted to high salt concentrations. Five machine learning algorithms, three linear and two non-linear, were used to discriminate halophilic proteins from their non-halophilic counterparts, and the prediction accuracy was encouraging. The best prediction reliability for halophilic proteins, 80%, was achieved by the artificial neural network and the support vector machine; for non-halophilic proteins, the best reliability, 100%, was achieved by linear regression. In addition, the linear models captured some clues to protein halo-stability. Among them, the lower frequency of Ser in halophilic proteins has not been reported before.
Background: Effective and simple methods that lead to higher enzymatic efficiencies are highly sought. Here we propose foldon-triggered trimerization of target enzymes, which significantly improves catalytic performance, by fusing a foldon domain to the C-terminus of the enzymes via elastin-like polypeptides (ELPs). The foldon domain comprises 27 residues and can form highly stable trimers. Results: Lichenase and xylanase hydrolyze lichenan and xylan to produce value-added products and biofuels, and they have great potential as biotechnological tools in various industrial applications. We took them as examples and compared the kinetic parameters of the engineered trimeric enzymes with those of the monomeric and wild-type ones. Compared with the monomeric enzymes, the catalytic efficiency (kcat/Km) of the trimeric lichenase and xylanase increased 4.2- and 3.9-fold, respectively. The catalytic constant (kcat) of the trimeric lichenase and xylanase increased 1.8-fold and 5.0-fold relative to their wild-type counterparts. The specific activities of the trimeric lichenase and xylanase were also 149% and 94% higher than those of the monomeric ones. In addition, the recovery of lichenase and xylanase activities increased by 12.4% and 6.1% during purification using ELPs as the non-chromatographic tag. A possible reason is that the foldon domain reduces the transition temperature of the ELPs. Conclusion: The foldon-induced trimeric lichenase and xylanase show improved catalytic performance. They were also easier to purify, with increased purification fold and reduced loss of activity compared with their monomeric counterparts. Foldon-triggered trimerization of target enzymes could improve their activities and facilitate purification, and thus represents a simple and effective enzyme-engineering tool. It should have exciting potential at both industrial and laboratory scales. Electronic supplementary material: The online version of this article (doi:10.1186/s12896-017-0380-3) contains supplementary material, which is available to authorized users.
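The foldon abstract above reports improvements both as fold changes (for kcat/Km and kcat) and as percentage increases (for specific activity), which are easy to conflate. The following is a minimal sketch of the arithmetic only; the function names are illustrative and the kinetic values in the usage note are hypothetical placeholders, not data from the study.

```python
def catalytic_efficiency(kcat, km):
    """Catalytic efficiency kcat/Km (units follow the inputs, e.g. s^-1 / mM)."""
    if km <= 0:
        raise ValueError("Km must be positive")
    return kcat / km


def fold_change(new_value, reference_value):
    """Ratio of the engineered value to the reference value."""
    if reference_value == 0:
        raise ValueError("reference value must be non-zero")
    return new_value / reference_value


def percent_increase_to_fold(percent):
    """Convert 'increased by X%' into the equivalent ratio to the reference."""
    return 1.0 + percent / 100.0
```

For example, with hypothetical values `catalytic_efficiency(10.0, 2.0)` for a monomer and `catalytic_efficiency(21.0, 1.0)` for a trimer, `fold_change` gives a ratio of 4.2. Read this way, a 149% increase in specific activity means the trimeric enzyme has about 2.49 times the activity of the monomeric one.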
Understanding and identifying proteins adapted to hypersaline environments is a challenging task that would help in designing stable proteins. Here, we systematically analyzed the normalized amino acid compositions of 2121 halophilic and 2400 non-halophilic proteins. The results showed that halophilic proteins contained more Asp at the expense of Lys, Ile, Cys and Met, contained fewer small and hydrophobic residues, and showed a large excess of acidic over basic amino acids. We then introduced a support vector machine method to discriminate halophilic from non-halophilic proteins, using a novel Pearson VII universal function based kernel. Across the three validation methods, it achieved overall accuracies of 97.7%, 91.7% and 86.9% and outperformed other machine learning algorithms. We also examined the influence of protein size on prediction accuracy and found that the worse performance for small proteins might arise because some significant residues (Cys and Lys) were missing from those proteins.
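The two ingredients named in the abstract above, amino acid composition features and a Pearson VII universal function kernel (PUK), can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: composition is taken here as simple residue frequency (the abstract does not say how it was normalized), the kernel uses the commonly published PUK form with width `sigma` and tailing factor `omega`, and the default parameter values are arbitrary.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(sequence):
    """Fraction of each standard residue in a protein sequence.

    Assumption: plain frequency normalization; non-standard letters are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def puk_kernel(x, y, sigma=1.0, omega=1.0):
    """Pearson VII universal kernel between two feature vectors.

    K(x, y) = 1 / (1 + (2 * ||x - y|| * sqrt(2**(1/omega) - 1) / sigma)**2) ** omega
    """
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    scaled = 2.0 * distance * math.sqrt(2.0 ** (1.0 / omega) - 1.0) / sigma
    return 1.0 / (1.0 + scaled ** 2) ** omega
```

Calling `puk_kernel(aa_composition(seq_a), aa_composition(seq_b))` yields a similarity in (0, 1] that equals 1 for identical compositions. A function of this shape could be passed to an SVM library that accepts precomputed or callable kernels.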
Background: Support vector machine (SVM), a novel and powerful machine learning technology, was used to develop a non-linear quantitative structure-property relationship (QSPR) model of the G/11 xylanase based on amino acid composition. The uniform design (UD) method was applied to optimize the running parameters of the SVM for the first time. Results: The optimum temperatures predicted in leave-one-out (LOO) cross-validation fitted the experimental optimum temperatures very well when the running parameters C, ε, and γ were 50, 0.001 and 1.5, respectively. The average root-mean-square error (RMSE) of the LOO cross-validation was 9.53 °C, while the RMSE of the back propagation neural network (BPNN) was 11.55 °C. The predictive ability of the SVM is a minor improvement over the BPNN, but it is superior to the reported method based on stepwise regression. Two experimental examples confirmed the validity of the model for predicting the optimal temperature of xylanase. Conclusion: The results indicated that UD might be an effective method to optimize the parameters of SVM, which could be used as an alternative powerful modeling tool for QSPR studies of xylanase.
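The evaluation described above, leave-one-out cross-validation scored by RMSE, can be sketched independently of the model. In the sketch below the `fit` and `predict` callables are hypothetical hooks, and a trivial mean predictor stands in for the SVM so that the example runs without external libraries; it is not the model used in the study.

```python
import math


def loo_rmse(features, targets, fit, predict):
    """Leave-one-out cross-validated RMSE.

    Each sample is held out once, the model is fitted on the remaining
    samples, and the squared error on the held-out sample is recorded.
    """
    if len(features) != len(targets) or len(targets) < 2:
        raise ValueError("need at least two samples with matching targets")
    squared_errors = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        model = fit(train_x, train_y)
        prediction = predict(model, features[i])
        squared_errors.append((prediction - targets[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Stand-in model (assumption): always predicts the mean of the training targets.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    return model
```

With an SVM regression library, `fit` and `predict` would wrap the library's training and prediction calls, with C, ε and γ set to the reported values of 50, 0.001 and 1.5.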