By reducing the amino acid alphabet, protein complexity can be significantly simplified, which can improve computational efficiency, decrease information redundancy and reduce the chance of overfitting. Although several reduced alphabets have been proposed, different classification rules can produce distinct results in protein sequence analysis. A systematic framework for reduced alphabets is therefore urgently needed. In this work, we constructed a comprehensive web server called RAACBook for protein sequence analysis and machine learning applications by integrating reduced alphabets. The web server contains three parts: (i) 74 types of reduced amino acid alphabet were manually extracted to generate 673 reduced amino acid clusters (RAACs) for dealing with specific protein problems. Users can easily select the desired RAACs from a multilayer browser tool. (ii) An online tool was developed to analyze the primary sequence of a protein. The tool produces the K-tuple reduced amino acid composition from three user-defined correlation parameters (K-tuple, g-gap, λ-correlation). The results are visualized as a sequence alignment, merged RAA composition, feature distribution and a logo of the reduced sequence. (iii) A machine learning server is provided to train protein classification models based on K-tuple RAAC. The optimal model can be selected according to the evaluation indexes (ROC, AUC, MCC, etc.). In conclusion, RAACBook provides a powerful and user-friendly service for protein sequence analysis and computational proteomics. RAACBook is freely available at http://bioinfor.imu.edu.cn/raacbook.
Database URL: http://bioinfor.imu.edu.cn/raacbook
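To make the idea of a K-tuple reduced amino acid composition concrete, the following is a minimal Python sketch, not RAACBook's own implementation. The five-cluster alphabet `EXAMPLE_CLUSTERS` and all function names are hypothetical and chosen only for illustration. The sketch covers the K-tuple and g-gap parameters and leaves out λ-correlation, whose exact definition is not given in the abstract.

```python
from collections import Counter
from itertools import product

# Hypothetical 5-cluster reduced alphabet (illustration only, not taken from RAACBook).
EXAMPLE_CLUSTERS = ["AVLIMC", "FWYH", "STNQ", "KR", "DEGP"]


def build_mapping(clusters):
    """Map every residue to the first letter of the cluster it belongs to."""
    return {aa: cluster[0] for cluster in clusters for aa in cluster}


def reduce_sequence(seq, mapping):
    """Rewrite a sequence in the reduced alphabet; unknown residues are skipped."""
    return "".join(mapping[aa] for aa in seq.upper() if aa in mapping)


def ktuple_composition(reduced, letters, k=2, gap=0):
    """Relative frequency of every k-tuple whose members are `gap` residues apart."""
    step = gap + 1                 # distance between consecutive tuple members
    span = (k - 1) * step + 1      # window length covered by one tuple
    counts = Counter(
        reduced[i:i + span:step] for i in range(len(reduced) - span + 1)
    )
    total = sum(counts.values())
    return {
        "".join(t): (counts["".join(t)] / total if total else 0.0)
        for t in product(letters, repeat=k)
    }


mapping = build_mapping(EXAMPLE_CLUSTERS)
letters = [cluster[0] for cluster in EXAMPLE_CLUSTERS]

reduced = reduce_sequence("MKVLAAGW", mapping)   # "AKAAAADF"
features = ktuple_composition(reduced, letters, k=2, gap=0)
print(reduced, len(features))
```

With five clusters and K = 2 the feature vector has 5² = 25 entries instead of the 20² = 400 needed for the full alphabet, which is the kind of dimensionality reduction the abstract refers to.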
GRL-02031 (1) is an HIV-1 protease (PR) inhibitor containing a novel P1′ (R)-aminomethyl-2-pyrrolidinone group. Crystal structures at resolutions of 1.25 to 1.55 Å were analyzed for complexes of 1 with PR containing the major drug-resistant mutations, PRI47V, PRL76V, PRV82A and PRN88D. The I47V and V82A mutations alter residues in the inhibitor-binding site, while L76V and N88D are distal mutations that have no direct contact with the inhibitor. Substitution of a smaller amino acid in PRI47V and PRL76V, and the altered charge of PRN88D, are associated with significant local structural changes compared to the wild-type PRWT, while substitution of alanine in PRV82A increases the size of the S1′ subsite. The P1′ pyrrolidinone group of 1 accommodates these local changes by assuming two different conformations. Overall, the conformation and interactions of 1 with the PR mutants resemble those with PRWT, and the similar inhibition constants are in good agreement with the antiviral potency on multidrug-resistant HIV-1.
Defensins, one of the major classes of host defense peptides, play a significant role in innate immunity and have evolved extensively in almost all living organisms. Developing accurate high-throughput computational methods can help in designing drugs or medical measures to defend against pathogens. To take up this challenge, an up-to-date server based on a rigorous benchmark dataset, referred to as iDEF-PseRAAC, was designed in this study for predicting the defensin family. By extracting primary sequence compositions based on different types of reduced amino acid alphabet, the selected feature subset achieved a best overall accuracy of 92.38%. We therefore conclude that the information provided by the many types of amino acid reduction offers an efficient and rational methodology for defensin identification. A free online server is available to academic users at http://bioinfor.imu.edu.cn/idpf. We expect that iDEF-PseRAAC may be a promising tool for the functional annotation of defensin proteins.