The present investigation applies the Support Vector Machine (SVM) statistical algorithm to identify non-linear structure-activity relationships between IC50 values and the structures of C-aryl glucoside SGLT2 inhibitors. The training set consisted of forty molecules, and the remaining six molecules were used for test set validation. SVM with a Gaussian kernel function yielded non-linear QSAR models. A forward selection algorithm was applied to the molecular descriptors after pruning and a redundancy check. The QSAR models were internally validated using the leave-one-out cross-validated R² (R²CV, LOO), PRESS, SDEP and Y-scrambling. The SVM-based non-linear models were more efficient when the Gaussian kernel function was optimized. The non-linear QSAR studies further identified atomic van der Waals volumes, atomic masses, the sum of geometrical distances between O..S, and the degree of unsaturation as the molecular descriptors and crucial structural requirements for modeling the IC50 of C-aryl glucoside derivatives.
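The internal validation statistics named in this abstract (PRESS, SDEP and the leave-one-out cross-validated R²) can be illustrated with a minimal sketch. It is not the authors' procedure: a one-descriptor least-squares line stands in for the SVM model, the names `fit_line` and `loo_statistics` are hypothetical, the data are made up, and the formulas used are the common textbook definitions (PRESS as the sum of squared leave-one-out prediction errors, SDEP as sqrt(PRESS/n), Q² as 1 - PRESS/SS_total).

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_statistics(xs, ys):
    """Leave-one-out PRESS, SDEP and Q2 for the stand-in linear model."""
    n = len(ys)
    press = 0.0
    for i in range(n):
        # Refit the model with molecule i held out, then predict it.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predicted = slope * xs[i] + intercept
        press += (ys[i] - predicted) ** 2
    mean_y = sum(ys) / n
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return {
        "press": press,
        "sdep": math.sqrt(press / n),
        "q2": 1.0 - press / ss_total,
    }


# Illustrative values only: one descriptor and an activity for six molecules.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]
print(loo_statistics(descriptor, activity))
```

Swapping the stand-in for an SVM regressor would only change the fit and predict steps inside the loop; the three statistics are computed the same way.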
Linear and non-linear QSAR studies were performed in the present investigation using multiple linear regression (MLR) analysis and support vector machines (SVM) with different kernels. Of the fifteen descriptors calculated, three were identified as relevant: LOGP values, G3e and Rte+. Their relationship with the biological activity (IC50) provided structural insights for interpreting the results and organizing them into a pragmatic, approachable technique. The QSAR models obtained show statistical fitness and good predictability. SVM with a Gaussian kernel function was more efficient in predicting the IC50 of the training set of thirty small-molecule HIV-1 capsid inhibitors. Y-scrambling, PRESS and an external test set were used for validation. SVM was superior in training set prediction and internal validation but inferior in predictions for the external test set (11 molecules); for MLR analysis the reverse held. Mechanistic interpretation of the selected descriptors from both models motivates further research.
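The Y-scrambling check mentioned in this abstract can be sketched as follows, under stated assumptions: the activity values are randomly permuted, the model is refit on each permutation, and the fit quality is compared with that of the unscrambled data. A one-descriptor linear fit stands in for the MLR and SVM models of the study, the names `r_squared` and `y_scramble` are hypothetical, and the data are illustrative only.

```python
import random


def r_squared(xs, ys):
    """R2 of a one-descriptor least-squares fit (squared Pearson correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def y_scramble(xs, ys, n_rounds=20, seed=0):
    """Return the R2 on the real data and the R2 values after shuffling y."""
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    scrambled_scores = []
    for _ in range(n_rounds):
        shuffled = list(ys)
        rng.shuffle(shuffled)  # breaks the descriptor-activity pairing
        scrambled_scores.append(r_squared(xs, shuffled))
    return r_squared(xs, ys), scrambled_scores


# Illustrative values only: one descriptor and an activity for eight molecules.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
activity = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]
real_r2, scrambled = y_scramble(descriptor, activity)
print(real_r2, sum(scrambled) / len(scrambled))
```

A model that captures a real relationship should score clearly higher on the original data than on the scrambled sets; similar scores would suggest a chance correlation.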